Passing multiple sentences to BERT?

Question

I have a dataset of paragraphs that I need to classify into two classes. These paragraphs are usually 3-5 sentences long, and the overwhelming majority of them are fewer than 500 words long. I would like to use BERT to tackle this problem.

I am wondering how I should use BERT to generate vector representations of these paragraphs and, especially, whether it is fine to just pass the whole paragraph into BERT.

There have been informative discussions of related problems here and here. These discussions focus on how to use BERT for representing whole documents. In my case the paragraphs are not that long, and could indeed be passed to BERT without exceeding its maximum input length of 512 tokens. However, BERT was trained on sentences, and sentences are relatively self-contained units of meaning. I wonder whether feeding multiple sentences into BERT conflicts fundamentally with what the model was designed to do (although this appears to be done regularly).

cronoik · Accepted Answer

I think your question is based on a misconception. Even though the BERT paper uses the term sentence quite often, it is not referring to a linguistic sentence. The paper defines a sentence as

an arbitrary span of contiguous text, rather than an actual linguistic sentence.

It is therefore completely fine to pass whole paragraphs to BERT, and this definition is part of the reason the model can handle them.
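To make the "arbitrary span of contiguous text" idea concrete, here is a minimal sketch of how a whole paragraph can be packed as one input segment. It is only an illustration under stated assumptions: `pack_single_segment` is a hypothetical helper, and plain whitespace splitting stands in for BERT's WordPiece tokenizer, which normally yields more tokens than there are words. In practice you would let a real BERT tokenizer do this step.

```python
# Toy illustration of how a whole paragraph becomes ONE BERT input segment.
# Assumption: whitespace splitting stands in for BERT's WordPiece tokenizer,
# which usually produces more tokens than there are words.

MAX_LEN = 512  # BERT's maximum sequence length, special tokens included


def pack_single_segment(paragraph, max_len=MAX_LEN):
    """Return (tokens, segment_ids) for a paragraph treated as one 'sentence'."""
    tokens = paragraph.lower().split()   # hypothetical stand-in tokenizer
    tokens = tokens[: max_len - 2]       # leave room for [CLS] and [SEP]
    tokens = ["[CLS]"] + tokens + ["[SEP]"]
    segment_ids = [0] * len(tokens)      # single segment: every id is 0
    return tokens, segment_ids


paragraph = (
    "The service was slow. The food arrived cold. "
    "We will not be coming back."
)
tokens, segment_ids = pack_single_segment(paragraph)
print(tokens)
print(segment_ids)
```

The three linguistic sentences end up in a single segment between `[CLS]` and `[SEP]`, with no separator between them. The only hard constraint is the 512-token limit, which is counted in tokens rather than words, so a paragraph close to 500 words may still need truncation.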

Tags:

nlp

text-classification

bert-language-model

huggingface-transformers

jhfodr76
