 

What is the difference between Sentence Encodings and Contextualized Word Embeddings?

I have seen both terms used while reading papers about BERT and ELMo, so I wonder if there is a difference between them.

Rodrigo asked Jan 23 '20 at 11:01


People also ask

What is contextualized word embedding?

Contextualized word embeddings aim to capture word semantics in different contexts, addressing polysemy and the context-dependent nature of words.

What are the different types of word embeddings?

Word embedding techniques are used to represent words mathematically. One-hot encoding, TF-IDF, Word2Vec, and FastText are frequently used word embedding methods. One of these techniques (in some cases several) is chosen depending on the nature, size, and purpose of the data being processed.
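As a concrete illustration, here is a minimal sketch of two of the simpler techniques listed above, using scikit-learn (my own choice of library, not one named in the answer):

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    docs = ["the duck is swimming", "you shall duck when someone shoots"]

    # Bag-of-words counts: each column is a vocabulary word, each row a document.
    counts = CountVectorizer().fit_transform(docs)

    # TF-IDF re-weights those counts by how distinctive each word is across documents.
    tfidf = TfidfVectorizer().fit_transform(docs)

    print(counts.toarray())
    print(tfidf.toarray())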

Is BERT a word embedding or sentence embedding?

The BERT base model uses 12 layers of transformer encoders, and the per-token output of each of these layers can be used as a word embedding.
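To see this concretely, here is a rough sketch using the Hugging Face transformers library (an assumption on my part; the snippet above does not name a toolkit) that exposes every layer's per-token output:

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

    inputs = tokenizer("The duck is swimming", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # hidden_states holds the embedding-layer output plus one tensor per encoder
    # layer (13 entries for bert-base), each of shape (batch, seq_len, 768).
    hidden_states = outputs.hidden_states
    print(len(hidden_states), hidden_states[-1].shape)

Any of these 13 tensors (or a combination of them) can serve as per-token word embeddings.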

What is the difference between Word2Vec and TF-IDF?

A key difference between TF-IDF and Word2Vec is that TF-IDF is a statistical measure applied to terms in a document, which can then be used to form a document vector, whereas Word2Vec produces a vector per term, and more work may be needed to convert that set of vectors into a single vector or ...
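A sketch of that extra aggregation step, using gensim's Word2Vec (again my own choice of library, and averaging is only one of several ways to combine word vectors):

    import numpy as np
    from gensim.models import Word2Vec

    sentences = [["the", "duck", "is", "swimming"],
                 ["you", "shall", "duck", "when", "someone", "shoots"]]

    # Word2Vec learns one vector per term (min_count=1 so the toy corpus works).
    model = Word2Vec(sentences, vector_size=50, min_count=1, seed=0)

    # Unlike TF-IDF, which scores terms per document directly, these word
    # vectors must be combined, e.g. by averaging, to represent a document.
    doc_vector = np.mean([model.wv[w] for w in sentences[0]], axis=0)
    print(doc_vector.shape)  # (50,)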


1 Answer

  • A contextualized word embedding is a vector representing a word in a specific context. Traditional word embeddings such as Word2Vec and GloVe generate one vector per word, whereas a contextualized word embedding generates a vector for a word depending on its context. Consider the sentences "The duck is swimming" and "You shall duck when someone shoots at you". With traditional word embeddings, the word vector for "duck" would be the same in both sentences, whereas in the contextualized case it should differ.
  • While word embeddings encode individual words into vector representations, there is also the question of how to represent a whole sentence in a way a computer can easily work with. Sentence encodings embed a whole sentence as one vector; doc2vec, for example, generates a vector for a sentence. BERT also produces a representation for the whole sentence, via the [CLS] token. A short code sketch of both points follows this list.
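Here is a minimal sketch of both points, using the Hugging Face transformers library (an assumption; the answer does not prescribe a toolkit). The vector extracted for "duck" differs between the two sentences, while the vector at the [CLS] position can serve as a sentence representation:

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def embed(sentence):
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
        # Locate the position of the token "duck" in the input.
        duck_idx = inputs.input_ids[0].tolist().index(
            tokenizer.convert_tokens_to_ids("duck"))
        return out[duck_idx], out[0]  # "duck" vector, [CLS] sentence vector

    duck1, cls1 = embed("The duck is swimming")
    duck2, cls2 = embed("You shall duck when someone shoots at you")

    # Cosine similarity well below 1 shows the two "duck" vectors differ,
    # i.e. the embedding is contextualized.
    print(torch.cosine_similarity(duck1, duck2, dim=0))
    print(torch.cosine_similarity(cls1, cls2, dim=0))

Note that using the raw [CLS] vector as a sentence encoding is a common convention; dedicated sentence encoders (e.g. further fine-tuned models) often work better in practice.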

So in short, a contextualized word embedding represents a word in a context, whereas a sentence encoding represents a whole sentence.

chefhose answered Sep 23 '22 at 16:09