According to several posts I found on Stack Overflow (for instance this one: Why does word2vec use cosine similarity?), it's common practice to calculate the cosine similarity between two word vectors after we have trained a word2vec model (either CBOW or Skip-gram). However, this seems a little odd to me since the model is actually trained with the dot product as its similarity score. One piece of evidence for this is that the norms of the word vectors we get after training are actually meaningful. So why do people still use cosine similarity instead of the dot product when calculating the similarity between two words?
Cosine similarity measures whether two vectors point in the same direction, regardless of their magnitudes. For example, cosine similarity makes sense for comparing bag-of-words vectors of documents: two documents might have very different lengths but similar distributions of words.
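To make this concrete, here is a minimal sketch (NumPy only, with made-up word counts rather than any real corpus) showing that two bag-of-words vectors of very different lengths but similar word distributions still have a cosine similarity close to 1, while their dot product is dominated by the longer document.

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Bag-of-words counts over the same vocabulary; doc_b is roughly 10x longer
# than doc_a but has a similar relative distribution of words.
doc_a = np.array([2.0, 1.0, 0.0, 3.0])
doc_b = np.array([21.0, 9.0, 1.0, 29.0])

print(np.dot(doc_a, doc_b))             # large value, driven by doc_b's length
print(cosine_similarity(doc_a, doc_b))  # close to 1.0
```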
Because cosine similarity is not affected by vector length, the large norms of the embeddings of popular videos do not inflate the similarity scores. Thus, switching from the dot product to cosine reduces the similarity for popular videos.
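As an illustration (with made-up 2-d embeddings, not any real recommender's vectors), a "popular" item whose embedding has a large norm can win on dot product even when a smaller vector is better aligned with the query; cosine similarity removes that length advantage.

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

query   = np.array([1.0, 1.0])
popular = np.array([4.0, 0.0])  # large norm, less aligned with the query
niche   = np.array([0.6, 0.5])  # small norm, well aligned with the query

# Dot product ranks the popular item first; cosine ranks the niche item first.
print(np.dot(query, popular), np.dot(query, niche))                        # 4.0 vs 1.1
print(cosine_similarity(query, popular), cosine_similarity(query, niche))  # ~0.71 vs ~0.996
```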
Word2Vec is a model used to represent words as vectors. The similarity between two words can then be computed by applying the cosine similarity formula to the word vectors produced by the Word2Vec model.
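As a rough sketch of what this looks like in practice (assuming gensim 4.x and a toy corpus), you can train a Word2Vec model and then score word pairs; gensim's KeyedVectors.similarity() computes the cosine between the two word vectors.

```python
from gensim.models import Word2Vec

# Toy corpus purely for illustration.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Skip-gram model with small vectors; parameters are illustrative only.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)

# Cosine similarity between the two word vectors.
print(model.wv.similarity("cat", "dog"))
```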
For our case study, we used cosine similarity. This uses the word embeddings of the words in two texts to measure the minimum distance that the words in one text need to "travel" in semantic space to reach the words in the other text. By contrast, the Euclidean distance between two points is simply the length of the straight path connecting them.
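For contrast, here is a small sketch (with made-up 3-d vectors standing in for word embeddings) of how Euclidean distance and cosine similarity can disagree: two vectors pointing in the same direction but with different magnitudes have cosine similarity 1 yet a non-zero Euclidean distance.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([2.0, 4.0, 6.0])  # same direction as u, twice the length

euclidean = np.linalg.norm(u - v)                                   # ~3.74
cosine    = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))  # 1.0

print(euclidean, cosine)
```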
Cosine similarity and the dot product are both similarity measures, but the dot product is magnitude-sensitive while cosine similarity is not. Depending on how often a word occurs, it might have a large or small dot product with another word. We normally normalize our vectors to prevent this effect, so that all vectors have unit magnitude. But if your particular downstream task requires occurrence counts as a feature, then the dot product might be the way to go; if you do not care about counts, you can simply calculate the cosine similarity, which normalizes them.
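A quick sketch of that normalization point, assuming nothing beyond NumPy: once the vectors are L2-normalized, the plain dot product and cosine similarity coincide, which is why pre-normalizing embeddings once up front is a common way to make dot-product nearest-neighbor search behave like cosine search.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)

# L2-normalize both vectors to unit length.
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)

cosine       = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
dot_of_units = np.dot(a_unit, b_unit)

print(np.isclose(cosine, dot_of_units))  # True
```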