I am currently working on a word2vec model using gensim in Python, and I want to write a function that finds the antonyms and synonyms of a given word. For example: antonym("sad") = "happy", synonym("upset") = "enraged".
Is there a way to do that in word2vec?
Let's say there is a word X with a synonym Y, and Y has an antonym Z. Then X - Y + Z gives a vector that lands near an antonym of X (and a synonym of Z). You can compute this with the model's analogy lookup.
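For example, here is a minimal sketch of that analogy lookup with gensim, assuming the pretrained GoogleNews vectors are available locally; the antonym_of helper and the example words are illustrative and won't always give clean results.

import gensim

model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)

def antonym_of(x, y, z):
    # y is a synonym of x, z is an antonym of y:
    # vector(x) - vector(y) + vector(z) should land near an antonym of x.
    return model.most_similar(positive=[x, z], negative=[y], topn=1)[0][0]

print(antonym_of('sad', 'unhappy', 'happy'))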
The word2vec algorithms include the skip-gram and CBOW models, using either hierarchical softmax or negative sampling; see Tomas Mikolov et al., "Efficient Estimation of Word Representations in Vector Space" and Tomas Mikolov et al., "Distributed Representations of Words and Phrases and their Compositionality".
Word2vec (skip-gram): At a high level, Word2Vec is an unsupervised learning algorithm that uses a shallow neural network (with one hidden layer) to learn vector representations of all the unique words/phrases in a given corpus.
It represents words or phrases in a vector space with several dimensions. Word embeddings can be generated using various methods such as neural networks, co-occurrence matrices, probabilistic models, etc. Word2Vec consists of models for generating word embeddings.
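As a quick illustration, here is a minimal sketch of training skip-gram embeddings with gensim on a toy corpus; the sentences and parameter values are placeholder assumptions, and the parameter names follow gensim 4.x.

from gensim.models import Word2Vec

# Toy corpus: a list of tokenised sentences (placeholder data).
sentences = [
    ['the', 'movie', 'made', 'me', 'happy'],
    ['the', 'ending', 'made', 'me', 'sad'],
]

# sg=1 selects the skip-gram architecture; sg=0 would be CBOW.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

# Each word in the vocabulary now has a dense vector.
print(model.wv['happy'].shape)  # (100,)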
In word2vec you can find analogies in the following way:
import gensim

model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
model.most_similar(positive=['good', 'sad'], negative=['bad'])
[(u'wonderful', 0.6414928436279297),
(u'happy', 0.6154338121414185),
(u'great', 0.5803680419921875),
(u'nice', 0.5683973431587219),
(u'saddening', 0.5588893294334412),
(u'bittersweet', 0.5544661283493042),
(u'glad', 0.5512036681175232),
(u'fantastic', 0.5471092462539673),
(u'proud', 0.530515193939209),
(u'saddened', 0.5293528437614441)]
Now, using some standard antonym pairs like (good, bad) and (rich, poor), find multiple such lists of nearest antonyms. After that, you can average the vectors from these lists, as in the sketch below.
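A rough sketch of that averaging idea, reusing the model loaded above; the pair list and the antonym_via_offset helper are illustrative assumptions, not a standard gensim API.

import numpy as np

# Standard antonym pairs used to estimate an average "antonym direction".
pairs = [('good', 'bad'), ('rich', 'poor'), ('happy', 'sad')]
offset = np.mean([model[b] - model[a] for a, b in pairs], axis=0)

def antonym_via_offset(word, topn=5):
    # Shift the word's vector along the averaged direction and return
    # its nearest neighbours in the embedding space.
    return model.similar_by_vector(model[word] + offset, topn=topn)

print(antonym_via_offset('clean'))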