Using the gensim.models.Word2Vec library, you can load a model and ask for the list of words most similar to a given "word":
model = gensim.models.Word2Vec.load_word2vec_format(model_file, binary=True)
model.most_similar(positive=[WORD], topn=N)
I wonder if it is possible to give the system a model and a "vector" as input, and have it return the top similar words (those whose vectors are closest to the given vector). Something like:
model.most_similar(positive=[VECTOR], topn=N)
I need this functionality for a bilingual setting, in which I have two models (English and German) and some English words for which I need to find the most similar German candidates. What I want to do is get the vector of each English word from the English model:
model_EN = gensim.models.Word2Vec.load_word2vec_format(model_file_EN, binary=True)
vector_w_en = model_EN[WORD_EN]
and then query the German model with these vectors:
model_DE = gensim.models.Word2Vec.load_word2vec_format(model_file_DE, binary=True)
model_DE.most_similar(positive=[vector_w_en], topn=N)
I have implemented this in C using the original distance function from the word2vec package, but now I need it in Python so that I can integrate it with my other scripts.
Do you know whether there is already a method in the gensim.models.Word2Vec library (or a similar library) that does this, or do I need to implement it myself?
WHAT IS A WORD VECTOR? A word vector is an attempt to represent the meaning of a word mathematically. In essence, a computer goes through some text (ideally a lot of text) and counts how often words appear next to each other; these co-occurrence frequencies are encoded as numbers.
Converting words to vectors, or word vectorization, is a natural language processing (NLP) technique that uses language models to map words into a vector space, where each word is represented by a vector of real numbers and words with similar meanings have similar representations.
Word embeddings are among the most popular representations of a document's vocabulary. They can capture the context of a word in a document, semantic and syntactic similarity, and relations with other words.
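Similarity between two word vectors is typically measured by cosine similarity: the cosine of the angle between them, which is 1.0 for vectors pointing in the same direction and 0.0 for orthogonal ones. A minimal pure-Python sketch (the toy vectors are made up for illustration):

```python
import math

def cosine_similarity(u, v):
    # cos(u, v) = dot(u, v) / (|u| * |v|)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Parallel vectors score ~1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```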
The method similar_by_vector returns the top-N most similar words for a given vector:
similar_by_vector(vector, topn=10, restrict_vocab=None)
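For the bilingual case above, that corresponds to calling model_DE.similar_by_vector(vector_w_en, topn=N). Under the hood, this amounts to ranking every word in the vocabulary by cosine similarity to the query vector. A minimal pure-Python sketch of that logic, using a toy German "vocabulary" with made-up 2-d vectors (purely for illustration, not real embeddings):

```python
import math

def most_similar_by_vector(vocab, query, topn=10):
    """Rank vocabulary words by cosine similarity to a query vector.

    vocab: dict mapping word -> vector (list of floats).
    Mirrors the behaviour of gensim's similar_by_vector.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    scored = sorted(((word, cos(vec, query)) for word, vec in vocab.items()),
                    key=lambda pair: pair[1], reverse=True)
    return scored[:topn]

# Hypothetical German model: word -> 2-d vector.
vocab_de = {
    "hund": [0.9, 0.1],
    "katze": [0.8, 0.3],
    "haus": [0.1, 0.9],
}
query = [1.0, 0.0]  # e.g. the English vector for "dog"
print(most_similar_by_vector(vocab_de, query, topn=2))
# "hund" ranks first, "katze" second
```

In practice you would not reimplement this yourself: load the German model with gensim and pass the English vector directly to model_DE.similar_by_vector(vector_w_en, topn=N). Note that this only yields meaningful cross-lingual neighbours if the two embedding spaces are aligned (e.g. trained jointly or mapped into a shared space); independently trained monolingual models live in unrelated coordinate systems.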