Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find similar words with FastText?

I am playing around with FastText, https://pypi.python.org/pypi/fasttext,which is quite similar to Word2Vec. Since it seems to be a pretty new library with not to many built in functions yet, I was wondering how to extract morphological similar words.

For eg: model.similar_word("dog") -> dogs. But there is no function built-in.

If I type model["dog"]

I only get the vector, that might be used to compare cosine similarity. model.cosine_similarity(model["dog"], model["dogs"]]).

Do I have to make some sort of loop and do cosine_similarity on all possible pairs in a text? That would take time ...!!!

like image 843
Isbister Avatar asked Feb 13 '17 14:02

Isbister


People also ask

Is FastText better than Word2Vec?

Although it takes longer time to train a FastText model (number of n-grams > number of words), it performs better than Word2Vec and allows rare words to be represented appropriately.

Is FastText better than glove?

The biggest benefit of using FastText is that it generate better word embeddings for rare words, or even words not seen during training because the n-gram character vectors are shared with other words. This is something that Word2Vec and GLOVE cannot achieve.

What is FastText format?

FastText is an open-source, free library from Facebook AI Research(FAIR) for learning word embeddings and word classifications. This model allows creating unsupervised learning or supervised learning algorithm for obtaining vector representations for words.


1 Answers

You can install and import gensim library and then use gensim library to extract most similar words from the model that you downloaded from FastText.

Use this:

import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('model.vec')
similar = model.most_similar(positive=['man'],topn=10)

And by topn parameter you get the top 10 most similar words.

like image 68
Md Rashad Al Hasan Rony Avatar answered Sep 24 '22 08:09

Md Rashad Al Hasan Rony