
Error when loading FastText's French pre-trained model with gensim

I am trying to use FastText's French pre-trained binary model (downloaded from the official FastText GitHub page). I need the .bin model rather than the .vec word vectors so that I can approximate vectors for misspelled and out-of-vocabulary words.

However when I try to load said model, using:

from gensim.models import FastText
model = FastText.load_fasttext_format('french_bin_model_path')

I get the following error:

NotImplementedError: Supervised fastText models are not supported

What is surprising is that it works just fine when I try to load the English binary model.

I am running Python 3.6 and gensim 3.5.0.

Any ideas as to why it doesn't work with the French vectors are welcome!

Clara-sininen asked Jul 23 '18 14:07



1 Answer

I ran into the same problem and ended up using Facebook's Python wrapper for FastText instead of gensim's implementation.

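import fastText  # newer releases of the wrapper are imported as `fasttext` (lowercase)
# Load the binary (.bin) model, which keeps the subword information
model = fastText.load_model(path_to_french_bin)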

Then you can get word vectors for out-of-vocabulary words like so:

oov_vector = model.get_word_vector(oov_word)
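The reason this works for words the model never saw is that FastText builds a word's vector from its character n-grams, so a misspelling that shares n-grams with a known word ends up with a related vector. Below is a minimal sketch of that idea, not the library's actual code: the helper names `char_ngrams` and `shared_ngrams` are made up for illustration, and the lengths 3 to 6 are the defaults of the FastText tool for unsupervised training, which the pre-trained French model may not use. The real implementation also hashes n-grams into buckets, which is left out here.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        # Slide a window of width n over the wrapped word
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def shared_ngrams(word_a, word_b, minn=3, maxn=6):
    """Return the sorted n-grams that two words have in common."""
    grams_a = set(char_ngrams(word_a, minn, maxn))
    grams_b = set(char_ngrams(word_b, minn, maxn))
    return sorted(grams_a & grams_b)
```

For example, `shared_ngrams("bonjour", "bonjuor", 3, 3)` lists the trigrams the misspelling still shares with the correct word, which is the overlap a subword model can draw on when the misspelled form is not in its vocabulary.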

As for why gensim's load_fasttext_format works for the English model and not the French one, I don't know!
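If you want a single entry point that works with whichever library is installed, something like the following may help. It is only a sketch under assumptions: `make_vector_lookup` is a hypothetical helper name, the lowercase `fasttext` import matches newer releases of the wrapper, and `load_facebook_vectors` only exists in gensim releases later than the 3.5.0 used in the question. Whether the newer gensim loader avoids the NotImplementedError for the French model is not something this thread confirms.

```python
import os


def make_vector_lookup(bin_path):
    """Return a function mapping a word to its vector, using whichever library is installed."""
    if not os.path.isfile(bin_path):
        raise FileNotFoundError(f"fastText model not found: {bin_path}")

    try:
        import fasttext  # Facebook's wrapper; older releases used `import fastText`
    except ImportError:
        fasttext = None

    if fasttext is not None:
        model = fasttext.load_model(bin_path)
        return model.get_word_vector

    # Fallback: gensim's loader for Facebook-format .bin files (newer gensim only)
    from gensim.models.fasttext import load_facebook_vectors

    vectors = load_facebook_vectors(bin_path)
    return lambda word: vectors[word]
```

Usage would then be `lookup = make_vector_lookup(path_to_french_bin)` followed by `oov_vector = lookup(oov_word)`. Returning a plain function hides the difference between the two libraries' APIs from the calling code.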

efont answered Oct 13 '22 00:10
