
Error while loading Word2Vec model in gensim

I'm getting an AttributeError after using gensim to load the model available at the word2vec repository:

from gensim import models
w = models.Word2Vec()
w.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print w["queen"]

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-8219e36ba1f6> in <module>()
----> 1 w["queen"]

C:\Anaconda64\lib\site-packages\gensim\models\word2vec.pyc in __getitem__(self, word)
    761 
    762         """
--> 763         return self.syn0[self.vocab[word].index]
    764 
    765 

AttributeError: 'Word2Vec' object has no attribute 'syn0'

Is this a known issue?

Tarantula asked Aug 19 '15 17:08



2 Answers

Fixed the problem by calling load_word2vec_format on the class and keeping the model it returns:

from gensim import models
w = models.Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print w["queen"]
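The reason this works is that, in the gensim version from the question, load_word2vec_format is a class method that builds and returns a new model rather than filling in the object it is called on. The sketch below reproduces that pattern with a toy class. ToyModel and load_format are hypothetical names made up for illustration and are not part of gensim.

```python
class ToyModel:
    """Hypothetical stand-in for a model class; this is not gensim code."""

    def __init__(self):
        # A fresh instance holds no vectors yet.
        self.vectors = None

    @classmethod
    def load_format(cls, source):
        # Build and return a NEW instance; the caller's object is untouched.
        model = cls()
        model.vectors = dict(source)
        return model

    def __getitem__(self, word):
        if self.vectors is None:
            raise AttributeError("model has no vectors loaded")
        return self.vectors[word]


data = {"queen": [0.1, 0.2]}

# Pattern from the question: the returned model is thrown away,
# so `w` is still the empty instance created on the previous line.
w = ToyModel()
w.load_format(data)
try:
    w["queen"]
    lookup_failed = False
except AttributeError:
    lookup_failed = True

# Pattern from the fix: keep the object the class method returns.
w_fixed = ToyModel.load_format(data)
```

Here the lookup on w fails while the lookup on w_fixed succeeds, which mirrors the difference between the code in the question and the code in this answer.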
Tarantula answered Oct 13 '22 17:10


To share word-vector querying code across different training algorithms (Word2Vec, FastText, WordRank, VarEmbed), the gensim authors have moved the storage and querying of word vectors into a separate class, KeyedVectors.

Two methods and several attributes of the Word2Vec class have been deprecated.

Methods

  • load_word2vec_format
  • save_word2vec_format

Attributes

  • syn0norm
  • syn0
  • vocab
  • index2word

These have been moved to the KeyedVectors class.

After upgrading to this release you might get exceptions about deprecated methods or missing attributes.

To remove the exceptions, use:

  • KeyedVectors.load_word2vec_format instead of Word2Vec.load_word2vec_format
  • word2vec_model.wv.save_word2vec_format instead of word2vec_model.save_word2vec_format
  • model.wv.syn0norm instead of model.syn0norm
  • model.wv.syn0 instead of model.syn0
  • model.wv.vocab instead of model.vocab
  • model.wv.index2word instead of model.index2word
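As a rough sketch of how calling code might adapt to the replacements listed above: the helper keyed_vectors below is a hypothetical name, not a gensim function, and the SimpleNamespace objects are stand-ins for real models so the snippet runs without gensim installed. Only load_pretrained touches gensim, and it assumes a version that provides gensim.models.KeyedVectors.

```python
from types import SimpleNamespace


def keyed_vectors(model):
    """Return the object that holds the word vectors.

    Hypothetical helper: full models expose the vectors as `model.wv`,
    while older models and bare KeyedVectors objects hold them directly.
    """
    return getattr(model, "wv", model)


def load_pretrained(path):
    """Load vectors stored in the binary word2vec format.

    Requires gensim; the import is kept local so the helper above
    can be used (and tested) without gensim installed.
    """
    from gensim.models import KeyedVectors
    return KeyedVectors.load_word2vec_format(path, binary=True)


# Stand-in objects for illustration only; these are not gensim models.
old_style = SimpleNamespace(index2word=["queen"])
new_style = SimpleNamespace(wv=SimpleNamespace(index2word=["queen"]))

words_old = keyed_vectors(old_style).index2word
words_new = keyed_vectors(new_style).index2word
```

With a real file, load_pretrained('GoogleNews-vectors-negative300.bin') would return a KeyedVectors object that supports lookups such as vectors["queen"].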
Prakhar Agarwal answered Oct 13 '22 18:10