I'm training my own word2vec model on different data. To plug the resulting model into my classifier and compare the results with the original pre-trained Word2vec model, I need to save it in the binary .bin format. Here is my code; sentences is a list of short messages.
import gensim, logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
sentences = gensim.models.word2vec.LineSentence('dati.txt')
model = gensim.models.Word2Vec(
sentences, size=300, window=5, min_count=5, workers=5,
sg=1, hs=1, negative=0
)
model.save_word2vec_format('model.bin', binary=True)
The last method, save_word2vec_format, gives me this error:
AttributeError: 'Word2Vec' object has no attribute 'save_word2vec_format'
What am I missing here? I've read the gensim documentation and other forums. This repo on GitHub uses almost the same configuration, so I can't understand what's wrong. I've tried switching from skip-gram to CBOW and from hierarchical softmax to negative sampling, with no results.
Thank you in advance!
from gensim.models import Word2Vec, KeyedVectors
model.wv.save_word2vec_format('model.bin', binary=True)
Are you using a pre-release (release candidate) version of gensim, or code directly from the develop branch? In those versions, save_word2vec_format() has moved to a utility class called KeyedVectors.
You won't yet (as of February 2017) get these versions from the usual way of installing gensim, pip install gensim, and it's likely that by the time this change is in the official distribution, the error message for trying the older call will be improved.
I recommend using the version that comes via plain pip install gensim unless you are a relatively expert user who is also carefully following the project CHANGELOG.md.
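If you need code that runs against either layout, a small helper along these lines might do. This is only a sketch: save_vectors is a hypothetical name, not part of gensim, and it assumes that versions where the method has moved expose the vectors object as model.wv, as in the snippet above.

```python
def save_vectors(model, path, binary=True):
    """Save word vectors in word2vec format for either gensim layout.

    Hypothetical helper, not part of gensim. Assumes `model` is a trained
    Word2Vec model (or any object with the same attributes).
    """
    # Versions where the method has moved: the vectors live on model.wv.
    # Older versions: save_word2vec_format() is on the model itself.
    target = getattr(model, "wv", model)
    target.save_word2vec_format(path, binary=binary)
    return target
```

With the model from the question, the call would be save_vectors(model, 'model.bin'). The resulting file should then be readable with KeyedVectors.load_word2vec_format('model.bin', binary=True), which is the same call used to load the original pre-trained vectors.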