gensim - Word2vec continue training on existing model - AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

I am trying to continue training an existing model:

model = gensim.models.Word2Vec.load('model/corpus.zhwiki.word.model')
more_sentences = [['Advanced', 'users', 'can', 'load', 'a', 'model', 'and', 'continue', 'training', 'it', 'with', 'more', 'sentences']]    
model.build_vocab(more_sentences, update=True)
model.train(more_sentences, total_examples=model.corpus_count, epochs=model.iter)

but I got an error with the last line:

AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

Some posts said this is caused by a model saved with an earlier version of gensim, so I tried adding the following line after loading the existing model and before calling train():

model.compute_loss = False

After that, the AttributeError went away, but model.train() returned 0 and the model was not trained on the new sentences.

How can I solve this problem?

dididaisy asked Jan 25 '18 13:01


2 Answers

Here is how I continue training my model:

from gensim.models import Word2Vec

# training_data: the initial training data, a list of tokenized sentences
# (in gensim 4.0 and later, the `size` parameter is named `vector_size`)
model = Word2Vec(training_data, size=50, window=5, min_count=10, workers=4)

# datasmall: the additional tokenized sentences
# total_examples: the number of additional sentences
# epochs: the number of passes over datasmall; model.epochs is fine
model.train(datasmall, total_examples=len(datasmall), epochs=model.epochs)

Haha TTpro answered Nov 01 '22 05:11


The total_examples (and epochs) arguments to train() should describe the corpus you are passing in that call, here your more_sentences, not leftover values from prior training.

So for example, given your code showing just a single additional sentence, you'd specify total_examples=1.

If this isn't the source of the problem, double check that more_sentences is what you expect it to be at the time of the train() call.
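
As a rough sketch of the advice above, the helper below derives total_examples from the new corpus itself and fails early if that corpus is empty. The function name continue_training is hypothetical, not part of gensim. It only assumes that the model object exposes build_vocab(..., update=True), train(..., total_examples=..., epochs=...), and an epochs attribute (called iter in older gensim releases), as the code in the question does.

```python
def continue_training(model, new_sentences, epochs=None):
    """Update the vocabulary of `model` and train it on `new_sentences`.

    Hypothetical helper: `model` is assumed to behave like a gensim
    Word2Vec instance (build_vocab, train, and an epochs/iter attribute).
    """
    # Materialize the corpus so it can be counted and iterated repeatedly.
    sentences = list(new_sentences)
    if not sentences:
        raise ValueError("new_sentences is empty; nothing to train on")

    if epochs is None:
        # Fall back to the epoch count stored on the model
        # (`epochs` in newer gensim, `iter` in older releases).
        epochs = getattr(model, "epochs", None) or getattr(model, "iter")

    # Add words from the new sentences to the existing vocabulary.
    model.build_vocab(sentences, update=True)

    # total_examples describes the corpus passed to this call,
    # not the corpus used for the original training run.
    return model.train(sentences, total_examples=len(sentences), epochs=epochs)
```

With the model from the question, this would be called as continue_training(model, more_sentences), which passes total_examples=1 for the single extra sentence. Keep in mind that words appearing fewer than min_count times in the new sentences may still be dropped when the vocabulary is updated, so a very small batch can end up training on few or no words.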

gojomo answered Nov 01 '22 06:11