I'm getting this error "AttributeError: 'Word2Vec' object has no attribute 'index2word'" in following code in python. Anyone knows how can I solve it? Acctually "tfidf_weighted_averaged_word_vectorizer" throws the error. "obli.csv" contains line of sentences. Thank you.
from feature_extractors import tfidf_weighted_averaged_word_vectorizer
dataset = get_data2()
corpus, labels = dataset.data, dataset.target
corpus, labels = remove_empty_docs(corpus, labels)
# print('Actual class label:', dataset.target_names[labels[10]])
train_corpus, test_corpus, train_labels, test_labels = prepare_datasets(corpus,
labels,
test_data_proportion=0.3)
tfidf_vectorizer, tfidf_train_features = tfidf_extractor(train_corpus)
vocab = tfidf_vectorizer.vocabulary_
tfidf_wv_train_features = tfidf_weighted_averaged_word_vectorizer(corpus=tokenized_train,
tfidf_vectors=tfidf_train_features,
tfidf_vocabulary=vocab,
model=model,
num_features=100)
def get_data2():
obli = pd.read_csv('db/obli.csv').values.ravel().tolist()
cl0 = [0 for x in range(len(obli))]
nonObli = pd.read_csv('db/nonObli.csv').values.ravel().tolist()
cl1 = [1 for x in range(len(nonObli))]
all = obli + nonObli
db = Db(all,cl0 + cl1)
db.data = all
db.target = cl0 + cl1
return db
This is code from chapter 4 of Text Analytics for Python by Dipanjan Sarkar.
index2word in gensim has been moved since that text was published.
Instead of model.index2word you should use model.wv.index2word.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With