
How to compute word similarity using TF-IDF or LSA with gensim?

I know that word2vec in gensim can compute the similarity between words. Now I want to compute word similarity using TF-IDF or LSA with gensim instead. How can I do that?

Note: computing document similarity using LSA with gensim is easy: http://radimrehurek.com/gensim/wiki.html

hankaixyz asked Dec 05 '25 09:12



1 Answer

TF-IDF is a weighting scheme, so it is not an alternative to LSA. The two work at different stages of the same pipeline.

Think of your problem as a matrix of "m" terms by "n" documents. Each entry Aij of the matrix is the weight of term "i" in document "j". This is where TF-IDF comes in: it tells you what to put in each cell of the matrix.
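To make the matrix concrete, here is a minimal sketch in plain Python. The toy corpus, the log(N/df) form of IDF, and the idea of comparing terms by the cosine of their matrix rows are my assumptions for illustration; gensim's TfidfModel applies its own normalization, so its numbers will differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus: each document is a list of tokens.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "food"],
    ["stock", "market", "price"],
    ["stock", "price", "food"],
]

terms = sorted({t for d in docs for t in d})
n_docs = len(docs)

# Document frequency: in how many documents each term appears.
df = Counter(t for d in docs for t in set(d))

def tfidf_row(term):
    """One row of the term-document matrix: weight of `term` in each document."""
    idf = math.log(n_docs / df[term])
    return [d.count(term) * idf for d in docs]

# matrix[term][j] plays the role of Aij: weight of term i in document j.
matrix = {t: tfidf_row(t) for t in terms}

def cosine(u, v):
    """Cosine similarity between two rows; 0.0 if either row is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

sim_cat_dog = cosine(matrix["cat"], matrix["dog"])      # same documents
sim_cat_stock = cosine(matrix["cat"], matrix["stock"])  # no shared documents
print(sim_cat_dog, sim_cat_stock)
```

Two terms get a high score here only when they occur in the same documents, which is the main limitation of comparing raw TF-IDF rows.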

Then, if it suits your application, you can reduce the dimensionality of the matrix using LSA.
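The reduction step can be sketched with NumPy as a truncated SVD of the TF-IDF matrix. This is not gensim code: the count matrix, the choice of k = 2, and the use of the scaled left singular vectors as word vectors are assumptions made for illustration. gensim's LsiModel performs a comparable truncated SVD on a corpus.

```python
import numpy as np

# Hypothetical term-document counts (rows: terms, columns: documents).
# "cat" and "dog" never share a document, but both co-occur with "pet".
terms = ["cat", "dog", "pet", "stock", "market", "price"]
counts = np.array([
    [1, 0, 0, 0],  # cat
    [0, 1, 0, 0],  # dog
    [1, 1, 0, 0],  # pet
    [0, 0, 1, 0],  # stock
    [0, 0, 0, 1],  # market
    [0, 0, 1, 1],  # price
], dtype=float)

# TF-IDF weighting: scale each term's row by log(N / document frequency).
n_docs = counts.shape[1]
doc_freq = (counts > 0).sum(axis=1)
A = counts * np.log(n_docs / doc_freq)[:, None]

# LSA: keep only the k largest singular values and their vectors.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per term

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / norm) if norm else 0.0

idx = {t: i for i, t in enumerate(terms)}
raw_sim = cosine(A[idx["cat"]], A[idx["dog"]])
lsa_sim = cosine(term_vecs[idx["cat"]], term_vecs[idx["dog"]])
lsa_cross = cosine(term_vecs[idx["cat"]], term_vecs[idx["stock"]])
print(raw_sim, lsa_sim, lsa_cross)
```

In the raw matrix "cat" and "dog" have zero similarity because they share no document. In the reduced space they should come out close to 1, because both load on the same latent dimension through "pet", while "cat" and "stock" stay near 0.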

I hope this clears up the issue a little.

backtrack answered Dec 07 '25 12:12



