
Document similarity [closed]

I used TF-IDF to calculate the cosine similarity between two documents. It has some limitations and does not perform very well.
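
For reference, the TF-IDF baseline described here can be sketched roughly as below. This is a minimal pure-Python illustration, not the code actually used in the question: the whitespace tokenizer, the smoothed IDF formula, the toy `docs` list, and the helper names `tfidf_vectors` and `cosine` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (raw tf * smoothed idf)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy corpus, only for illustration.
docs = ["the cat sat on the mat", "a cat lay on a rug", "stocks fell on monday"]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])
sim_02 = cosine(vecs[0], vecs[2])
print(sim_01, sim_02)
```

Only terms that appear literally in both documents contribute to the score, so "mat" and "rug" add nothing to the similarity of the first two documents. That is one commonly cited limitation of this approach and a usual motivation for topic models such as LDA.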

I looked into LDA (latent Dirichlet allocation) as a way to calculate document similarity, but I don't know much about it, and I couldn't find much material on my problem either.

Can you please point me to a tutorial related to my problem? Or can you give me some advice on how I can achieve this task with LDA?

Thanks

P.S.: Is there also any source code available to perform such a task with LDA?

user238384 asked Nov 14 '22 12:11


1 Answer

Have you had a look at Lucene and Mahout?

This might be useful: Latent Dirichlet Allocation with Lucene and Mahout.
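
To give a feel for what LDA-based similarity involves, here is a small self-contained sketch. It is not Mahout's or Lucene's implementation: it is a toy collapsed Gibbs sampler in pure Python, and the function names (`lda_gibbs`, `hellinger_similarity`), the hyperparameters (`alpha`, `beta`, `n_iter`), and the example corpus are all assumptions chosen for illustration. The idea is to infer a topic distribution per document and then compare those distributions instead of raw term vectors.

```python
import math
import random


def lda_gibbs(docs, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns per-document topic distributions."""
    rng = random.Random(seed)
    tokenized = [doc.lower().split() for doc in docs]
    vocab = {w: i for i, w in enumerate(sorted({w for d in tokenized for w in d}))}
    n_words = len(vocab)
    corpus = [[vocab[w] for w in doc] for doc in tokenized]

    doc_topic = [[0] * n_topics for _ in corpus]           # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word w is in topic k
    topic_total = [0] * n_topics                           # total tokens in topic k

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(corpus):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(corpus):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed estimate of each document's topic mixture.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(corpus)
    ]


def hellinger_similarity(p, q):
    """1 minus the Hellinger distance between two discrete distributions."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return 1.0 - math.sqrt(squared / 2.0)


# Hypothetical toy corpus, only for illustration.
docs = [
    "cat dog pet cat animal",
    "dog pet animal vet",
    "stock market trade price",
    "market price stock bank",
]
thetas = lda_gibbs(docs, n_topics=2)
print(hellinger_similarity(thetas[0], thetas[1]))
print(hellinger_similarity(thetas[0], thetas[2]))
```

Hellinger distance is used here because the vectors being compared are probability distributions; cosine similarity over the topic vectors would also work. On a real corpus you would want proper tokenization, stop-word removal, far more documents, and an established implementation (Mahout as suggested above, or a library such as gensim) rather than this sampler.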

Binary Nerd answered Dec 08 '22 00:12
