
Python vs Java for natural language processing [closed]

I have been working in Java on finding the similarity between two documents. I would prefer to measure semantic similarity, but I haven't attempted that yet. I am using the following approach:

  1. Extract terms / tokens (I am using JAWS with WordNet to collapse synonyms, which improves the similarity scores)
  2. Build a term-document matrix
  3. LSA
  4. Cosine similarity
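
For concreteness, steps 1, 2 and 4 can be sketched in Python using only the standard library. This is a minimal illustration, not the JAWS-based Java code described above: the function names (`tokenize`, `term_document_matrix`, `cosine_similarity`) and the sample sentences are made up for the example, the tokenizer is a simple regex with no synonym handling, and the LSA step (an SVD of the matrix) is left out to keep the sketch dependency-free.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (no synonym merging)."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs):
    """Return (vocabulary, rows), where rows[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical sample documents, chosen only for illustration.
docs = ["The cat sat on the mat", "The cat lay on the rug"]
vocabulary, rows = term_document_matrix(docs)
similarity = cosine_similarity(rows[0], rows[1])
print(vocabulary)
print(round(similarity, 2))
```

Raw term counts only capture word overlap. The LSA step would project these rows into a lower-dimensional space before the cosine is taken, which is where some of the semantic signal comes from.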

While looking through a few Stack Overflow pages, I came across quite a few links to Python implementations.

I would like to know whether Python is a better language for computing text similarity, and whether I can compute semantic similarity between two documents in Python.

CTsiddharth asked Nov 04 '22 05:11



1 Answer

Assuming no platform restriction constrains your choice of language, choose the language you are most comfortable with (I prefer Python myself) and the one with the best libraries for your application. As @GregHewgill pointed out, the Python tools, such as the Natural Language Toolkit (NLTK), are mature and comprehensive.

So while I personally would choose Python, the decision is ultimately yours.

== EDIT ==

This question about Java NLP libraries might help you decide if you can use Java for your analysis; the top answer has a list you can investigate. Without more information about your problem set, I can't provide more specific advice.

ironchefpython answered Nov 07 '22 23:11
