Java Open Source Text Mining Frameworks [closed]

I want to know which is the best open source, Java-based framework for text mining that supports both machine learning and dictionary-based methods.

I'm using Mallet, but there is not much documentation and I do not know whether it will fit all my requirements.

David Campos asked Feb 20 '10 18:02



2 Answers

I honestly think the answers presented here are very good. However, to meet my requirements I have chosen Apache UIMA with ClearTK. It supports several ML methods and I have no licensing problems. I can also write wrappers for other ML methodologies, and I get the benefits of the UIMA framework, which is very well organized and fast.

Thank you all for your interesting answers.

Best Regards, ukrania

David Campos answered Sep 17 '22 15:09



Although it is not a specialized text mining framework, Weka includes a number of classifiers commonly used in text mining tasks, such as SVM, kNN, and multinomial Naive Bayes.

It also has a few filters for working with textual data, such as the StringToWordVector filter, which can perform a TF-IDF transformation.
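To make the TF-IDF step more concrete, here is a minimal plain-Java sketch of what such a transformation computes. It is not Weka's code and does not use the Weka API: the class name `TfIdfSketch`, the naive tokenizer, and the weighting `tf * ln(N / df)` are all assumptions chosen for illustration, and the exact formula a real filter applies may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class; not part of Weka or any other library.
public class TfIdfSketch {

    // Naive tokenizer (an assumption): lowercase, then split on anything that is not a-z.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns one term -> weight map per document, where
    // weight = (raw count of the term in the document) * ln(N / documents containing the term).
    public static List<Map<String, Double>> tfIdf(List<String> documents) {
        List<List<String>> tokenized = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : documents) {
            List<String> tokens = tokenize(doc);
            tokenized.add(tokens);
            // Count each term at most once per document for the document frequency.
            for (String term : new HashSet<>(tokens)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }

        int n = documents.size();
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> tokens : tokenized) {
            Map<String, Integer> counts = new HashMap<>();
            for (String term : tokens) {
                counts.merge(term, 1, Integer::sum);
            }
            Map<String, Double> weights = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double idf = Math.log((double) n / docFreq.get(e.getKey()));
                weights.put(e.getKey(), e.getValue() * idf);
            }
            vectors.add(weights);
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "Text mining with Java",
            "Java machine learning",
            "Dictionary methods for text mining");
        List<Map<String, Double>> vectors = tfIdf(docs);
        for (int i = 0; i < docs.size(); i++) {
            System.out.println(docs.get(i) + " -> " + vectors.get(i));
        }
    }
}
```

With this weighting, a term that occurs in every document gets weight 0, while rarer terms are weighted up. The resulting per-document maps are the kind of word vectors a classifier can then be trained on.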

Check out the Weka wiki website for more information.

Amro answered Sep 20 '22 15:09
