
Data mining engines and frameworks? [closed]

What free or open-source data mining engines and frameworks do you know of and use for textual data?

Thank you for any advice!

Edward83, asked Nov 18 '10 00:11


4 Answers

I'm not really sure what you're looking for. Perhaps something like Lucene?

Matt Ball, answered Oct 31 '22 19:10


Apache Mahout is an open-source machine learning library that can be used with or without MapReduce (Apache Hadoop).

It provides Java implementations of the following algorithms:

  • Collaborative Filtering
  • User-based and item-based recommenders
  • K-Means, Fuzzy K-Means clustering
  • Mean Shift clustering
  • Dirichlet process clustering
  • Latent Dirichlet Allocation
  • Singular value decomposition
  • Parallel Frequent Pattern mining
  • Complementary Naive Bayes classifier
  • Random forest (decision-tree-based) classifier
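
To give a feel for what the K-Means entry in the list above does with textual data, here is a minimal plain-Java sketch. It is not Mahout code and does not use Mahout's API: the class name `TextKMeansSketch` and all of its methods are hypothetical, the documents are turned into simple term-frequency vectors, and the centroids are seeded from document indices you pass in (an assumption made only to keep the run deterministic).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative sketch only: k-means (Lloyd's algorithm) over term-frequency
// vectors. All names here are hypothetical and unrelated to Mahout's classes.
class TextKMeansSketch {

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Build one term-frequency vector per document over a shared, sorted vocabulary.
    static double[][] vectorize(List<String> docs) {
        TreeSet<String> vocabSet = new TreeSet<>();
        for (String d : docs) {
            vocabSet.addAll(tokenize(d));
        }
        List<String> vocab = new ArrayList<>(vocabSet);
        double[][] vectors = new double[docs.size()][vocab.size()];
        for (int i = 0; i < docs.size(); i++) {
            for (String t : tokenize(docs.get(i))) {
                vectors[i][vocab.indexOf(t)] += 1.0;
            }
        }
        return vectors;
    }

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    // Assumption: centroids start as copies of the documents at the given seed
    // indices. Real libraries usually offer smarter or randomized seeding.
    static int[] cluster(double[][] vectors, int[] seeds, int maxIterations) {
        int k = seeds.length;
        int dim = vectors[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = vectors[seeds[c]].clone();
        }
        int[] assignment = new int[vectors.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each document to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < vectors.length; i++) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (squaredDistance(vectors[i], centroids[c])
                            < squaredDistance(vectors[i], centroids[best])) {
                        best = c;
                    }
                }
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged: no document switched cluster
            }
            // Update step: move each centroid to the mean of its members.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < vectors.length; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) {
                    sums[assignment[i]][j] += vectors[i][j];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] > 0) {
                    for (int j = 0; j < dim; j++) {
                        centroids[c][j] = sums[c][j] / counts[c];
                    }
                }
            }
        }
        return assignment;
    }
}
```

Calling `TextKMeansSketch.cluster(TextKMeansSketch.vectorize(docs), new int[]{0, 1}, 10)` returns one cluster index per document. For anything beyond toy input you would want TF-IDF weighting and a library implementation such as the ones listed above.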

You can read more: http://mahout.apache.org/

http://girlincomputerscience.blogspot.com.br/2010/11/apache-mahout.html

http://www.ibm.com/developerworks/java/library/j-mahout/

Renata, answered Oct 31 '22 19:10


RapidMiner is free and open source, and runs on Windows, Mac, and Linux. It is a nice graphical, workflow-based program that runs all Weka code and integrates with R.

Neil McGuigan, answered Oct 31 '22 18:10


Weka and RapidMiner aren't that strong on clustering. They mostly do classification and similar predictions, but very little clustering. Have a look at ELKI, which, like Weka, is a university project, but has tons of clustering and outlier detection methods.

Has QUIT--Anony-Mousse, answered Oct 31 '22 20:10