
Mallet vs Weka for text classification [closed]

Tags: weka, mallet

Which product (Mallet or Weka) is better for a text classification task, in terms of:

  1. Simpler to train
  2. Better results
  3. Documentation

I'm new to this problem, so any comments would be great.

fedor.belov asked Oct 31 '11 12:10


2 Answers

MALLET is much easier to use and does most of its work behind the scenes. You don't have to convert the format of anything either: you just give it text files and it gives you back results.

Weka requires converting the text into a particular format (the Weka script for doing so is so slow and inefficient that I would recommend writing your own).
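As a rough idea of what "writing your own" converter could look like, here is a minimal Python sketch. It assumes the format in question is ARFF (Weka's native file format) and that each document becomes one string attribute plus a nominal class attribute. The function names, the relation name and the attribute names are made up for illustration, and labels are assumed to be simple tokens without spaces or commas.

```python
def arff_quote(text):
    """Quote a string value for ARFF: flatten whitespace, escape \\ and '."""
    flat = " ".join(text.split())  # newlines and tabs become single spaces
    return "'" + flat.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_arff(docs, relation="documents"):
    """Build ARFF content from a list of (text, label) pairs.

    Assumption: labels contain no spaces, commas or quotes.
    """
    labels = sorted({label for _, label in docs})
    lines = [
        "@relation " + relation,
        "",
        "@attribute text string",                       # raw document text
        "@attribute class {" + ",".join(labels) + "}",  # nominal class
        "",
        "@data",
    ]
    for text, label in docs:
        lines.append(arff_quote(text) + "," + label)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [("It's a great film", "pos"), ("Boring\nand too long", "neg")]
    print(to_arff(sample))
```

A file like this still holds raw strings; in Weka you would typically run a filter such as StringToWordVector over it afterwards to get word features.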

The problem with MALLET is that training uses gigabytes of memory and can take hours if you have large training sets.

Weka has more documentation, but most of it makes no sense. MALLET has very little documentation but is very simple to use.

To be honest, after testing both of them, I opted to write my own classifier.

Alasdair answered Nov 18 '22 01:11


I'm really enjoying Weka compared to Mallet. Maybe I don't know enough yet, but doing machine learning with a GUI is awesome. You can tweak parameters and run different experiments (keeping the results of past experiments in front of you, too) very easily. I'm new to Weka, so this is FWIW.

As far as which one is simpler to train, I find Weka simpler. I don't know what kind of control you can have over your feature space by just pointing Mallet at some text (maybe it's good enough), but my experience with Mallet was comparable to Weka: I wrote scripts to get the input into the proper format, with the caveat that I had to go through multiple steps to use some kind of serialized version of the data in Mallet.
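For reference, the Mallet side of such a script might look roughly like the sketch below. It assumes Mallet's one-instance-per-line input layout (name, label, then the text, separated by whitespace), which is what its import-file command reads by default as far as I know. The function name and the sample data are hypothetical, and names and labels are assumed to contain no whitespace.

```python
def to_mallet_lines(docs):
    """Build Mallet import-file input from (name, label, text) triples.

    Each instance must fit on one line, so internal newlines and
    repeated whitespace in the text are collapsed to single spaces.
    Assumption: names and labels contain no whitespace.
    """
    lines = []
    for name, label, text in docs:
        flat = " ".join(text.split())
        lines.append(f"{name} {label} {flat}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        ("doc1", "spam", "Buy now\ncheap pills"),
        ("doc2", "ham", "See you at lunch"),
    ]
    with open("train.txt", "w", encoding="utf-8") as handle:
        handle.write(to_mallet_lines(sample))
```

The "multiple steps" would then be something like `bin/mallet import-file --input train.txt --output train.mallet` to produce the serialized file, followed by `bin/mallet train-classifier --input train.mallet --trainer NaiveBayes --output-classifier model.classifier`. Check the exact options against the Mallet version you have installed.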

Regarding your other questions, I can't really answer them right now, but am hoping this answer doesn't get downvoted 'cause it's good information to be out there, anyway.

Walrus the Cat answered Nov 18 '22 01:11