
Weka Attribute Selection

I'm trying to perform attribute selection in Weka. I would like to use InfoGainAttributeEval as the evaluator, because I read that it is equivalent to mutual information, and Ranker as the search method. Should I perform attribute selection on both the training and test sets? Also, how can I choose the correct value for the N parameter?

Thanks a lot for your time,

Nadia

nadia, asked Nov 13 '22 22:11


1 Answer

Applying attribute selection separately to the training and test sets can select different attributes for each, which makes the two sets incompatible. To make sure both sets have the same attributes, apply attribute selection to your whole dataset. Once you have selected the most useful attributes, split the data into a training set and a test set.
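To illustrate the compatibility point, here is a minimal sketch in plain Java (no Weka dependency). It assumes the data is held as rows of strings and that the list of attribute indices to keep has already been chosen. The class name `SharedAttributeProjectionSketch`, the `project` helper and the toy data are hypothetical and only serve the illustration. The idea is that one index list is decided once and then applied unchanged to both sets, so they end up with identical columns.

```java
import java.util.Arrays;

final class SharedAttributeProjectionSketch {

    // Keep only the columns listed in `keep`, in that order.
    // Hypothetical helper: it stands in for whatever applies a finished selection to a dataset.
    static String[][] project(String[][] rows, int[] keep) {
        String[][] out = new String[rows.length][keep.length];
        for (int r = 0; r < rows.length; r++) {
            for (int k = 0; k < keep.length; k++) {
                out[r][k] = rows[r][keep[k]];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy data (made up): two attributes followed by the class label.
        String[][] train = { {"a", "x", "yes"}, {"b", "y", "no"} };
        String[][] test  = { {"a", "y", "yes"} };

        // One selection, decided once: keep attribute 0 and the class column.
        int[] keep = {0, 2};

        // The same index list is applied to both sets, so their columns match.
        System.out.println(Arrays.deepToString(project(train, keep)));
        System.out.println(Arrays.deepToString(project(test, keep)));
    }
}
```

Running selection independently on each set would amount to using two different `keep` arrays, which is exactly the mismatch described above.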

As for the value of -N, I would set it to your total number of attributes. This gives you a ranked list of all attributes, so you can inspect every score yourself. You might then spot a clear threshold separating the attributes that hold useful information for training a classifier from those that add nothing. I would then set that threshold using the -T option.
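The rank-everything-then-threshold idea can be sketched in plain Java, without Weka. This is not Weka's implementation: the class `InfoGainRankingSketch`, its methods and the toy data are hypothetical. It assumes nominal attributes stored as strings, and it assumes an attribute is kept only when its score is strictly greater than the threshold. Information gain is computed as H(class) - H(class | attribute), in bits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InfoGainRankingSketch {

    // Shannon entropy, in bits, of a list of nominal values.
    static double entropy(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / values.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain of column `attr` with respect to the class column:
    // H(class) - H(class | attribute).
    static double infoGain(String[][] rows, int attr, int classIdx) {
        List<String> allLabels = new ArrayList<>();
        Map<String, List<String>> labelsByValue = new HashMap<>();
        for (String[] row : rows) {
            allLabels.add(row[classIdx]);
            labelsByValue.computeIfAbsent(row[attr], k -> new ArrayList<>()).add(row[classIdx]);
        }
        double conditional = 0.0;
        for (List<String> subset : labelsByValue.values()) {
            conditional += (double) subset.size() / rows.length * entropy(subset);
        }
        return entropy(allLabels) - conditional;
    }

    // Indices of the non-class attributes whose score is strictly above `threshold`
    // (an assumption of this sketch), ordered from highest to lowest score.
    static List<Integer> rank(String[][] rows, int classIdx, double threshold) {
        int numColumns = rows[0].length;
        double[] scores = new double[numColumns];
        List<Integer> kept = new ArrayList<>();
        for (int a = 0; a < numColumns; a++) {
            if (a == classIdx) {
                continue;
            }
            scores[a] = infoGain(rows, a, classIdx);
            if (scores[a] > threshold) {
                kept.add(a);
            }
        }
        kept.sort((x, y) -> Double.compare(scores[y], scores[x]));
        return kept;
    }

    public static void main(String[] args) {
        // Toy data (made up): attribute 0 determines the class, attribute 1 is unrelated to it.
        String[][] rows = {
            {"a", "x", "yes"},
            {"a", "y", "yes"},
            {"b", "x", "no"},
            {"b", "y", "no"}
        };
        int classIdx = 2;

        // Step 1: look at every score before deciding on a cut-off.
        for (int a = 0; a < classIdx; a++) {
            System.out.println("attribute " + a + ": " + infoGain(rows, a, classIdx));
        }

        // Step 2: apply the cut-off chosen after inspecting the scores.
        System.out.println("kept: " + rank(rows, classIdx, 0.5));
    }
}
```

Passing a threshold below every score returns the full ranking, which corresponds to the first step of the advice above; raising it afterwards drops the attributes that add nothing.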

Sicco, answered Dec 30 '22 10:12