Resample Filter of WEKA - How to interpret the result

Question

I am currently struggling with a machine learning problem in which I have to deal with highly unbalanced data sets. There are six classes ('1', '2', ..., '6'). Unfortunately, class '1' has 150 examples/instances, class '2' has 90, and class '3' has only 20. The remaining classes can't be "trained" at all, since there are no instances available for them.

So far, I have figured out that WEKA (the machine learning toolkit I am using) provides a supervised "Resample" filter. When I apply this filter with 'noReplacement'=false and 'biasToUniformClass'=1.0, the result is a data set in which the number of instances per class is nice and almost equal (for classes '1' to '3'; the others stay empty).

My question is now: how do WEKA and this filter generate "new"/additional instances for the different classes?

Thank you very much in advance for any hints or suggestions.

Cheers Julian

James · Accepted Answer

It doesn't. It's resampling existing instances. If you have one class-2 instance and ask for a resampling with a bias of 1.0, you can expect N copies of that instance and N instances of each other class for which there is already data.
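To make this concrete, here is a minimal Python sketch of resampling with replacement toward a uniform class distribution, using the class sizes from the question. It is not WEKA's code: the function name, the `(id, label)` instance representation, and the per-class quota formula are assumptions chosen for illustration, so the exact counts WEKA produces may differ.

```python
import random
from collections import Counter


def resample_toward_uniform(instances, bias=1.0, seed=1):
    """Draw a sample with replacement; instances are (id, class_label) pairs.

    Hypothetical helper. The quota formula below is an assumption:
    bias=0.0 keeps the original class sizes, bias=1.0 gives every
    non-empty class an equal share of the total.
    """
    rng = random.Random(seed)
    by_class = {}
    for instance in instances:
        by_class.setdefault(instance[1], []).append(instance)

    total = len(instances)
    active_classes = len(by_class)  # classes with no data never appear here
    sample = []
    for members in by_class.values():
        quota = int((1 - bias) * len(members) + bias * total / active_classes)
        # Every draw returns an existing instance; nothing new is synthesized.
        sample.extend(rng.choice(members) for _ in range(quota))
    return sample


# Class sizes from the question: 150, 90 and 20 instances.
original = [(f"c{label}-{i}", label)
            for label, size in (("1", 150), ("2", 90), ("3", 20))
            for i in range(size)]

resampled = resample_toward_uniform(original)
counts = Counter(label for _, label in resampled)
distinct_class3 = {inst for inst in resampled if inst[1] == "3"}

print(counts)                 # 86 instances in each of the three classes
print(len(distinct_class3))   # never more than the 20 originals
```

Under these assumptions each class ends up with int(260 / 3) = 86 instances, but every one of them is a row that was already in the input. Classes '4' to '6' stay empty because there is nothing to draw from.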

Julian · Answer

WEKA's supervised Resample filter adds instances to a class, but only as copies. It does this by simply adding instances from a class that has only a few instances to the result data set multiple times.

Therefore, the resulting data set is strongly biased with respect to any class for which only a few samples are available: that class is represented by the same few instances repeated many times.
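The amount of duplication can be estimated with a short simulation. The sketch below assumes an equal quota of 86 draws per class (260 instances split across the three non-empty classes) and sampling with replacement; the quota and the variable names are illustrative assumptions, not values read out of WEKA.

```python
import random
from collections import Counter

rng = random.Random(0)
class_sizes = {"1": 150, "2": 90, "3": 20}                  # from the question
quota = sum(class_sizes.values()) // len(class_sizes)      # 260 // 3 = 86 (assumed)

stats = {}
for label, size in class_sizes.items():
    # Count how often each original instance (by index) gets drawn.
    copies = Counter(rng.randrange(size) for _ in range(quota))
    stats[label] = {
        "distinct": len(copies),                  # originals that appear at all
        "max_copies": max(copies.values()),       # most-repeated original
        "mean_copies_per_original": quota / size,
    }

for label, s in stats.items():
    print(label, s)
```

For class '3', 86 draws from 20 originals means each original appears 4.3 times on average, and at least one appears five times or more. Class '1' is affected in the opposite direction: only 86 draws are made from 150 originals, so at least 64 of them are missing from the resampled set.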
