Unseen nominal values in weka

Question

I have a dataset with some nominal features. For these features, the training set contains values that are absent from my test set. For instance, one feature is declared in the training set as

@attribute h4 {br,pl,com,ro,th,np}

and the same feature in the test set has

@attribute h4 {br,pl,abc,th,def,ghi,lmno}

I believe this is why Weka will not let me re-evaluate the model I built on my training set against my test set. Is there a way around this? Am I missing something?

EDIT: I'm using a RandomForest classifier.

Thanks

Gökhan Çoban · Accepted Answer

Weka requires every nominal value that appears in the test set to be declared in the training set as well, because a classifier can only make predictions about values it has had the chance to learn from.

Weka also refers to nominal values by their index in the declared list, so the values of an attribute must appear in the same order in both files to get reliable results.
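To illustrate why the order matters, here is a minimal sketch in plain Python. It does not use Weka's API; the `encode` helper is a hypothetical stand-in that only mimics index-based storage, and the two label lists are copied from the declarations in the question.

```python
# Label lists copied from the two @attribute declarations in the question.
train_labels = ["br", "pl", "com", "ro", "th", "np"]
test_labels = ["br", "pl", "abc", "th", "def", "ghi", "lmno"]


def encode(value, labels):
    """Return the position of value in labels (hypothetical helper that
    mimics storing a nominal value as an index into its declared list)."""
    return labels.index(value)


# The same string gets a different index under each header.
th_in_train = encode("th", train_labels)
th_in_test = encode("th", test_labels)

# Reading the test-set index against the training header gives another label.
misread = train_labels[th_in_test]

print(th_in_train, th_in_test, misread)  # prints: 4 3 ro
```

With mismatched declarations, the index that means `th` in the test file would mean `ro` to a model built on the training file, which is presumably why Weka rejects such a test set instead of evaluating on it.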

In your case, declare one list that covers all the values, in the same order, in both the training set and the test set.

Your combined values {br,pl,com,ro,th,np,abc,def,ghi,lmno} can be used for both the training set and the test set.
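If there are many such attributes, the combined list can be built programmatically rather than by hand. The following is a rough sketch in plain Python, not part of Weka; `merge_labels` and `attribute_line` are hypothetical helper names, and the label lists are assumed to have been read out of the two ARFF headers already.

```python
def merge_labels(train_labels, test_labels):
    """Return each label once: training labels first, in their original
    order, followed by the labels that only occur in the test set."""
    merged = []
    seen = set()
    for label in list(train_labels) + list(test_labels):
        if label not in seen:
            merged.append(label)
            seen.add(label)
    return merged


def attribute_line(name, labels):
    """Format an ARFF nominal attribute declaration."""
    return "@attribute {} {{{}}}".format(name, ",".join(labels))


# Label lists copied from the question.
train_labels = ["br", "pl", "com", "ro", "th", "np"]
test_labels = ["br", "pl", "abc", "th", "def", "ghi", "lmno"]

merged = merge_labels(train_labels, test_labels)
header = attribute_line("h4", merged)
print(header)
# prints: @attribute h4 {br,pl,com,ro,th,np,abc,def,ghi,lmno}
```

The resulting line would replace the `h4` declaration in both files. Since the existing model was built against the old training header, it would most likely need to be rebuilt on the training file with the new declaration before re-evaluating.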

Tags:

machine-learning

supervised-learning

weka

DaTaBomB
