I have two data sets, one for training and one for testing, that I use with Weka. Both have the same number of attributes and the same data type for each variable (numeric or nominal), but they are not compatible with each other because the order of the nominal values is different.
Example - Training set
Occupation
1 Doctor 40%
2 Engineer 40%
3 Teacher 20%
Test set
1 Engineer 40%
2 doctor 40%
3 Teacher 20%
So the two sets are incompatible. How can I change the order of these distinct values to make them compatible?
It looks a bit like a data pre-processing issue. I am quite curious as to how the training and testing data ended up looking like this!
If you would like to change the nominal values, you could use the RenameNominalValues filter to rename the labels in your data. One possible method is to apply it to your testing data, listing the replacements in the valueReplacements field as comma-separated old:new pairs (for example, doctor:Doctor).
This solution assumes that you are dealing with a nominal attribute, that it is your last attribute, and that its values are labelled as listed in the valueReplacements field.
Failing this, depending on the number of cases, you could edit the values manually or use your favourite spreadsheet to replace the values.
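If you would rather script the manual edit, here is a minimal sketch in Python. It assumes both files are in ARFF format (the question does not say so), where the data rows store nominal labels as text, so only the `@attribute` declarations in the header fix the order. The function name `align_arff_header` is my own invention, not part of Weka, and the sketch assumes both files declare the same attributes in the same sequence.

```python
def align_arff_header(train_text: str, test_text: str) -> str:
    """Return the test ARFF text with its @attribute lines replaced by
    the corresponding @attribute lines from the training ARFF text.

    Hypothetical helper, not a Weka API. Assumes both files declare the
    same attributes in the same sequence and differ only in how the
    nominal labels are ordered inside the braces.
    """
    # Collect the attribute declarations from the training header.
    train_attrs = [
        line for line in train_text.splitlines()
        if line.strip().lower().startswith("@attribute")
    ]

    out = []
    used = 0          # how many training declarations have been copied
    in_data = False   # becomes True once the @data marker is reached
    for line in test_text.splitlines():
        key = line.strip().lower()
        if key.startswith("@data"):
            in_data = True
        if not in_data and key.startswith("@attribute"):
            if used >= len(train_attrs):
                raise ValueError("test file declares more attributes than training file")
            out.append(train_attrs[used])  # take the training declaration
            used += 1
        else:
            out.append(line)               # keep everything else unchanged

    if used != len(train_attrs):
        raise ValueError("training file declares more attributes than test file")
    return "\n".join(out) + "\n"
```

You would read both files, pass their contents to the function, and write the result to a new test file. Note that this only fixes the order of the declared labels. Labels that differ in spelling or case between the two files (doctor versus Doctor in the example above) would still have to be renamed first.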
Hope this helps!