 

WEKA : Cost Matrix Interpretation

Tags:

weka

How do we interpret the cost matrix in WEKA? If I have 2 classes to predict (class 0 and class 1) and want to penalize classification of class 0 as class 1 more heavily (say, double the penalty), what exactly is the matrix format?

Is it:

 0 10
20  0

or is it

 0 20
10  0

The source of confusion is the following two references:

1) The JavaDoc for Weka CostMatrix says:

The element at position i,j in the matrix is the penalty for classifying an instance of class j as class i.

2) However, the answer in this post seems to indicate otherwise.

http://weka.8497.n7.nabble.com/cost-matrix-td5821.html

Given the first cost matrix, the post says "Misclassifying an instance of class 0 incurs a cost of 10. Misclassifying an instance of class 1 is twice as costly."

Thanks.

user2549371 asked Jul 04 '13 07:07


1 Answer

I know my answer is coming very late, but it might help somebody so here it is:

To boost the cost of classifying an item of class 0 as class 1, the correct format is the second one.

The evidence:

Cost Matrix I used:

 0        1.0
 1000.0   0

Confusion matrix (from cross-validation):

   a   b   <-- classified as
 565  20 |   a = ignored
  54 204 |   b = not_ignored

Cross-validation output:

...
Total Cost                           54020
...

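That's a cost of 54 * 1000 + 20 * 1 = 54020, which matches the confusion matrix above: the 1000 in the second row, first column is charged for the 54 instances of class b that were classified as a. So the rows of the cost matrix are the actual classes and the columns are the predicted classes, the same layout as the confusion matrix.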
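The arithmetic can be checked with a small sketch. This is plain Python, not WEKA code; the function name total_cost is made up for illustration, and the sketch assumes the total cost is simply the cell-by-cell product of the cost matrix and the confusion matrix, summed. It also computes the total under the opposite reading (rows = predicted class) to show that only the rows = actual class reading reproduces the reported 54020.

```python
def total_cost(cost, confusion):
    """Sum cost[i][j] * confusion[i][j] over all cells (hypothetical helper)."""
    return sum(
        cost[i][j] * confusion[i][j]
        for i in range(len(cost))
        for j in range(len(cost[i]))
    )

# Cost matrix from the answer, read as cost[actual][predicted] (assumed orientation).
cost_matrix = [
    [0.0, 1.0],      # actual a: predicted a, predicted b
    [1000.0, 0.0],   # actual b: predicted a, predicted b
]

# Confusion matrix from the cross-validation output: rows actual, columns predicted.
confusion_matrix = [
    [565, 20],
    [54, 204],
]

# Reading 1: rows of the cost matrix are actual classes.
print(total_cost(cost_matrix, confusion_matrix))   # 54020.0

# Reading 2: rows are predicted classes, i.e. the transposed cost matrix.
transposed = [list(row) for row in zip(*cost_matrix)]
print(total_cost(transposed, confusion_matrix))    # 20054.0
```

Only the first reading gives the Total Cost that WEKA printed, which is why the second matrix in the question (20 in the first row, second column) is the one that penalizes class 0 classified as class 1 more heavily.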

Elhu answered Oct 04 '22 03:10