
The best way to calculate classification accuracy?

I know one formula to calculate classification accuracy is accuracy = t / n * 100, where t is the number of correct classifications and n is the total number of samples.

But let's say we have 100 samples in total: 80 in class A, 10 in class B, and 10 in class C.

Scenario 1: all 100 samples were assigned to class A. Using the formula, we get an accuracy of 80%.

Scenario 2: the 10 samples belonging to B were correctly assigned to class B; the 10 samples belonging to C were correctly assigned to class C; 30 of the samples belonging to A were correctly assigned to class A; the remaining 50 samples belonging to A were incorrectly assigned to C. Using the formula, we get an accuracy of 50%.
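
For concreteness, here is a small sketch (the label lists are made up to match the counts above) that reproduces both numbers:

```python
# Made-up label lists matching the counts above: 80 x A, 10 x B, 10 x C.
y_true = ["A"] * 80 + ["B"] * 10 + ["C"] * 10

# Scenario 1: every sample is assigned to class A.
y_pred_1 = ["A"] * 100

# Scenario 2: 30 A correct, 50 A misassigned to C, all B and C correct.
y_pred_2 = ["A"] * 30 + ["C"] * 50 + ["B"] * 10 + ["C"] * 10

def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) * 100

print(accuracy(y_true, y_pred_1))  # 80.0
print(accuracy(y_true, y_pred_2))  # 50.0
```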

My question is:

1: Can we say scenario 1 has a higher accuracy rate than scenario 2?

2: Is there a better way to calculate the accuracy rate for a classification problem?

Many thanks in advance!

asked Oct 18 '25 by SimpleDreamful


1 Answer

Classification accuracy is defined as "percentage of correct predictions". That is the case regardless of the number of classes. Thus, scenario 1 has a higher classification accuracy than scenario 2.

However, it sounds like what you are really asking is for an alternative evaluation metric or process that "rewards" scenario 2 for only making certain types of mistakes. I have two suggestions:

  1. Create a confusion matrix: It describes the performance of a classifier so that you can see what types of errors your classifier is making.
  2. Calculate the precision, recall, and F1 score for each class. The average F1 score might be the single-number metric you are looking for. A sketch computing both of these follows below.
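
As a rough sketch (reusing hypothetical label lists that match Scenario 2 from the question), scikit-learn can produce both in a couple of lines:

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical labels matching Scenario 2 from the question:
# 80 samples of A, 10 of B, 10 of C; 50 of the A samples predicted as C.
y_true = ["A"] * 80 + ["B"] * 10 + ["C"] * 10
y_pred = ["A"] * 30 + ["C"] * 50 + ["B"] * 10 + ["C"] * 10

# Rows are true classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred, labels=["A", "B", "C"]))

# Per-class precision, recall, and F1, plus macro and weighted averages.
print(classification_report(y_true, y_pred, labels=["A", "B", "C"]))
```

With these labels, the macro-averaged F1 comes out noticeably higher for scenario 2 than for scenario 1, which never predicts B or C at all.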

The Classification metrics section of the scikit-learn documentation has lots of good information about classifier evaluation, even if you are not a scikit-learn user.

answered Oct 20 '25 by Kevin Markham