
Why is the scikit-learn confusion matrix reversed?

I have 3 questions:

1)

The confusion matrix for sklearn is as follows:

TN | FP
FN | TP

But when I look at online resources, I find it laid out like this:

TP | FP
FN | TN

Which one should I consider?

2)

Since the scikit-learn confusion matrix above differs from the one I find in other resources, what will the structure be for a multiclass confusion matrix? I'm looking at this post: Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative. In that post, @lucidv01d posted a graph explaining the categories for the multiclass case. Are those categories the same in scikit-learn?

3)

How do you calculate the accuracy of a multiclass classifier? For example, I have this confusion matrix:

[[27  6  0 16]
 [ 5 18  0 21]
 [ 1  3  6  9]
 [ 0  0  0 48]]

In that same post I referred to in question 2, he has written this equation:

Overall accuracy

ACC = (TP+TN)/(TP+FP+FN+TN)

But isn't that just for the binary case? I mean, which class's counts do I use for TP?

John Sall asked May 10 '19 12:05



1 Answer

sklearn lays out its confusion matrix as

TN | FP
FN | TP

because rows are the actual classes and columns are the predicted classes, both ordered by sorted class value. With class values 0 and 1, the smaller value 0 comes first and is treated as the negative class, and the larger value 1 is treated as the positive class. The order you see therefore depends on the class values in your dataset.
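As a rough illustration of that ordering, here is a minimal sketch. The `y_true` and `y_pred` lists are made-up data, not from the question. It uses `confusion_matrix` from `sklearn.metrics` and its `labels` parameter, which sets the row and column order explicitly.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only: 0 = negative, 1 = positive.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

# Default: rows are actual classes, columns are predicted classes,
# both in sorted label order (0 first, then 1).
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)              # [[TN FP]
                       #  [FN TP]]
print(tn, fp, fn, tp)

# Listing the positive label first puts TP in the top-left corner.
# Rows are still actual classes, so this gives [[TP FN], [FP TN]].
cm_flipped = confusion_matrix(y_true, y_pred, labels=[1, 0])

# Transposing makes rows the predicted classes, which matches the
# TP | FP / FN | TN layout quoted in the question.
print(cm_flipped.T)
```

Both layouts carry the same four counts, so either can be used as long as you know which axis holds the actual classes and which label comes first.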

The accuracy is the sum of the diagonal elements divided by the sum of all the elements. The diagonal elements are the numbers of correct predictions for each class, so this works for any number of classes.
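Applied to the 4-class matrix from the question, that rule can be sketched with NumPy as below. The per-class TP/FP/FN/TN part is an addition that assumes sklearn's convention (rows are actual classes, columns are predicted classes) and treats each class in turn as "positive" against all the others. The variable names are illustrative.

```python
import numpy as np

# Confusion matrix from the question (rows = actual, columns = predicted).
cm = np.array([[27,  6, 0, 16],
               [ 5, 18, 0, 21],
               [ 1,  3, 6,  9],
               [ 0,  0, 0, 48]])

# Overall accuracy: correct predictions (diagonal) over all predictions.
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # 99 / 160 = 0.61875

# One-vs-rest counts for each class.
tp = np.diag(cm)                 # predicted as the class and actually the class
fp = cm.sum(axis=0) - tp         # predicted as the class, actually another
fn = cm.sum(axis=1) - tp         # actually the class, predicted as another
tn = cm.sum() - (tp + fp + fn)   # everything else
print(tp, fp, fn, tn)
```

The binary formula ACC = (TP+TN)/(TP+FP+FN+TN) evaluated with these per-class counts gives a one-vs-rest accuracy for each class, which is not the same quantity as the overall accuracy computed from the diagonal.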

secretive answered Nov 01 '22 08:11