 

Is there an easy way to get confusion matrix for multiclass classification? (OneVsRest)

I am using a OneVsRest classifier on a three-class classification problem (three random forests). The occurrence of each class is encoded by a dummy integer (1 for occurrence, 0 otherwise). Is there an easy way to create a confusion matrix for this? Every approach I have come across takes arguments of the form y_pred, y_train = array, shape = [n_samples]. Ideally, I would like to pass y_pred, y_train = array, shape = [n_samples, n_classes].

A sample similar in structure to the problem:

import numpy as np
from sklearn import metrics

y_train = np.array([(1,0,0), (1,0,0), (0,0,1), (1,0,0), (0,1,0)])
y_pred = np.array([(1,0,0), (0,1,0), (0,0,1), (0,1,0), (1,0,0)])

print(metrics.confusion_matrix(y_train, y_pred))

This raises: ValueError: multilabel-indicator is not supported

Gediminas Sadaunykas asked Dec 07 '16 09:12



2 Answers

I don't know what you have in mind since you didn't specify the output you're looking for, but here are two ways you could go about it:

1. One confusion matrix per column

In [1]:
for i in range(y_train.shape[1]):
    print("Col {}".format(i))
    print(metrics.confusion_matrix(y_train[:,i], y_pred[:,i]))
    print("")

Out[1]:
Col 0
[[1 1]
 [2 1]]

Col 1
[[2 2]
 [1 0]]

Col 2
[[4 0]
 [0 1]]

2. One confusion matrix altogether

For this, we are going to flatten the arrays:

In [2]: print(metrics.confusion_matrix(y_train.flatten(), y_pred.flatten()))

Out[2]:
[[7 3]
 [3 2]]
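
As a possible shortcut for option 1: recent scikit-learn releases (0.21 and later, to the best of my knowledge) include `multilabel_confusion_matrix`, which accepts the `[n_samples, n_classes]` indicator arrays directly and returns one 2 x 2 matrix per column. The sketch below assumes such a version is installed and reuses the sample arrays from the question. Summing the result over the first axis should reproduce the flattened matrix from option 2.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Sample data from the question: one indicator column per class.
y_train = np.array([(1,0,0), (1,0,0), (0,0,1), (1,0,0), (0,1,0)])
y_pred = np.array([(1,0,0), (0,1,0), (0,0,1), (0,1,0), (1,0,0)])

# Result has shape (n_classes, 2, 2); each slice is [[TN, FP], [FN, TP]].
per_class = multilabel_confusion_matrix(y_train, y_pred)

for i, cm in enumerate(per_class):
    print("Col {}".format(i))
    print(cm)
    print("")

# Adding the per-column matrices gives the same counts as flattening.
combined = per_class.sum(axis=0)
print(combined)
```

Each slice follows the same layout as `metrics.confusion_matrix` on a single binary column, so the numbers can be compared directly with Out[1].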
Julien Marrec answered Oct 20 '22 18:10


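You can get the full multiclass matrix in one go by converting each one-hot row to a class index with argmax (here y_test holds the true labels in the same one-hot layout):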

from sklearn.metrics import confusion_matrix
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))

This gives output like the following (taken from a four-class example):

array([[ 7,  0,  0,  0],
       [ 0,  7,  0,  0],
       [ 0,  1,  2,  4],
       [ 0,  1,  0, 11]])  

The diagonal entries count the correctly predicted samples for each class; the off-diagonal entries are misclassifications.
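
Applied to the three-class sample from the question, the argmax approach might look like the sketch below. One caveat, stated as an assumption about the data: argmax only recovers the intended class when every row has exactly one 1. A row of all zeros, or a row with several ones, silently maps to the first maximal column, so the sketch also lists such rows. The names `true_idx`, `pred_idx` and `ambiguous` are illustrative only.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Sample data from the question (true labels are called y_train there).
y_train = np.array([(1,0,0), (1,0,0), (0,0,1), (1,0,0), (0,1,0)])
y_pred = np.array([(1,0,0), (0,1,0), (0,0,1), (0,1,0), (1,0,0)])

# Convert one-hot rows to class indices.
true_idx = y_train.argmax(axis=1)
pred_idx = y_pred.argmax(axis=1)

# Fix the label order so the matrix is always 3 x 3,
# even if some class never appears in a given split.
cm = confusion_matrix(true_idx, pred_idx, labels=[0, 1, 2])
print(cm)

# Rows of y_pred that are not exactly one-hot; argmax is unreliable for these.
ambiguous = np.flatnonzero(y_pred.sum(axis=1) != 1)
print(ambiguous)
```

Rows of `cm` are true classes and columns are predicted classes, so `cm[0, 1]` counts samples of class 0 that were predicted as class 1.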

Amaresh answered Oct 20 '22 18:10