
How to compute a confusion matrix for multiclass classification in scikit-learn?

I have a multiclass classification task. When I run my script, based on the scikit-learn example, as follows:

from sklearn.multiclass import OneVsRestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix

classifier = OneVsRestClassifier(GradientBoostingClassifier(n_estimators=70, max_depth=3, learning_rate=.02))

y_pred = classifier.fit(X_train, y_train).predict(X_test)
cnf_matrix = confusion_matrix(y_test, y_pred)

I get this error:

File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 242, in confusion_matrix
    raise ValueError("%s is not supported" % y_type)
ValueError: multilabel-indicator is not supported

I tried passing labels=classifier.classes_ to confusion_matrix(), but it doesn't help.

y_test and y_pred are as follows:

y_test =
array([[0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 1, 0],
       [0, 1, 0, 0, 0, 0],
       ...,
       [0, 0, 0, 0, 0, 1],
       [0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 1, 0]])

y_pred =
array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       ...,
       [0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0]])
asked Apr 27 '17 by YNR


People also ask

How do you calculate multiclass of a confusion matrix?

A confusion matrix gives a comparison between actual and predicted values. The confusion matrix is an N x N matrix, where N is the number of classes or outputs. For 2 classes, we get a 2 x 2 confusion matrix; for 3 classes, we get a 3 x 3 confusion matrix.

How do you calculate accuracy from confusion matrix for multiclass?

Accuracy is one of the most popular metrics in multi-class classification, and it is computed directly from the confusion matrix. The accuracy formula puts the sum of the true positive and true negative elements (the diagonal of the matrix) in the numerator, and the sum of all entries of the confusion matrix in the denominator.
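That formula can be sketched in a few lines of NumPy; the 3x3 matrix below is made-up data for illustration only:

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = actual, columns = predicted)
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 1, 4]])

# Accuracy = sum of the diagonal (correct predictions) / sum of all entries
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # 0.75
```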

What is Confusion_matrix in Scikit learn?

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.
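A minimal sketch of scikit-learn's confusion_matrix on class labels (the y_true/y_pred values here are invented for illustration; the labels parameter fixes the row/column order):

```python
from sklearn.metrics import confusion_matrix

# Invented example labels, not from the question above
y_true = ['cat', 'dog', 'cat', 'house', 'dog', 'dog']
y_pred = ['cat', 'dog', 'dog', 'house', 'dog', 'cat']

# Rows are actual classes, columns are predicted classes,
# both ordered as given in `labels`
cm = confusion_matrix(y_true, y_pred, labels=['cat', 'dog', 'house'])
print(cm)
```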


2 Answers

This worked for me:

import numpy as np
from sklearn.metrics import confusion_matrix

y_test_non_category = [np.argmax(t) for t in y_test]
y_predict_non_category = [np.argmax(t) for t in y_predict]

conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)

where y_test and y_predict are categorical variables, i.e. one-hot (indicator) arrays.
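The same conversion can be done without a Python loop by using NumPy's axis argument on argmax. A self-contained sketch with small made-up one-hot arrays standing in for y_test and y_predict:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up one-hot arrays standing in for the real y_test / y_predict
y_test = np.array([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])
y_predict = np.array([[0, 0, 1],
                      [1, 0, 0],
                      [1, 0, 0]])

# argmax along axis=1 maps each one-hot row to its class index
conf_mat = confusion_matrix(y_test.argmax(axis=1), y_predict.argmax(axis=1))
print(conf_mat)
```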

answered Sep 30 '22 by Azhar Khan


First you need to create the label output array. Let's say you have 3 classes, 'cat', 'dog', 'house', indexed 0, 1, 2, and the prediction for 2 samples is 'dog', 'house'. Your output will be:

y_pred = [[0, 1, 0], [0, 0, 1]]

Run y_pred.argmax(1) to get [1, 2]. This array holds the original label indexes, meaning ['dog', 'house']:

import numpy as np
from keras.utils import np_utils

num_classes = 3

# from label to categorical (one-hot)
y_prediction = np.array([1, 2])
y_categorical = np_utils.to_categorical(y_prediction, num_classes)

# from categorical back to label indexing
y_pred = y_categorical.argmax(1)
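If Keras isn't available, the same label-to-one-hot round trip can be reproduced with plain NumPy. A sketch assuming integer class labels, where indexing an identity matrix plays the role of to_categorical:

```python
import numpy as np

num_classes = 3
y_prediction = np.array([1, 2])

# label index -> one-hot row (stands in for keras' to_categorical)
y_categorical = np.eye(num_classes, dtype=int)[y_prediction]
print(y_categorical)  # [[0 1 0]
                      #  [0 0 1]]

# one-hot row -> label index
y_back = y_categorical.argmax(axis=1)
print(y_back)  # [1 2]
```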
answered Sep 30 '22 by Naomi Fridman