I am working with a multi-class multi-label output from my classifier. The total number of classes is 14 and instances can have multiple classes associated. For example:
y_true = np.array([[0,0,1], [1,1,0],[0,1,0]])
y_pred = np.array([[0,0,1], [1,0,1],[1,0,0]])
The way I am making my confusion matrix right now:
matrix = confusion_matrix(y_true.argmax(axis=1), y_pred.argmax(axis=1))
print(matrix)
Which gives an output like:
[[ 79 0 0 0 66 0 0 151 1 8 0 0 0 0]
[ 4 0 0 0 11 0 0 27 0 0 0 0 0 0]
[ 14 0 0 0 21 0 0 47 0 1 0 0 0 0]
[ 1 0 0 0 4 0 0 25 0 0 0 0 0 0]
[ 18 0 0 0 50 0 0 63 0 3 0 0 0 0]
[ 4 0 0 0 3 0 0 19 0 0 0 0 0 0]
[ 2 0 0 0 3 0 0 11 0 2 0 0 0 0]
[ 22 0 0 0 20 0 0 138 1 5 0 0 0 0]
[ 12 0 0 0 9 0 0 38 0 1 0 0 0 0]
[ 10 0 0 0 3 0 0 40 0 4 0 0 0 0]
[ 3 0 0 0 3 0 0 14 0 3 0 0 0 0]
[ 0 0 0 0 2 0 0 3 0 0 0 0 0 0]
[ 2 0 0 0 11 0 0 32 0 0 0 0 0 0]
[ 0 0 0 0 0 0 0 3 0 0 0 0 0 7]]
Now, I am not sure if the confusion matrix from sklearn is capable of handling multi-label multi-class data. Could someone help me with this?
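A quick check on the toy arrays from the question suggests why the argmax approach loses information: argmax keeps only the first maximal column per row, so a row with several labels collapses to a single class. This is only a sketch; the variable names true_idx, pred_idx, row1_match_after_argmax and row1_match_full are made up for illustration.

```python
import numpy as np

y_true = np.array([[0, 0, 1], [1, 1, 0], [0, 1, 0]])
y_pred = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 0]])

# argmax returns the index of the FIRST maximum in each row,
# so every multi-label row is reduced to one class index.
true_idx = y_true.argmax(axis=1)
pred_idx = y_pred.argmax(axis=1)
print(true_idx)  # [2 0 1]
print(pred_idx)  # [2 0 0]

# Row 1 is [1,1,0] vs [1,0,1]: the label sets differ,
# yet both collapse to class 0 and look like a correct prediction.
row1_match_after_argmax = bool(true_idx[1] == pred_idx[1])
row1_match_full = bool(np.array_equal(y_true[1], y_pred[1]))
print(row1_match_after_argmax, row1_match_full)  # True False
```

So the 14x14 matrix above describes a single-label problem derived from the data, not the multi-label predictions themselves.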
You need to generate one binary confusion matrix per label, since what you have is essentially a set of independent binary labels.
Something along the lines of:
import numpy as np
from sklearn.metrics import confusion_matrix
y_true = np.array([[0,0,1], [1,1,0],[0,1,0]])
y_pred = np.array([[0,0,1], [1,0,1],[1,0,0]])
labels = ["A", "B", "C"]
conf_mat_dict = {}
# Build one binary confusion matrix per label column
for label_col in range(len(labels)):
    y_true_label = y_true[:, label_col]
    y_pred_label = y_pred[:, label_col]
    conf_mat_dict[labels[label_col]] = confusion_matrix(y_pred=y_pred_label, y_true=y_true_label)
# Print each matrix together with its label name
for label, matrix in conf_mat_dict.items():
    print("Confusion matrix for label {}:".format(label))
    print(matrix)
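One caveat with the per-column loop: if a label column contains only one value in both y_true and y_pred, confusion_matrix returns a 1x1 matrix. Passing labels=[0, 1] keeps every matrix 2x2, which also makes it safe to unpack the four counts with ravel(). A minimal sketch on the same toy data (the counts dictionary is just an illustrative name):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([[0, 0, 1], [1, 1, 0], [0, 1, 0]])
y_pred = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 0]])
labels = ["A", "B", "C"]

counts = {}
for col, name in enumerate(labels):
    # labels=[0, 1] forces a 2x2 result even when a column holds a single value
    tn, fp, fn, tp = confusion_matrix(
        y_true[:, col], y_pred[:, col], labels=[0, 1]
    ).ravel()
    counts[name] = {"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp)}

for name, c in counts.items():
    print(name, c)
```

Each binary matrix is laid out as [[tn, fp], [fn, tp]], with rows for the true value and columns for the predicted value.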
As of scikit-learn 0.21, you can use sklearn.metrics.multilabel_confusion_matrix:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.multilabel_confusion_matrix.html
Here we try to predict two labels for each example:
import numpy as np
import sklearn.metrics as skm
y_true = np.array([
[0,0], [0,1], [1,1], [0,1], [0,1], [1,1]
])
y_pred = np.array([
[1,1], [0,1], [0,1], [1,0], [0,1], [1,1]
])
cm = skm.multilabel_confusion_matrix(y_true, y_pred)
print(cm)
print(skm.classification_report(y_true, y_pred))
Confusion matrix for labels:
[[[2 2]
[1 1]]
[[0 1]
[1 4]]]
Classification report:
              precision    recall  f1-score   support

           0       0.33      0.50      0.40         2
           1       0.80      0.80      0.80         5

   micro avg       0.62      0.71      0.67         7
   macro avg       0.57      0.65      0.60         7
weighted avg       0.67      0.71      0.69         7
 samples avg       0.67      0.58      0.61         7
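To see how the two outputs relate, the per-label precision and recall in the report can be recomputed from the stacked matrices, where each 2x2 block is [[tn, fp], [fn, tp]]. This is a rough sketch on the same data; it assumes every label has at least one predicted and one true positive, otherwise the divisions below would hit zero.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

y_true = np.array([[0, 0], [0, 1], [1, 1], [0, 1], [0, 1], [1, 1]])
y_pred = np.array([[1, 1], [0, 1], [0, 1], [1, 0], [0, 1], [1, 1]])

cm = multilabel_confusion_matrix(y_true, y_pred)  # shape (n_labels, 2, 2)

# Slice the four counts out for all labels at once
tn, fp, fn, tp = cm[:, 0, 0], cm[:, 0, 1], cm[:, 1, 0], cm[:, 1, 1]

# Assumes tp + fp > 0 and tp + fn > 0 for every label
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(np.round(precision, 2))  # [0.33 0.8 ]
print(np.round(recall, 2))     # [0.5 0.8]
```

These match the rows for labels 0 and 1 in the classification report.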