I'm doing binary classification. Whenever my predictions all equal the ground truth, sklearn.metrics.confusion_matrix returns a single value. Isn't that a problem?
from sklearn.metrics import confusion_matrix
print(confusion_matrix([True, True], [True, True]))
# [[2]]
I would expect something like:
[[2 0]
[0 0]]
What is a confusion matrix? It is a table used in classification problems to show where the model made errors. The rows represent the actual classes, and the columns represent the predicted classes.
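To make the row/column convention concrete, here is a small sketch with made-up data (the y_true and y_pred values are purely illustrative). With no labels argument, scikit-learn sorts the labels it finds, so False comes before True:

```python
from sklearn.metrics import confusion_matrix

# Illustrative data only: one actual True is wrongly predicted as False.
y_true = [True, False, True, True]
y_pred = [True, False, False, True]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# Rows are actual classes, columns are predicted classes,
# both in sorted label order (False, then True).
# cm[1, 0] counts samples that are actually True but predicted False.
```

Here the off-diagonal entry cm[1, 0] is the single misclassified sample, and the diagonal holds the correct predictions.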
You should pass labels=[True, False]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true=[True, True], y_pred=[True, True], labels=[True, False])
print(cm)
# [[2 0]
# [0 0]]
From the docs, the output of confusion_matrix(y_true, y_pred) is:
C: ndarray of shape (n_classes, n_classes)
The variable n_classes is either the number of unique values that appear in y_true or y_pred, or the length of labels when that argument is given.
In your case, because you did not pass labels, n_classes is inferred from the number of unique values in [True, True], which is 1. Hence the 1x1 result.
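A quick way to see this for yourself is to compare the shapes with and without labels. This is only a sketch using the same inputs as the question; the variable names are mine. Note that recent scikit-learn versions may also emit a warning in the single-label case.

```python
from sklearn.metrics import confusion_matrix

y_true = [True, True]
y_pred = [True, True]

# No labels given: only one unique value is seen, so the matrix is 1x1.
inferred = confusion_matrix(y_true, y_pred)

# Labels given explicitly: the matrix has one row/column per listed label,
# in the order listed, even for labels that never occur in the data.
explicit = confusion_matrix(y_true, y_pred, labels=[True, False])

print(inferred.shape)  # (1, 1)
print(explicit.shape)  # (2, 2)
```

The order of labels decides the layout. If you want the usual tn, fp, fn, tp order from cm.ravel(), pass labels=[False, True] instead, which puts the count of 2 in the bottom-right cell.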