How do you compute the true- and false-positive rates of a multi-class classification problem? Say,
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
The confusion matrix can be computed with metrics.confusion_matrix(y_true, y_prediction), but that just shifts the problem.
EDIT after @seralouk's answer: here, the class -1 is to be considered as the negative, while 0 and 1 are variations of positives.
The hit rate (true positive rate, TPR_i) is defined as rater i's positive response when the correct answer is positive (X_ik = 1 and Z_k = 1), and the false alarm rate (false positive rate, FPR_i) is defined as a positive response when the correct answer is negative (X_ik = 1 and Z_k = 0).
A false positive is an outcome where the model incorrectly predicts the positive class, and a false negative is an outcome where the model incorrectly predicts the negative class. The metrics below are derived from these four outcomes (TP, FP, FN, TN).
The false negative rate (miss rate) is calculated as FN/(FN+TP), where FN is the number of false negatives and TP is the number of true positives (FN+TP being the total number of actual positives). The true positive rate (TPR, also called sensitivity or recall) is calculated as TP/(TP+FN).
The TPR defines how many correct positive results occur among all positive samples available during the test. FPR, on the other hand, defines how many incorrect positive results occur among all negative samples available during the test.
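A minimal binary sketch of these two formulas (the labels below are made up purely for illustration):

from sklearn.metrics import confusion_matrix

# hypothetical binary example, only to illustrate the two formulas above
y_true_bin = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred_bin = [0, 1, 1, 0, 1, 0, 1, 1]

# for a binary problem, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_bin, y_pred_bin).ravel()
tpr = tp / (tp + fn)  # true positive rate (sensitivity): 0.75
fpr = fp / (fp + tn)  # false positive rate (fall-out): 0.5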
Using your data, you can get all the metrics for all the classes at once:
import numpy as np
from sklearn.metrics import confusion_matrix
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
cnf_matrix = confusion_matrix(y_true, y_prediction)
print(cnf_matrix)
#[[1 1 3]
# [3 2 2]
# [1 3 1]]
FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)
FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)
# Sensitivity, hit rate, recall, or true positive rate
TPR = TP/(TP+FN)
# Specificity or true negative rate
TNR = TN/(TN+FP)
# Precision or positive predictive value
PPV = TP/(TP+FP)
# Negative predictive value
NPV = TN/(TN+FN)
# Fall out or false positive rate
FPR = FP/(FP+TN)
# False negative rate
FNR = FN/(TP+FN)
# False discovery rate
FDR = FP/(TP+FP)
# Overall accuracy
ACC = (TP+TN)/(TP+FP+FN+TN)
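For example, you can print the two rates per class (scikit-learn orders the confusion matrix by the sorted labels, i.e. -1, 0, 1):

for label, tpr, fpr in zip([-1, 0, 1], TPR, FPR):
    print(f"class {label}: TPR = {tpr:.5f}, FPR = {fpr:.5f}")
# class -1: TPR = 0.20000, FPR = 0.33333
# class 0: TPR = 0.28571, FPR = 0.40000
# class 1: TPR = 0.20000, FPR = 0.41667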
For the general case with many classes, each class is treated in turn as the positive class and all remaining classes as negatives (one-vs-rest); the row and column sums above implement exactly that.
Another simple option is PyCM (written by me), which supports multi-class confusion matrix analysis.
Applied to your problem:
>>> from pycm import ConfusionMatrix
>>> y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
>>> y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
>>> cm = ConfusionMatrix(actual_vector=y_true,predict_vector=y_prediction)
>>> print(cm)
Predict -1 0 1
Actual
-1 1 1 3
0 3 2 2
1 1 3 1
Overall Statistics :
95% CI (0.03365,0.43694)
Bennett_S -0.14706
Chi-Squared None
Chi-Squared DF 4
Conditional Entropy None
Cramer_V None
Cross Entropy 1.57986
Gwet_AC1 -0.1436
Joint Entropy None
KL Divergence 0.01421
Kappa -0.15104
Kappa 95% CI (-0.45456,0.15247)
Kappa No Prevalence -0.52941
Kappa Standard Error 0.15485
Kappa Unbiased -0.15405
Lambda A 0.2
Lambda B 0.27273
Mutual Information None
Overall_ACC 0.23529
Overall_RACC 0.33564
Overall_RACCU 0.33737
PPV_Macro 0.23333
PPV_Micro 0.23529
Phi-Squared None
Reference Entropy 1.56565
Response Entropy 1.57986
Scott_PI -0.15405
Standard Error 0.10288
Strength_Of_Agreement(Altman) Poor
Strength_Of_Agreement(Cicchetti) Poor
Strength_Of_Agreement(Fleiss) Poor
Strength_Of_Agreement(Landis and Koch) Poor
TPR_Macro 0.22857
TPR_Micro 0.23529
Class Statistics :
Classes -1 0 1
ACC(Accuracy) 0.52941 0.47059 0.47059
BM(Informedness or bookmaker informedness) -0.13333 -0.11429 -0.21667
DOR(Diagnostic odds ratio) 0.5 0.6 0.35
ERR(Error rate) 0.47059 0.52941 0.52941
F0.5(F0.5 score) 0.2 0.32258 0.17241
F1(F1 score - harmonic mean of precision and sensitivity) 0.2 0.30769 0.18182
F2(F2 score) 0.2 0.29412 0.19231
FDR(False discovery rate) 0.8 0.66667 0.83333
FN(False negative/miss/type 2 error) 4 5 4
FNR(Miss rate or false negative rate) 0.8 0.71429 0.8
FOR(False omission rate) 0.33333 0.45455 0.36364
FP(False positive/type 1 error/false alarm) 4 4 5
FPR(Fall-out or false positive rate) 0.33333 0.4 0.41667
G(G-measure geometric mean of precision and sensitivity) 0.2 0.30861 0.18257
LR+(Positive likelihood ratio) 0.6 0.71429 0.48
LR-(Negative likelihood ratio) 1.2 1.19048 1.37143
MCC(Matthews correlation coefficient) -0.13333 -0.1177 -0.20658
MK(Markedness) -0.13333 -0.12121 -0.19697
N(Condition negative) 12 10 12
NPV(Negative predictive value) 0.66667 0.54545 0.63636
P(Condition positive) 5 7 5
POP(Population) 17 17 17
PPV(Precision or positive predictive value) 0.2 0.33333 0.16667
PRE(Prevalence) 0.29412 0.41176 0.29412
RACC(Random accuracy) 0.08651 0.14533 0.10381
RACCU(Random accuracy unbiased) 0.08651 0.14619 0.10467
TN(True negative/correct rejection) 8 6 7
TNR(Specificity or true negative rate) 0.66667 0.6 0.58333
TON(Test outcome negative) 12 11 11
TOP(Test outcome positive) 5 6 6
TP(True positive/hit) 1 2 1
TPR(Sensitivity, recall, hit rate, or true positive rate) 0.2 0.28571 0.2
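The per-class statistics can also be read programmatically; as far as I recall, each statistic is exposed as an attribute holding a dictionary keyed by class (attribute names below assumed from the PyCM documentation):

>>> cm.TPR  # per-class sensitivity -> {-1: 0.2, 0: 0.28571..., 1: 0.2}
>>> cm.FPR  # per-class fall-out -> {-1: 0.33333..., 0: 0.4, 1: 0.41666...}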
Since there are several ways to solve this, and none of them is really generic (see https://stats.stackexchange.com/questions/202336/true-positive-false-negative-true-negative-false-positive-definitions-for-mul?noredirect=1&lq=1 and https://stats.stackexchange.com/questions/51296/how-do-you-calculate-precision-and-recall-for-multiclass-classification-using-co#51301), here is the solution that seems to be used in the paper I was asking about:
to count confusion between two foreground pages as false positive
So the solution is to import numpy as np, use y_true and y_prediction as np.array, and then:
# confusion between two positive classes, or a positive prediction for a true
# negative, counts as a false positive
FP = np.logical_and(y_true != y_prediction, y_prediction != -1).sum()  # 9
# predicting the negative class (-1) for a true positive is a false negative
FN = np.logical_and(y_true != y_prediction, y_prediction == -1).sum()  # 4
# a correct prediction of a positive class (0 or 1) is a true positive
TP = np.logical_and(y_true == y_prediction, y_true != -1).sum()  # 3
# a correct prediction of the negative class (-1) is a true negative
TN = np.logical_and(y_true == y_prediction, y_true == -1).sum()  # 1
TPR = 1. * TP / (TP + FN)  # 0.42857142857142855
FPR = 1. * FP / (FP + TN)  # 0.9
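To reuse this with other label encodings, here is a minimal sketch wrapping the same logic in a function; the name binary_rates and the negative_label parameter are my own choices, not from the paper:

import numpy as np

def binary_rates(y_true, y_pred, negative_label=-1):
    """Return (TPR, FPR) when one label is the negative class and every other
    label is a positive; predicting the wrong positive class counts as a
    false positive, following the convention above."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    fp = np.logical_and(y_true != y_pred, y_pred != negative_label).sum()
    fn = np.logical_and(y_true != y_pred, y_pred == negative_label).sum()
    tp = np.logical_and(y_true == y_pred, y_true != negative_label).sum()
    tn = np.logical_and(y_true == y_pred, y_true == negative_label).sum()
    return tp / (tp + fn), fp / (fp + tn)

# binary_rates(y_true, y_prediction) gives (0.42857..., 0.9) for the data above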