How do you compute the true- and false-positive rates of a multi-class classification problem? Say,
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
The confusion matrix can be computed with metrics.confusion_matrix(y_true, y_prediction), but that just shifts the problem.
EDIT after @seralouk's answer: here, the class -1 is to be considered as the negative, while 0 and 1 are variations of positives.
The hit rate (true positive rate, TPR_i) is defined as rater i's positive response when the correct answer is positive (X_ik = 1 and Z_k = 1), and the false alarm rate (false positive rate, FPR_i) is defined as a positive response when the correct answer is negative (X_ik = 1 and Z_k = 0).
A false positive is an outcome where the model incorrectly predicts the positive class, and a false negative is an outcome where the model incorrectly predicts the negative class. The metrics below are derived from these four outcomes (TP, FP, FN, TN).
The false negative rate (miss rate) is calculated as FN/(FN+TP), where FN is the number of false negatives and TP is the number of true positives (FN+TP being the total number of actual positives). The true positive rate (TPR, also called sensitivity or recall) is calculated as TP/(TP+FN).
The TPR defines how many correct positive results occur among all positive samples available during the test. FPR, on the other hand, defines how many incorrect positive results occur among all negative samples available during the test.
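A minimal binary sketch of these two formulas (the labels below are made up purely for illustration):

from sklearn.metrics import confusion_matrix

# hypothetical binary example, only to illustrate the two formulas above
y_true_bin = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred_bin = [0, 1, 1, 0, 1, 0, 1, 1]

# for a binary problem, ravel() returns the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_bin, y_pred_bin).ravel()
tpr = tp / (tp + fn)  # true positive rate (sensitivity): 0.75
fpr = fp / (fp + tn)  # false positive rate (fall-out): 0.5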
Using your data, you can get all the metrics for all the classes at once:
import numpy as np
from sklearn.metrics import confusion_matrix
y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
cnf_matrix = confusion_matrix(y_true, y_prediction)
print(cnf_matrix)
#[[1 1 3]
# [3 2 2]
# [1 3 1]]
FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)
FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)
# Sensitivity, hit rate, recall, or true positive rate
TPR = TP/(TP+FN)
# Specificity or true negative rate
TNR = TN/(TN+FP)
# Precision or positive predictive value
PPV = TP/(TP+FP)
# Negative predictive value
NPV = TN/(TN+FN)
# Fall out or false positive rate
FPR = FP/(FP+TN)
# False negative rate
FNR = FN/(TP+FN)
# False discovery rate
FDR = FP/(TP+FP)
# Overall accuracy
ACC = (TP+TN)/(TP+FP+FN+TN)
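For example, you can print the two rates per class (scikit-learn orders the confusion matrix by the sorted labels, i.e. -1, 0, 1):

for label, tpr, fpr in zip([-1, 0, 1], TPR, FPR):
    print(f"class {label}: TPR = {tpr:.5f}, FPR = {fpr:.5f}")
# class -1: TPR = 0.20000, FPR = 0.33333
# class 0: TPR = 0.28571, FPR = 0.40000
# class 1: TPR = 0.20000, FPR = 0.41667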
For the general case with many classes, each class is treated in turn as the positive class and all remaining classes as negatives (one-vs-rest); the row and column sums above implement exactly that.
Another simple option is PyCM (written by me), which supports multi-class confusion matrix analysis.
Applied to your problem:
>>> from pycm import ConfusionMatrix
>>> y_true = [1, -1, 0, 0, 1, -1, 1, 0, -1, 0, 1, -1, 1, 0, 0, -1, 0]
>>> y_prediction = [-1, -1, 1, 0, 0, 0, 0, -1, 1, -1, 1, 1, 0, 0, 1, 1, -1]
>>> cm = ConfusionMatrix(actual_vector=y_true,predict_vector=y_prediction)
>>> print(cm)
Predict -1 0 1
Actual
-1 1 1 3
0 3 2 2
1 1 3 1
Overall Statistics :
95% CI (0.03365,0.43694)
Bennett_S -0.14706
Chi-Squared None
Chi-Squared DF 4
Conditional Entropy None
Cramer_V None
Cross Entropy 1.57986
Gwet_AC1 -0.1436
Joint Entropy None
KL Divergence 0.01421
Kappa -0.15104
Kappa 95% CI (-0.45456,0.15247)
Kappa No Prevalence -0.52941
Kappa Standard Error 0.15485
Kappa Unbiased -0.15405
Lambda A 0.2
Lambda B 0.27273
Mutual Information None
Overall_ACC 0.23529
Overall_RACC 0.33564
Overall_RACCU 0.33737
PPV_Macro 0.23333
PPV_Micro 0.23529
Phi-Squared None
Reference Entropy 1.56565
Response Entropy 1.57986
Scott_PI -0.15405
Standard Error 0.10288
Strength_Of_Agreement(Altman) Poor
Strength_Of_Agreement(Cicchetti) Poor
Strength_Of_Agreement(Fleiss) Poor
Strength_Of_Agreement(Landis and Koch) Poor
TPR_Macro 0.22857
TPR_Micro 0.23529
Class Statistics :
Classes -1 0 1
ACC(Accuracy) 0.52941 0.47059 0.47059
BM(Informedness or bookmaker informedness) -0.13333 -0.11429 -0.21667
DOR(Diagnostic odds ratio) 0.5 0.6 0.35
ERR(Error rate) 0.47059 0.52941 0.52941
F0.5(F0.5 score) 0.2 0.32258 0.17241
F1(F1 score - harmonic mean of precision and sensitivity) 0.2 0.30769 0.18182
F2(F2 score) 0.2 0.29412 0.19231
FDR(False discovery rate) 0.8 0.66667 0.83333
FN(False negative/miss/type 2 error) 4 5 4
FNR(Miss rate or false negative rate) 0.8 0.71429 0.8
FOR(False omission rate) 0.33333 0.45455 0.36364
FP(False positive/type 1 error/false alarm) 4 4 5
FPR(Fall-out or false positive rate) 0.33333 0.4 0.41667
G(G-measure geometric mean of precision and sensitivity) 0.2 0.30861 0.18257
LR+(Positive likelihood ratio) 0.6 0.71429 0.48
LR-(Negative likelihood ratio) 1.2 1.19048 1.37143
MCC(Matthews correlation coefficient) -0.13333 -0.1177 -0.20658
MK(Markedness) -0.13333 -0.12121 -0.19697
N(Condition negative) 12 10 12
NPV(Negative predictive value) 0.66667 0.54545 0.63636
P(Condition positive) 5 7 5
POP(Population) 17 17 17
PPV(Precision or positive predictive value) 0.2 0.33333 0.16667
PRE(Prevalence) 0.29412 0.41176 0.29412
RACC(Random accuracy) 0.08651 0.14533 0.10381
RACCU(Random accuracy unbiased) 0.08651 0.14619 0.10467
TN(True negative/correct rejection) 8 6 7
TNR(Specificity or true negative rate) 0.66667 0.6 0.58333
TON(Test outcome negative) 12 11 11
TOP(Test outcome positive) 5 6 6
TP(True positive/hit) 1 2 1
TPR(Sensitivity, recall, hit rate, or true positive rate) 0.2 0.28571 0.2
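The per-class statistics can also be read programmatically; as far as I recall, each statistic is exposed as an attribute holding a dictionary keyed by class (attribute names below assumed from the PyCM documentation):

>>> cm.TPR  # per-class sensitivity -> {-1: 0.2, 0: 0.28571..., 1: 0.2}
>>> cm.FPR  # per-class fall-out -> {-1: 0.33333..., 0: 0.4, 1: 0.41666...}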
Since there are several ways to solve this, and none of them is really generic (see https://stats.stackexchange.com/questions/202336/true-positive-false-negative-true-negative-false-positive-definitions-for-mul?noredirect=1&lq=1 and https://stats.stackexchange.com/questions/51296/how-do-you-calculate-precision-and-recall-for-multiclass-classification-using-co#51301), here is the solution that seems to be used in the paper I was asking about:
to count confusion between two foreground pages as false positive
So the solution is to import numpy as np, use y_true and y_prediction as np.array, and then:
# confusion between two positive classes, or a positive prediction for a true
# negative, counts as a false positive
FP = np.logical_and(y_true != y_prediction, y_prediction != -1).sum()  # 9
# predicting the negative class (-1) for a true positive is a false negative
FN = np.logical_and(y_true != y_prediction, y_prediction == -1).sum()  # 4
# a correct prediction of a positive class (0 or 1) is a true positive
TP = np.logical_and(y_true == y_prediction, y_true != -1).sum()  # 3
# a correct prediction of the negative class (-1) is a true negative
TN = np.logical_and(y_true == y_prediction, y_true == -1).sum()  # 1
TPR = 1. * TP / (TP + FN)  # 0.42857142857142855
FPR = 1. * FP / (FP + TN)  # 0.9
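To reuse this with other label encodings, here is a minimal sketch wrapping the same logic in a function; the name binary_rates and the negative_label parameter are my own choices, not from the paper:

import numpy as np

def binary_rates(y_true, y_pred, negative_label=-1):
    """Return (TPR, FPR) when one label is the negative class and every other
    label is a positive; predicting the wrong positive class counts as a
    false positive, following the convention above."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    fp = np.logical_and(y_true != y_pred, y_pred != negative_label).sum()
    fn = np.logical_and(y_true != y_pred, y_pred == negative_label).sum()
    tp = np.logical_and(y_true == y_pred, y_true != negative_label).sum()
    tn = np.logical_and(y_true == y_pred, y_true == negative_label).sum()
    return tp / (tp + fn), fp / (fp + tn)

# binary_rates(y_true, y_prediction) gives (0.42857..., 0.9) for the data above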