
R Confusion Matrix sensitivity and specificity labeling

I am using R v3.3.2 and caret 6.0.71 (the latest versions at the time of writing) to build a logistic regression classifier. I am using the confusionMatrix function to generate statistics for judging its performance.

logRegConfMat <- confusionMatrix(logRegPrediction, valData[,"Seen"])

          Reference
Prediction   0   1
         0  30  14
         1  60 164

Accuracy : 0.7239
Sensitivity : 0.3333
Specificity : 0.9213

The target value in my data (Seen) uses 1 for true and 0 for false. I assume the Reference (ground truth) columns and Prediction (classifier) rows in the confusion matrix follow the same convention. Therefore my results show:

  • True Negatives (TN) 30
  • True Positives (TP) 164
  • False Negatives (FN) 14
  • False Positives (FP) 60

Question: Why is sensitivity given as 0.3333 and specificity given as 0.9213? I would have thought it was the other way round - see below.

I am reluctant to believe that there is a bug in caret's confusionMatrix function, as nothing has been reported and this would be a significant error.


Most references on calculating sensitivity and specificity define them as follows (e.g. www.medcalc.org/calc/diagnostic_test.php):

  • Sensitivity = TP / (TP+FN) = 164/(164+14) = 0.9213
  • Specificity = TN / (FP+TN) = 30/(60+30) = 0.3333
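
To double-check the arithmetic, here is a quick sketch in R (the counts are taken from the matrix above, assuming 1 is the positive class):

# Counts from the confusion matrix above (1 taken as positive)
TP <- 164  # Reference 1, Prediction 1
FN <- 14   # Reference 1, Prediction 0
TN <- 30   # Reference 0, Prediction 0
FP <- 60   # Reference 0, Prediction 1

TP / (TP + FN)  # sensitivity: 0.9213
TN / (TN + FP)  # specificity: 0.3333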
asked Jan 03 '17 by wpqs


1 Answer

According to the documentation (?confusionMatrix):

"If there are only two factor levels, the first level will be used as the "positive" result."

Hence, in your example, the positive result will be "0", and the evaluation metrics will be the wrong way round. To override the default behaviour, set the positive argument to the correct value, like so:

confusionMatrix(logRegPrediction, valData[,"Seen"], positive = "1")
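
If you want a self-contained check, here is a minimal sketch with made-up factor vectors that reproduce the counts in your question (the names pred and ref are illustrative):

library(caret)

# Toy factors reproducing the question's confusion matrix
pred <- factor(c(rep(0, 30), rep(0, 14), rep(1, 60), rep(1, 164)), levels = c(0, 1))
ref  <- factor(c(rep(0, 30), rep(1, 14), rep(0, 60), rep(1, 164)), levels = c(0, 1))

# Default: the first factor level ("0") is treated as positive
confusionMatrix(pred, ref)$byClass[c("Sensitivity", "Specificity")]
# Sensitivity 0.3333, Specificity 0.9213

# Declaring "1" as the positive class swaps the two metrics
confusionMatrix(pred, ref, positive = "1")$byClass[c("Sensitivity", "Specificity")]
# Sensitivity 0.9213, Specificity 0.3333

Note that the underlying counts do not change; only which class the Sensitivity and Specificity labels refer to.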
answered Nov 07 '22 by mtoto