R Generic solution to create 2*2 confusion matrix

My question is related to this one on producing a confusion matrix in R with the table() function. I am looking for a solution without using a package (e.g. caret).

Let's say these are our predictions and labels in a binary classification problem:

predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97, 0.89, 0.78, 0.86, 0.15,  0.52, 0.74, 0.24)
labels      <- c(1,    1,    1,    0,    0,     1,    1,    1,    0,     1,    0,    0,    1,    0)

For these values, the solution below works well to create a 2*2 confusion matrix for, let's say, threshold = 0.5:

# Confusion matrix for threshold = 0.5
conf_matrix <- as.matrix(table(predictions > 0.5, labels))
conf_matrix
#        labels
#         0 1
#   FALSE 4 3
#   TRUE  2 5

However, I do not get a 2*2 matrix if I select a threshold that is smaller than min(predictions) or at least as large as max(predictions), because the thresholded predictions then contain only TRUE or only FALSE values, e.g.:

conf_matrix <- as.matrix(table(predictions > 0.05, labels))
conf_matrix
#       labels
#        0 1
#   TRUE 6 8

I need a method that consistently produces a 2*2 confusion matrix for all possible thresholds (decision boundaries) between 0 and 1, as I use this as an input in an optimisation. Is there a way I can tweak the table function so it always returns a 2*2 matrix here?

Zhubarb asked Sep 11 '14 13:09

1 Answer

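table() only creates rows for values that actually occur in its input, so a threshold that puts every prediction on one side drops the other row. You can make your thresholded prediction a factor variable with both levels declared explicitly to achieve a 2*2 result every time: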

(conf_matrix <- as.matrix(table(factor(predictions > 0.05, levels = c(FALSE, TRUE)), labels)))
#        labels
#         0 1
#   FALSE 0 0
#   TRUE  6 8
josliber answered Sep 30 '22 21:09
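As a possible extension of the factor approach above, here is a minimal sketch that wraps it in a function and also fixes the levels of the labels, so the result should stay 2*2 even if a subset of the data happens to contain only one class. The function name confusion_2x2 and the variables cm_low and cm_high are hypothetical names chosen for illustration; they do not appear in the original question or answer.

```r
# Hypothetical helper (not part of the original answer): builds a confusion
# table whose rows and columns are fixed by explicit factor levels.
confusion_2x2 <- function(predictions, labels, threshold = 0.5) {
  # Declaring both levels keeps the empty row when all predictions fall on one side.
  predicted <- factor(predictions > threshold, levels = c(FALSE, TRUE))
  # Assumes labels are coded as 0/1, as in the question.
  actual <- factor(labels, levels = c(0, 1))
  table(predicted = predicted, actual = actual)
}

predictions <- c(0.61, 0.36, 0.43, 0.14, 0.38, 0.24, 0.97,
                 0.89, 0.78, 0.86, 0.15, 0.52, 0.74, 0.24)
labels      <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0)

# Thresholds below the smallest and above the largest prediction.
cm_low  <- confusion_2x2(predictions, labels, threshold = 0.05)
cm_high <- confusion_2x2(predictions, labels, threshold = 0.99)

dim(cm_low)   # 2 2
dim(cm_high)  # 2 2
```

Because the dimensions no longer depend on the threshold, cells can be indexed by name (for example cm_low["TRUE", "1"] for true positives) inside an optimisation loop without checking which rows exist.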