I tried computing a confusion matrix for my glm model, but I keep getting:
Error:
`data` and `reference` should be factors with the same levels.
Below is my model:
model3 <- glm(winner ~ srs.1 + srs.2, data = train_set, family = binomial)
confusionMatrix(table(predict(model3, newdata=test_set, type="response")) >= 0.5,
train_set$winner == 1)
The winner variable contains team1 and team2.
srs.1 and srs.2 are numerical values.
What is my problem here?
I assume your winner label is binary, coded as 0 and 1. So let's use the example below:
library(caret)
set.seed(111)
data = data.frame(
srs.1 = rnorm(200),
srs.2 = rnorm(200)
)
data$winner = ifelse(data$srs.1*data$srs.2 > 0,1,0)
idx = sample(nrow(data),150)
train_set = data[idx,]
test_set = data[-idx,]
model3 <- glm(winner ~ srs.1 + srs.2, data = train_set, family = binomial)
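As you did, we predict the probability and label it 1 if it is above 0.5, otherwise 0. Your table() idea was about right, but two things went wrong in your call: the >= 0.5 comparison sits outside table(), so you compare a table of probabilities to 0.5 instead of thresholding the predictions, and the predictions come from test_set while the reference comes from train_set. Both must come from the same data set, either both from test_set or both from train_set: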
pred = as.numeric(predict(model3, newdata=test_set, type="response")>0.5)
ref = test_set$winner
confusionMatrix(table(pred,ref))
Confusion Matrix and Statistics

    ref
pred  0  1
   0 12  5
   1 19 14

               Accuracy : 0.52
                 95% CI : (0.3742, 0.6634)
    No Information Rate : 0.62
    P-Value [Acc > NIR] : 0.943973

                  Kappa : 0.1085
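If winner is really stored as team1/team2 rather than 0/1, as your question suggests, a rough sketch like the one below may be closer to what you need. It passes two factors with identical levels to confusionMatrix() directly, which is what the error message asks for. The simulated data, the level order (team1 first, team2 second) and the choice of team2 as the positive class are my assumptions, not something taken from your data.

```r
library(caret)

set.seed(111)
# Hypothetical data with the same shape as the example above,
# but with string labels instead of 0/1 (an assumption about your data)
data <- data.frame(srs.1 = rnorm(200), srs.2 = rnorm(200))
data$winner <- factor(ifelse(data$srs.1 * data$srs.2 > 0, "team2", "team1"),
                      levels = c("team1", "team2"))

idx <- sample(nrow(data), 150)
train_set <- data[idx, ]
test_set <- data[-idx, ]

# With a factor response, binomial glm treats the first level as "failure",
# so the fitted probabilities refer to the second level ("team2" here)
model3 <- glm(winner ~ srs.1 + srs.2, data = train_set, family = binomial)

# Threshold the probabilities, then build a factor that reuses the reference levels
prob <- predict(model3, newdata = test_set, type = "response")
lv <- levels(test_set$winner)
pred <- factor(ifelse(prob > 0.5, lv[2], lv[1]), levels = lv)

# Predictions and reference both come from test_set and share the same levels
cm <- confusionMatrix(data = pred, reference = test_set$winner, positive = "team2")
cm$table
cm$overall["Accuracy"]
```

Setting the levels explicitly also keeps the call working when every prediction happens to land in one class, a case where table(pred, ref) can end up non-square.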