Deciding threshold for glm logistic regression model in R

Question

I have some data with predictors and a binary target. E.g.:

df <- data.frame(a=sort(sample(1:100,30)), b= sort(sample(1:100,30)), 
                 target=c(rep(0,11),rep(1,4),rep(0,4),rep(1,11)))

I trained a logistic regression model using glm():

model1 <- glm(formula= target ~ a + b, data=df, family=binomial)

Now I'm trying to predict the output (for the example, the same data should suffice):

predict(model1, newdata=df, type="response")

This generates a vector of probabilities, but I want to predict the actual class. I could use round() on the probabilities, but this assumes that anything below 0.5 is class '0' and anything above is class '1'. Is this a correct assumption, even when the classes are not equally (or nearly equally) represented in the population? Or is there a way to estimate this threshold?

Error404 · Accepted Answer

A good threshold (or cutoff) point to use with glm models is the one that jointly maximises sensitivity and specificity, for example the cutoff at which their sum is largest. This threshold might not give the highest overall prediction accuracy for your model, but it won't be biased towards positives or negatives. The ROCR package contains functions that can help you do this. Check the performance() function in this package; it should get you what you're looking for. Here's a picture of what you are expecting to get:

[picture from the original answer, not reproduced here]
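As a rough sketch of how that could look with ROCR, the code below computes sensitivity and specificity at every candidate cutoff and picks the cutoff where their sum is largest. The set.seed(42) call and the names probs, pred, sens, spec and best_cutoff are assumptions added for illustration; "sum of sensitivity and specificity" is one common reading of "maximises both", not necessarily the exact criterion behind the picture above.

```r
library(ROCR)

# Example data from the question; the seed is an assumption added for reproducibility
set.seed(42)
df <- data.frame(a = sort(sample(1:100, 30)), b = sort(sample(1:100, 30)),
                 target = c(rep(0, 11), rep(1, 4), rep(0, 4), rep(1, 11)))

model1 <- glm(target ~ a + b, data = df, family = binomial)
probs  <- predict(model1, newdata = df, type = "response")

# ROCR works on a prediction object built from scores and true labels
pred <- prediction(probs, df$target)

# With the default x.measure = "cutoff", x.values holds the cutoffs
# and y.values holds the requested measure at each cutoff
sens <- performance(pred, measure = "sens")
spec <- performance(pred, measure = "spec")

cutoffs <- sens@x.values[[1]]
total   <- sens@y.values[[1]] + spec@y.values[[1]]

# Cutoff at which sensitivity + specificity is largest
best_cutoff <- cutoffs[which.max(total)]
best_cutoff
```

Plotting sens and spec against cutoffs on the same axes would show where the two curves trade off, which is probably close to what the picture illustrated.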

After finding the cutoff point, I normally write a function myself that counts the data points whose predicted value is above the cutoff and matches them against the group they actually belong to.
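A minimal sketch of such a helper in base R is shown here. The function name classify_at, the seed, and the 0.5 cutoff used in the call are placeholders for illustration, not the answerer's actual code; substitute whatever cutoff you settled on.

```r
# Example data from the question; the seed is an assumption added for reproducibility
set.seed(42)
df <- data.frame(a = sort(sample(1:100, 30)), b = sort(sample(1:100, 30)),
                 target = c(rep(0, 11), rep(1, 4), rep(0, 4), rep(1, 11)))

model1 <- glm(target ~ a + b, data = df, family = binomial)
probs  <- predict(model1, newdata = df, type = "response")

# Hypothetical helper: label each observation 1 if its predicted probability
# is above the cutoff, then cross-tabulate against the true class
classify_at <- function(probs, actual, cutoff) {
  predicted <- as.integer(probs > cutoff)
  # Fixing the factor levels keeps the table 2x2 even if one class is never predicted
  table(actual    = factor(actual, levels = c(0, 1)),
        predicted = factor(predicted, levels = c(0, 1)))
}

cm <- classify_at(probs, df$target, cutoff = 0.5)
cm

# Sensitivity and specificity at this cutoff, read off the table
sensitivity <- cm["1", "1"] / sum(cm["1", ])
specificity <- cm["0", "0"] / sum(cm["0", ])
```

The rows of cm are the true classes and the columns are the predicted ones, so the diagonal holds the correctly classified counts.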


Tags:

r

logistic-regression

glm

predict

user2175594
