
Classification accuracy of binomial glmer() predictions

Tags: r, prediction, lme4

I've been racking my (non-R-savvy) brain for a way to get R to produce the percentage of correct predictions for a binomial glmer model. I know this is not very informative statistically, but it is often reported, so I would like to report it as well.

DATA:

Dependent variable: Tipo, which has 2 values: 's' or 'p'. All predictors are factors; there is not a single continuous variable. There are 2 random intercepts: the test subject (muestra) and the nouns he or she responded to (nouns).

Code used for the model:

model <- glmer(Tipo ~ agency + tense + 
               co2pr + pr2pr + socialclass + 
               (1|muestra) + (1|nouns), 
               data=datafile, family="binomial",
               control=glmerControl(optimizer="bobyqa"), 
               contrasts=c("sum", "poly"))

I know there is a function predict() which takes a model object and generates predictions from that model, but I can't seem to make it work for me. I would appreciate it if you could share the code.

Thanks in advance.

Jeroen Claes asked Feb 12 '15 14:02



1 Answer

To turn fitted probabilities into predictions, you need a threshold (there is a whole literature on this topic; search for "ROC curve" or "AUC"). A cutoff of 0.5 is a reasonable default if you don't know, or don't want to assume, anything about the relative cost of false positives vs. false negatives (equivalently, the relative value of sensitivity vs. specificity). With that cutoff,

p <- as.numeric(predict(model, type="response")>0.5)

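should compute the predicted probabilities and convert them to 0 or 1. Because Tipo is coded 's'/'p' rather than 0/1, put it on the same scale before comparing. For a factor response, the first level is treated as 0 and the second as 1, so with the default alphabetical levels 'p' becomes 0 and 's' becomes 1. Then

obs <- as.numeric(factor(datafile$Tipo)) - 1
mean(p==obs)

should give you the proportion correct, provided no rows of datafile were dropped from the fit because of missing values.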

table(p,datafile$Tipo)

should give you a predicted-vs-observed table.
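
The step that is easiest to get wrong is matching the 0/1 predictions to the factor labels, so here is a minimal base-R sketch using made-up probabilities rather than a fitted model. The helper name classify_by_threshold and the toy vectors are illustrative assumptions and not part of lme4; with a real fit, prob would come from predict(model, type="response") and observed from datafile$Tipo.

```r
# Hypothetical helper (not part of lme4): turn predicted probabilities into
# factor labels. For a binomial model with a factor response, R models the
# probability of NOT being the first level, so values above the threshold
# are labelled with the second level of `observed`.
classify_by_threshold <- function(prob, observed, threshold = 0.5) {
  stopifnot(is.factor(observed), nlevels(observed) == 2,
            length(prob) == length(observed))
  lev <- levels(observed)
  factor(ifelse(prob > threshold, lev[2], lev[1]), levels = lev)
}

# Toy data standing in for predict(model, type = "response") and datafile$Tipo
prob     <- c(0.91, 0.20, 0.65, 0.48, 0.10, 0.77)
observed <- factor(c("s", "p", "p", "s", "p", "s"))  # levels: "p", "s"

predicted <- classify_by_threshold(prob, observed)

accuracy  <- mean(predicted == observed)         # proportion correct
confusion <- table(predicted = predicted, observed = observed)
baseline  <- max(prop.table(table(observed)))    # always guessing the majority class

print(accuracy)
print(confusion)
print(baseline)
```

Reporting the majority-class baseline next to the accuracy gives readers something to compare the percentage against, since a model for a lopsided outcome can look accurate while doing no better than always guessing the more common value.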

Ben Bolker answered Sep 28 '22 07:09
