
lmer: predictions on population level trigger an error

I want to fit a linear mixed model and make predictions at the population level (i.e. using only the fixed effects, with the random effects set to 0).

Example model:

require(lme4)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
summary(fm1)
# values for prediction:
newx <- seq(min(sleepstudy$Days), max(sleepstudy$Days))

I tried several ways of predicting at the population level, but they all failed:

pred <- predict(fm1, newdata = data.frame(Days = newx), allow.new.levels = TRUE)
# Error: couldn't evaluate grouping factor Subject within model frame: try adding grouping factor to data frame explicitly if possible

pred <- predict(fm1, newdata = data.frame(Days = newx, Subject = NA), allow.new.levels = TRUE)
# Error: Invalid grouping factor specification, Subject

pred <- predict(fm1, newdata = data.frame(Days = newx, Subject = as.factor(NA)), allow.new.levels = TRUE)
# Error: Invalid grouping factor specification, Subject

I tried to find the documentation for the right prediction method, but I'm not sure where to look. I went through help(package = "lme4"), and the closest function I found was predict.merMod (though the class of the model fm1 is lmerMod, not merMod). ?predict.merMod reads:

allow.new.levels (logical) if FALSE (default), then any new levels (or NA values) detected in newdata will trigger an error; if TRUE, then the prediction will use the unconditional (population-level) values for data with previously unobserved levels (or NAs)

It specifically says "or NAs", but apparently it doesn't work that way!

  1. Am I looking at the help page of a proper method? If not, what is the right method?
  2. How do I make the prediction work at the population level?
Tomas asked Dec 29 '15 14:12

1 Answer

You're looking for the re.form argument of predict():

re.form: formula for random effects to condition on. If ‘NULL’, include all random effects; if ‘NA’ or ‘~0’, include no random effects

require(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
newx <- seq(min(sleepstudy$Days), max(sleepstudy$Days))
predict(fm1, newdata=data.frame(Days=newx), re.form=NA)
##        1        2        3        4        5        6        7        8 
## 251.4051 261.8724 272.3397 282.8070 293.2742 303.7415 314.2088 324.6761 
##        9       10 
## 335.1434 345.6107 
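If you want to convince yourself that re.form=NA really gives the fixed-effects-only prediction, here is a minimal sketch that rebuilds it by hand. The names newdat, X and manual are just illustrative, and the one-sided formula ~ Days is assumed to match the fixed-effects part of the model above:

```r
library(lme4)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
newdat <- data.frame(Days = seq(min(sleepstudy$Days), max(sleepstudy$Days)))

# Population-level prediction: random effects are ignored (treated as 0)
pop_pred <- predict(fm1, newdata = newdat, re.form = NA)

# The same thing by hand: fixed-effects design matrix times the
# fixed-effect coefficients (assumes ~ Days mirrors the model's fixed part)
X <- model.matrix(~ Days, data = newdat)
manual <- as.vector(X %*% fixef(fm1))

# Should report TRUE if the two approaches agree
all.equal(unname(pop_pred), manual)
```

The manual version is only there as a cross-check; in practice predict(..., re.form=NA) is the simpler route.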

As for your other questions:

  • merMod is a "super-class" that includes both linear (lmerMod) and generalized linear (glmerMod) models: see ?"merMod-class"
  • your second and third attempts probably should have worked; however, allow.new.levels was designed for cases with occasional NA values, not for a grouping column that is entirely NA. predict(fm1, newdata = data.frame(Days = newx, Subject = "a"), allow.new.levels = TRUE) does work. It looks like the code detects an all-NA column and interprets it as something having gone wrong upstream. This could be fixed in the code, but it doesn't seem very high-priority since re.form exists.
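Both points can be checked directly. This is a rough sketch rather than anything definitive: the level "a" is just a placeholder for any Subject value not in sleepstudy, and the comparison assumes (per the allow.new.levels documentation quoted in the question) that an unobserved level falls back to the population-level prediction:

```r
library(lme4)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
newx <- seq(min(sleepstudy$Days), max(sleepstudy$Days))

# lmerMod extends merMod, so methods documented for merMod
# (such as predict.merMod) apply to fm1
class(fm1)
is(fm1, "merMod")

# A previously unobserved level ("a" is a placeholder, not a real subject)
p_new <- predict(fm1,
                 newdata = data.frame(Days = newx, Subject = factor("a")),
                 allow.new.levels = TRUE)

# The re.form route, with no grouping column needed in newdata
p_pop <- predict(fm1, newdata = data.frame(Days = newx), re.form = NA)

# Compare the two sets of predictions
all.equal(unname(p_new), unname(p_pop))
```

Using re.form=NA is still the cleaner option, since it avoids inventing a dummy grouping level.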
Ben Bolker answered Nov 03 '22 01:11
