I am currently testing whether I should include certain random effects in my lmer model or not. I use the anova function for that. My procedure so far is to fit the model with a function call to lmer()
with REML=TRUE
(the default option). Then I call anova()
on the two models where one of them does include the random effect to be tested for and the other one doees not. However, it is well known that the anova()
function refits the model with ML but in the new version of anova()
you can prevent anova()
from doing so by setting the option refit=FALSE
. In order to test for random effects should I set refit=FALSE
in my call to anova() or not?
(If I do set refit=FALSE
the p-values tend to be lower. Are the p-values anti-conservative when I set refit=FALSE
?)
Method 1:
mod0_reml <- lmer(x ~ y + z + (1 | w), data=dat)
mod1_reml <- lmer(x ~ y + z + (y | w), data=dat)
anova(mod0_reml, mod1_reml)
This will result in anova()
refitting the models with ML
instead of REML
. (Newer versions of the anova()
function will also output an info about this.)
Method 2:
mod0_reml <- lmer(x ~ y + z + (1 | w), data=dat)
mod1_reml <- lmer(x ~ y + z + (y | w), data=dat)
anova(mod0_reml, mod1_reml, refit=FALSE)
This will result in anova()
performing its calculations on the original models, i.e. with REML=TRUE
.
Which of the two methods is correct in order to test whether I should include a random effect or not?
Thanks for any help
To do this, you compare the log-likelihoods of models with and without the appropriate random effect - if removing the random effect causes a large enough drop in log-likelihood then one can say the effect is statistically significant.
1) REML = FALSE is used in case of comparing models with different “Fixed effects” (during the simplification of model) 2) REML = TRUE is used in case of different random effects on the comparing models. I would appreciate for the explanation on the correct use of these functions. R Statistical Package.
To cite the author: "It's generally good to use REML, if it is available, when you are interested in the magnitude of the random effects variances, but never when you are comparing models with different fixed effects via hypothesis tests or information-theoretic criteria such as AIC."
Maximum likelihood or restricted maximum likelihood (REML) estimates of the pa- rameters in linear mixed-effects models can be determined using the lmer function in the lme4 package for R.
In general I would say that it would be appropriate to use refit=FALSE
in this case, but let's go ahead and try a simulation experiment.
First fit a model without a random slope to the sleepstudy
data set, then simulate data from this model:
library(lme4)
mod0 <- lmer(Reaction ~ Days + (1|Subject), data=sleepstudy)
## also fit the full model for later use
mod1 <- lmer(Reaction ~ Days + (Days|Subject), data=sleepstudy)
set.seed(101)
simdat <- simulate(mod0,1000)
Now refit the null data with the full and the reduced model, and save the distribution of p-values generated by anova()
with and without refit=FALSE
. This is essentially a parametric bootstrap test of the null hypothesis; we want to see if it has the appropriate characteristics (i.e., uniform distribution of p-values).
sumfun <- function(x) {
m0 <- refit(mod0,x)
m1 <- refit(mod1,x)
a_refit <- suppressMessages(anova(m0,m1)["m1","Pr(>Chisq)"])
a_no_refit <- anova(m0,m1,refit=FALSE)["m1","Pr(>Chisq)"]
c(refit=a_refit,no_refit=a_no_refit)
}
I like plyr::laply
for its convenience, although you could just as easily use a for
loop or one of the other *apply
approaches.
library(plyr)
pdist <- laply(simdat,sumfun,.progress="text")
library(ggplot2); theme_set(theme_bw())
library(reshape2)
ggplot(melt(pdist),aes(x=value,fill=Var2))+
geom_histogram(aes(y=..density..),
alpha=0.5,position="identity",binwidth=0.02)+
geom_hline(yintercept=1,lty=2)
ggsave("nullhist.png",height=4,width=5)
Type I error rate for alpha=0.05:
colMeans(pdist<0.05)
## refit no_refit
## 0.021 0.026
You can see that in this case the two procedures give practically the same answer and both procedures are strongly conservative, for well-known reasons having to do with the fact that the null value of the hypothesis test is on the boundary of its feasible space. For the specific case of testing a single simple random effect, halving the p-value gives an appropriate answer (see Pinheiro and Bates 2000 and others); this actually appears to give reasonable answers here, although it is not really justified because here we are dropping two random-effects parameters (the random effect of slope and the correlation between the slope and intercept random effects):
colMeans(pdist/2<0.05)
## refit no_refit
## 0.051 0.055
Other points:
PBmodcomp
function from the pbkrtest
package.RLRsim
package is designed precisely for fast randomization (parameteric bootstrap) tests of null hypotheses about random effects terms, but doesn't appear to work in this slightly more complex situationchi^2_0
(point mass at 0) and a chi^2_n
distribution (where n
is probably 2, but I wouldn't be sure for this geometry)If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With