In a linear mixed model setting, the order I enter my variables into the model (both as random effects and as fixed effects) seemingly affects the estimates I get from the model. In an OLS setting this is not the case.
Could anybody explain why the estimated fixed effects vary when I change the order of either the fixed effects or the random effects in the model formula? I fail to see how
lmer(Y ~ X1 + X2 + (1 + X1 + X2 | f) )
differs from
lmer(Y ~ X2 + X1 + (1 + X2 + X1 | f) )
A short example is presented below.
library(lme4)
lmer1 <- lmer(Sepal.Length ~ 1 + Sepal.Width + Petal.Length + Petal.Width +
                (1 + Sepal.Width + Petal.Length + Petal.Width | Species),
              data = iris)
lmer2 <- lmer(Sepal.Length ~ 1 + Sepal.Width + Petal.Length + Petal.Width +
                (1 + Petal.Width + Sepal.Width + Petal.Length | Species),
              data = iris)
lmer3 <- lmer(Sepal.Length ~ 1 + Petal.Width + Sepal.Width + Petal.Length +
                (1 + Petal.Width + Sepal.Width + Petal.Length | Species),
              data = iris)
fixef(lmer1)
fixef(lmer2)
fixef(lmer3)[c("(Intercept)", "Sepal.Width", "Petal.Length", "Petal.Width")]
The output from these three seemingly identical models is presented below:
> fixef(lmer1)
(Intercept) Sepal.Width Petal.Length Petal.Width
1.6707431 0.4711415 0.7266866 -0.2240361
> fixef(lmer2)
(Intercept) Sepal.Width Petal.Length Petal.Width
1.6707432 0.4711417 0.7266866 -0.2240366
> fixef(lmer3)[c("(Intercept)", "Sepal.Width", "Petal.Length", "Petal.Width")]
(Intercept) Sepal.Width Petal.Length Petal.Width
1.6707428 0.4711414 0.7266866 -0.2240358
Now, the estimated fixed effects are so similar that the differences between them would hardly be of any practical importance, but I still wonder why they occur at all.
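To put a number on how small these differences are, here is a minimal base-R sketch that uses only the estimates printed above. The names fe1, fe2, fe3 and max_diff are mine, and the tolerance of 1e-5 is an arbitrary choice for illustration.

```r
# Fixed-effect estimates copied from the printed output above
# (order: intercept, Sepal.Width, Petal.Length, Petal.Width).
fe1 <- c(1.6707431, 0.4711415, 0.7266866, -0.2240361)
fe2 <- c(1.6707432, 0.4711417, 0.7266866, -0.2240366)
fe3 <- c(1.6707428, 0.4711414, 0.7266866, -0.2240358)

# Largest absolute discrepancy across all three pairwise comparisons.
max_diff <- max(abs(fe1 - fe2), abs(fe1 - fe3), abs(fe2 - fe3))
max_diff

# all.equal() with its default tolerance flags the vectors as different;
# with a looser, arbitrarily chosen tolerance they compare as equal.
isTRUE(all.equal(fe1, fe2))
isTRUE(all.equal(fe1, fe2, tolerance = 1e-5))
```

The largest discrepancy is below 1e-6, which is the kind of gap I would expect from an iterative numerical fit rather than from genuinely different models, although that by itself does not explain where the gap comes from.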
As pointed out in a comment on my original post, this turns out to be a known bug in the lme4 package; see https://github.com/lme4/lme4/issues/449. Hopefully it will be resolved in a future release of the package. My thanks to Dimitris Rizopoulos for providing this information.
Furthermore, it turns out that the order of the rows in the dataset also affects the estimates in a similar way; see the related question "glmer in R: Significance estimates are not robust to order of data frame".
In other words, both the order in which variables are entered into lmer and the order of the rows in the dataset currently affect the estimates. Hopefully these issues can be resolved in the future, as this is, in my opinion, not an attractive property for a statistical tool to have.
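The row-order effect can be checked in the same way as the variable-order effect. The following is only a sketch under my own assumptions: the seed, the object names (form, iris_shuffled, fit_orig, fit_shuf, delta) and the choice to reuse the iris model from above are mine, and the size of any discrepancy may depend on the lme4 version and platform, so no expected output is shown. The fits may also emit convergence or singular-fit warnings.

```r
library(lme4)

# Same model as lmer1 above, written once so that both fits share the formula.
form <- Sepal.Length ~ 1 + Sepal.Width + Petal.Length + Petal.Width +
  (1 + Sepal.Width + Petal.Length + Petal.Width | Species)

# Shuffle the rows of the data; the seed is arbitrary.
set.seed(1)
iris_shuffled <- iris[sample(nrow(iris)), ]

fit_orig <- lmer(form, data = iris)
fit_shuf <- lmer(form, data = iris_shuffled)

# Largest absolute difference between the two sets of fixed effects.
delta <- max(abs(fixef(fit_orig) - fixef(fit_shuf)))
delta
```

If delta is not exactly zero, the two fits differ even though the data and the model are the same, which is the behaviour described above.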