Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Different versions of R, lme4 and OS X give different fixed-effects significance results in glmer

I am running a logit mixed-effects model using glmer() in package lme4. The experiment used a within-subjects within-items design with Subjects and Items as crossed random effects.

My problem: different versions of R and lme4 (run on different OS X) produce different standard errors estimates for the fixed effects, and consequently, different significance results.

Here is a subset of my data (data from the last two subjects):

structure(list(SubjN = c(87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 
87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 
87L, 87L, 87L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 
88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 88L, 
88L), Items = structure(c(3L, 10L, 11L, 5L, 1L, 12L, 2L, 6L, 
9L, 6L, 3L, 4L, 8L, 11L, 12L, 7L, 8L, 2L, 7L, 10L, 9L, 5L, 1L, 
4L, 10L, 3L, 5L, 11L, 12L, 1L, 2L, 6L, 9L, 6L, 3L, 4L, 8L, 11L, 
12L, 7L, 2L, 8L, 10L, 7L, 9L, 5L, 1L, 4L), .Label = c("a", "c", 
"k", "f", "g", "i", "d", "l", "e", "j", "b", "h"), class = "factor"), 
IV1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("N", "L", "P"
), class = "factor"), DV = c(0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 
0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
IV1.h = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), contrasts = structure(c(-1, 
0.5, 0.5, 0, -0.5, 0.5), .Dim = c(3L, 2L), .Dimnames = list(
    c("N", "L", "P"), c("N_vs_L&P", "L_vs_P"))), .Label = c("N", 
"L", "P"), class = "factor"), N_vs_LP = c(-1, -1, -1, -1, 
-1, -1, -1, -1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -1, -1, -1, -1, -1, -1, 
-1, -1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 
0.5, 0.5, 0.5, 0.5, 0.5, 0.5), L_vs_P = c(0, 0, 0, 0, 0, 
0, 0, 0, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, 
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0, 0, 0, 
0, 0, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, 0.5, 
0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)), .Names = c("SubjN", 
"Items", "IV1", "DV", "IV1.h", "N_vs_LP", "L_vs_P"), row.names = c("3099", 
"3100", "3101", "3102", "3103", "3104", "3119", "3120", "3107", 
"3108", "3109", "3110", "3097", "3098", "3105", "3106", "3115", 
"3116", "3117", "3118", "3111", "3112", "3113", "3114", "3147", 
"3148", "3149", "3150", "3151", "3152", "3167", "3168", "3155", 
"3156", "3157", "3158", "3145", "3146", "3153", "3154", "3163", 
"3164", "3165", "3166", "3159", "3160", "3161", "3162"), class = "data.frame")

Each subject was tested on 24 trials on 3 different conditions (factor IV1, levels: N, L, P). I recorded whether they produced a target linguistic structure (DV == 1) or not (DV == 0). In the analysis, I only included those subjects who produced the target structure at least one. Nonetheless, most of them produced the target structure only on very few occasion. This is the proportion of DV == 1 produced by each subject in each condition:

library(plyr)
#dput(ddply(mydata, .(SubjN, IV1), summarise, l = length(DV), y = round(mean(DV),2)))

structure(list(SubjN = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 
9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 
13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L, 
18L, 18L, 18L, 19L, 19L, 19L, 20L, 20L, 20L, 21L, 21L, 21L, 22L, 
22L, 22L, 23L, 23L, 23L, 24L, 24L, 24L, 25L, 25L, 25L, 26L, 26L, 
26L, 27L, 27L, 27L, 28L, 28L, 28L, 29L, 29L, 29L, 30L, 30L, 30L, 
31L, 31L, 31L, 32L, 32L, 32L, 33L, 33L, 33L, 34L, 34L, 34L, 35L, 
35L, 35L, 36L, 36L, 36L, 37L, 37L, 37L, 38L, 38L, 38L, 39L, 39L, 
39L, 40L, 40L, 40L, 41L, 41L, 41L, 42L, 42L, 42L, 43L, 43L, 43L, 
44L, 44L, 44L, 45L, 45L, 45L, 46L, 46L, 46L, 47L, 47L, 47L, 48L, 
48L, 48L, 49L, 49L, 49L, 50L, 50L, 50L, 51L, 51L, 51L, 52L, 52L, 
52L, 53L, 53L, 53L, 54L, 54L, 54L, 55L, 55L, 55L, 56L, 56L, 56L, 
57L, 57L, 57L, 58L, 58L, 58L, 59L, 59L, 59L, 60L, 60L, 60L, 61L, 
61L, 61L, 62L, 62L, 62L, 63L, 63L, 63L, 64L, 64L, 64L, 65L, 65L, 
65L, 66L, 66L, 66L, 67L, 67L, 67L, 68L, 68L, 68L, 69L, 69L, 69L, 
70L, 70L, 70L, 71L, 71L, 71L, 72L, 72L, 72L, 73L, 73L, 73L, 74L, 
74L, 74L, 75L, 75L, 75L, 76L, 76L, 76L, 77L, 77L, 77L, 78L, 78L, 
78L, 79L, 79L, 79L, 80L, 80L, 80L, 81L, 81L, 81L, 82L, 82L, 82L, 
83L, 83L, 83L, 84L, 84L, 84L, 85L, 85L, 85L, 86L, 86L, 86L, 87L, 
87L, 87L, 88L, 88L, 88L), IV1 = structure(c(1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,      
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L), .Label = c("N", "L", "P"), class = "factor"), l = c(8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 7L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 
7L, 8L, 6L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 7L, 7L, 8L, 7L, 8L, 
8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 6L, 8L, 4L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 7L, 
8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 
8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 8L, 7L, 8L, 7L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L), y = c(1, 0.88, 1, 0.5, 0.25, 0.62, 
0, 0, 0.25, 0, 0.25, 0, 0.12, 0, 0, 0, 0.12, 0, 0, 0.12, 0.12, 
0, 0, 0.12, 0.38, 0, 0.25, 0, 0.12, 0, 0.12, 0, 0.25, 0, 0, 0.12, 
0.5, 0.25, 0.5, 0, 0, 0.12, 0, 0.25, 0.12, 0, 0, 0.12, 0, 0.12, 
0, 0, 0.12, 0.12, 0.12, 0.62, 0, 0, 0.5, 0.25, 1, 0.88, 1, 0, 
0, 0.12, 0, 0.12, 0.12, 0.12, 0.12, 0, 0.62, 0.62, 0.38, 0.5, 
0.88, 0.12, 0.12, 0, 0, 0.12, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0, 
0, 0.12, 0, 0, 0.25, 0, 0, 0.14, 0, 0.5, 0.57, 0.29, 0, 0.12, 
0, 0, 0.12, 0, 0.25, 0.5, 0.25, 0, 0.12, 0.12, 0.25, 0, 0.38, 
0, 0, 0.12, 0, 0, 1, 0.25, 0.12, 0.25, 0, 0.12, 0.12, 0, 0, 0.12, 
0, 0, 0.12, 0.12, 0, 0, 0.12, 0, 0.14, 0.14, 0.12, 0, 0.12, 0, 
0, 0.12, 0.12, 0, 1, 0.88, 1, 0, 0.12, 0, 0.12, 0, 0, 0.12, 0, 
0.12, 0, 0, 0.12, 0.12, 0.12, 0.12, 1, 1, 1, 0.12, 0, 0, 0.12, 
0.38, 0, 0, 0.12, 0, 0, 0, 0.5, 0.5, 0, 0.25, 0, 0.12, 0.29, 
0, 0, 0.38, 0, 0, 0.62, 0.5, 0, 0.12, 0, 0.12, 0.12, 0.25, 0.12, 
0.25, 0.12, 0, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0, 0.12, 0.12, 0, 
0.12, 0.12, 0, 0, 0.12, 0.12, 0.12, 0, 0.38, 0.12, 0.57, 0, 0.12, 
0, 0, 0.12, 0, 0, 0.12, 0, 0, 0.12, 0.14, 0.88, 0.88, 0.86, 0, 
0, 0.14, 0, 0.12, 0.14, 0, 0.12, 0, 0, 0, 0.12, 0, 0, 0.12, 0.38, 
0, 0, 0.5, 0.12, 0)), .Names = c("SubjN", "IV1", "l", "y"), row.names = c(NA, 
-264L), class = "data.frame")

I run the following model including IV1 as fixed effect with helmert-contrast coding; first contrast: N vs. L & P, second contrast: L vs. P.

m1 <- glmer(DV ~ IV1.h + (1 + IV1.h|SubjN) +  (1|Items) + (0 + N_vs_LP|Items) + (0 + L_vs_P|Items), family ='binomial', mydata)

The model does not allow for the correlation between the by-Items random variables (I did this by creating separate slopes for the two contrasts), since when correlation was allowed they were perfectly correlated (which I interpreted as a sign of over-parametrization).

1) Results using os x 10.8.5 mountain lion R version 3.0.2 (2013-09-25) lme4_1.0-5 (the original analysis I run)

Generalized linear mixed model fit by maximum likelihood ['glmerMod']
 Family: binomial ( logit )
Formula: DV ~ IV1.h + (1 + N_vs_LP + L_vs_P | SubjN) + (1 | Items) + (0 + N_vs_LP | Items)     + (0 + L_vs_P | Items) 
   Data: mydata 

      AIC       BIC    logLik  deviance 
1492.5408 1560.2050 -734.2704 1468.5408 

Random effects:
 Groups  Name          Variance  Std.Dev. Corr       
 SubjN    (Intercept)   2.3885505 1.54549             
          N_vs_LP       0.4394195 0.66289  -0.69      
          L_vs_P        1.9287559 1.38880   0.04  0.08
 Items    (Intercept)   0.0531518 0.23055
 Items.1  N_vs_LP       0.0001950 0.01396
 Items.2  L_vs_P        0.0003619 0.01902             

Number of obs: 2077, groups: SubjN, 88; Items, 12

Fixed effects:
                              Estimate Std. Error z value Pr(>|z|)    
(Intercept)                    -2.2998     0.1964 -11.710  < 2e-16 ***
IV1.hN_vs_L&P                   0.3704     0.1378   2.689  0.00717 ** 
IV1.hL_vs_P                     0.2060     0.2320   0.888  0.37459    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
              (Intr) IV1.N_
IV1.hN_vs_L&P -0.388       
IV1.hL_vs_P    0.014  0.019

2) Results using: OS X 10.9.4 Mavericks R version 3.1.1 (2014-07-10) lme4_1.1-7 optimizer 'bobyqa'

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation)     ['glmerMod']
 Family: binomial  ( logit )
Formula: DV ~ IV1.h + (1 + N_vs_LP + L_vs_P | SubjN) + (1 | Items) + (0 +  
    N_vs_LP | Items) + (0 + L_vs_P | Items)
   Data: mydata
Control: glmerControl(optimizer = "bobyqa")

     AIC      BIC   logLik deviance df.resid 
  1492.5   1560.2   -734.3   1468.5     2065 

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.4174 -0.3364 -0.2595 -0.1706  4.6028 

Random effects:
 Groups  Name        Variance Std.Dev. Corr       
 SubjN   (Intercept) 2.38791  1.5453              
         N_vs_LP     0.43935  0.6628   -0.69      
         L_vs_P      1.92629  1.3879    0.04  0.07
 Items   (Intercept) 0.05319  0.2306              
 Items.1 N_vs_LP     0.00000  0.0000              
 Items.2 L_vs_P      0.00000  0.0000              
Number of obs: 2077, groups:  SubjN, 88; Items, 12

Fixed effects:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)    -2.2998     0.2095 -10.975   <2e-16 ***
IV1.hN_vs_L&P   0.3703     0.1892   1.958   0.0503 .  
IV1.hL_vs_P     0.2063     0.2679   0.770   0.4413    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) IV1.N_
IV1.hN__L&P -0.379       
IV1.hL_vs_P -0.001  0.003

I really don't know which outcome I should trust. Any help would be very much appreciated.

Ps. Sorry if something is not clear - it's my first post :)

Thanks very much!

like image 841
exfalso Avatar asked Sep 06 '14 11:09

exfalso


1 Answers

From lme4's NEWS file, for version 1.1-4

Standard errors of fixed effects are now computed from the approximate Hessian by default (see the use.hessian argument in vcov.merMod); this gives better (correct) answers when the estimates of the random- and fixed-effect parameters are correlated (Github #47)

The description of the problem is here

You should be able to retrieve the old standard errors from the newer (1.1-7) model by sqrt(diag(vcov(fitted_model,use.hessian=FALSE))), but the new version is more likely to be correct.

For more precise confidence intervals/p values, you can do a likelihood ratio test (use anova to compare nested models) and/or compute the profile confidence intervals with confint(fitted_model,which="beta_").

like image 133
Ben Bolker Avatar answered Sep 30 '22 17:09

Ben Bolker