 

lme4 calculate confidence intervals of covariances

Edit: please see Ben Bolker's answer of 16/05/2016 below for the appropriate solution. The original post follows.


I am fitting several multilevel models with lme4. I would like to report the variances and covariances of the random effects, and to automate this process.

I know I can get the variances with as.data.frame(VarCorr(mymodel)), and the confidence intervals with confint(mymodel). Clearly I can merge/rbind the two tables and put confidence intervals around the variances by squaring the output of confint() at the appropriate rows and columns, but I have not found a convincing way to calculate the covariances other than by hand.

Say the result of confint is:

a <- c(6.2, -0.4, 2.2, 1.5, -0.4, -0.5, 2.8, -0.9, 1.3, 3.9)
b <- c(6.8, -0.2, 2.5, 2.5,  0.1,  0.2, 4.8, -0.7, 2.3, 5.0)
conf <- data.frame(a, b, row.names = c(
    "sd_(Intercept)|ID",          "cor_Time.(Intercept)|ID",
    "sd_Time|ID",                 "sd_(Intercept)|Group",
    "cor_Time.(Intercept)|Group", "cor_I(Time^2).(Intercept)|Group",
    "sd_Time|Group",              "cor_I(Time^2).Time|Group",
    "sd_I(Time^2)|Group",         "sigma"))
colnames(conf) <- c("2.5%", "97.5%")
conf

How can I automate the various multiplications to get the covariances, such as

## cor(Time, (Intercept)) * sd((Intercept)) * sd(Time), lower bound
cov.time.intercept <- conf[2,1]*conf[1,1]*conf[3,1]

?

I have tried splitting the standard deviations and correlations, creating "ID", "Time", "I(Time^2)" and "(Intercept)" variables, and then matching on two columns, but I am not getting anywhere. The problem is that each time the model changes there may be a different number of variances and covariances, and a different triangular structure.
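
To make this concrete, here is a fragile sketch of the kind of automation I mean (naive_cov_ci is just an illustrative name; it assumes the row-name conventions shown above, and it splits the term pair on ".", which would break if a variable name itself contained a dot):

naive_cov_ci <- function(conf) {
  cors <- grep("^cor_", rownames(conf), value = TRUE)
  out <- lapply(cors, function(r) {
    lab <- sub("^cor_", "", r)                ## e.g. "Time.(Intercept)|ID"
    grp <- sub("^.*\\|", "", lab)             ## grouping factor, e.g. "ID"
    trm <- strsplit(sub("\\|.*$", "", lab), ".", fixed = TRUE)[[1]]
    sd1 <- conf[paste0("sd_", trm[1], "|", grp), ]
    sd2 <- conf[paste0("sd_", trm[2], "|", grp), ]
    conf[r, ] * sd1 * sd2                     ## cor * sd1 * sd2, per bound
  })
  do.call(rbind, out)
}

naive_cov_ci(conf)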

Thank you for any help,

k.

asked May 13 '16 by r.kaiza

1 Answer

Your computation seems to give plausible answers, but it doesn't make sense (to me; I stand ready to be corrected/enlightened ...). Suppose cov = corr*sd1*sd2 (a covariance is the correlation times the two standard deviations), and suppose that ci(.) is the (lower or upper) confidence limit for a quantity. It is by no means true that ci(cov) = ci(corr)*ci(sd1)*ci(sd2). (It is interesting that you get reasonable answers; I think this is most likely to happen when the quantities are approximately uncorrelated ...) If you had the variances of each component and the covariances among them (I don't mean the random-effect variances and covariances themselves, but their sampling variances/covariances), you could propagate them approximately using the delta method, but these are fairly hard to get (see here).
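
As a toy check of this claim (a minimal simulation sketch, using MASS::mvrnorm; the object names are just illustrative): simulate the sampling distribution of a bivariate-normal covariance and compare the direct percentile interval for the covariance with the product of the interval endpoints for the correlation and the two standard deviations.

set.seed(101)
sims <- replicate(2000, {
  xy <- MASS::mvrnorm(50, mu = c(0, 0),
                      Sigma = matrix(c(1, 0.5, 0.5, 1), 2))
  c(cov = cov(xy)[1, 2], cor = cor(xy)[1, 2],
    sd1 = sd(xy[, 1]), sd2 = sd(xy[, 2]))
})
ci <- apply(sims, 1, quantile, probs = c(0.025, 0.975))
ci[, "cov"]                               ## direct interval for the covariance
ci[, "cor"] * ci[, "sd1"] * ci[, "sd2"]   ## product of endpoints: not the same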

The "right" way to do this, as far as I can tell, is do the likelihood profile computations on the variance-covariance scale instead of the standard deviation-correlation scale. This wasn't previously possible, but it is now (with the development version on Github).

Install latest version:

library(remotes) ## for install_github (or library(devtools))
install_github("lme4/lme4")

Preliminaries:

chap12 <- foreign::read.dta(file = "ch12.dta")  ## chapter-12 example data (Snijders & Bosker)
library(lme4)
snijders <- lmer(prox_pup ~ 1 + prox_sel + (1 + occ|teacher),
                 data = chap12)

as.data.frame(VarCorr(snijders))
##        grp        var1 var2        vcov      sdcor
## 1  teacher (Intercept) <NA>  0.15617962  0.3951957
## 2  teacher         occ <NA>  0.01205317  0.1097869
## 3  teacher (Intercept)  occ -0.03883458 -0.8950676
## 4 Residual        <NA> <NA>  0.04979762  0.2231538

We have to be a bit careful when comparing results because profile.merMod, which we'll use shortly, automatically (and silently!) converts fits from the default REML to maximum likelihood fits (because a profile based on REML might be statistically dicey); however, it doesn't look like this makes a huge difference.

s2 <- refitML(snijders)
as.data.frame(VarCorr(s2))
##        grp        var1 var2        vcov      sdcor
## 1  teacher (Intercept) <NA>  0.15426049  0.3927601
## 2  teacher         occ <NA>  0.01202631  0.1096645
## 3  teacher (Intercept)  occ -0.03884427 -0.9018483
## 4 Residual        <NA> <NA>  0.04955549  0.2226106

p.sd   <- profile(s2, which = "theta_", signames = FALSE)
p.vcov <- profile(s2, which = "theta_", prof.scale = "varcov",
                  signames = FALSE)

We get some warnings about non-monotonic profiles ...

confint(p.vcov)
##                                    2.5 %     97.5 %
## var_(Intercept)|teacher      0.08888931  0.26131067
## cov_occ.(Intercept)|teacher -0.07553263 -0.01589043
## var_occ|teacher              0.00000000  0.02783863
## sigma                        0.03463184  0.07258777

What if we check against the square of the relevant (sd/variance) elements?

confint(p.sd)[c(1,3,4),]^2
##                              2.5 %     97.5 %
## sd_(Intercept)|teacher 0.089089363 0.26130970
## sd_occ|teacher         0.002467408 0.02779329
## sigma                  0.034631759 0.07263869

These match pretty well, except for the lower bound of the occ variance; they also match your results above. However, the covariance result (which is the one that I claim is difficult) gives (-0.0755,-0.0159) for me, vs. (-0.0588,-0.0148) for you, about a 20% difference. This might not be a big deal, depending on what you're trying to do.

Let's try brute force too:

## return the random-effect variances/covariances (in lower-triangle order)
## plus the residual variance, as a suitably named vector
sumfun <- function(x) {
    vv <- as.data.frame(VarCorr(x), order = "lower.tri")[, "vcov"]
    ## cheating a bit here, using internal lme4 naming functions ...
    return(setNames(vv,
       c(lme4:::tnames(x, old = FALSE, prefix = c("var", "cov")),
         "sigmasq")))
}
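
As a quick sanity check, sumfun applied to the fitted model reproduces the ML point estimates from VarCorr(s2) above:

sumfun(s2)
##     var_(Intercept)|teacher cov_occ.(Intercept)|teacher
##                  0.15426049                 -0.03884427
##             var_occ|teacher                     sigmasq
##                  0.01202631                  0.04955549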

cc <- confint(s2, method = "boot", nsim = 1000, FUN = sumfun, seed = 101,
              .progress = "txt", PBargs = list(style = 3))
## .progress/PBargs just cosmetic ...

##                                    2.5 %      97.5 %
## var_(Intercept)|teacher      0.079429623  0.24053633
## cov_occ.(Intercept)|teacher -0.067063911 -0.01479572
## var_occ|teacher              0.002733402  0.02378310
## sigmasq                      0.031952508  0.06736664

The "gold standard" here seems to be partway between my profile result and your result: lower bound on covariance is -0.067 here vs. -0.0755 (profile) or -0.0588.

answered May 16 '16 by Ben Bolker