Please see Ben Bolker's answer of 16/05/2016 below for the appropriate solution. Original post follows.
I am fitting several multilevel models with lme4. I would like to report the variances and covariances of the random effects, and to automate this process.
I know I can get the variances with as.data.frame(VarCorr(mymodel)), and I know that I can get the confidence intervals with confint(mymodel). Clearly I can merge/rbind the two tables and put confidence intervals around the variances by simply squaring the output of confint() at the appropriate rows and columns, but I have not been able to find a convincing method to calculate the covariances other than by hand.
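To illustrate the variance part, here is a minimal sketch of what I can already automate (the sleepstudy example is purely for illustration):

library(lme4)
mymodel <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

ci <- confint(mymodel, oldNames = FALSE)        ## CIs on the sd/cor scale
sd.rows <- grep("^(sd_|sigma)", rownames(ci))   ## rows holding standard deviations
ci[sd.rows, ]^2                                 ## squared sd limits = variance limits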
Say the result of confint is:
a <- c(6.2, -0.4, 2.2, 1.5, -0.4, -0.5, 2.8, -0.9, 1.3, 3.9)
b <- c(6.8, -0.2, 2.5, 2.5, 0.1, 0.2, 4.8, -0.7, 2.3, 5.0)
conf <- data.frame(a, b,
                   row.names = c("sd_(Intercept)|ID", "cor_Time.(Intercept)|ID",
                                 "sd_Time|ID", "sd_(Intercept)|Group",
                                 "cor_Time.(Intercept)|Group",
                                 "cor_I(Time^2).(Intercept)|Group",
                                 "sd_Time|Group", "cor_I(Time^2).Time|Group",
                                 "sd_I(Time^2)|Group", "sigma"))
colnames(conf) <- c("2.5%", "97.5%")
conf
How can I automate the various multiplications to get the covariances? For example, for the Time/(Intercept) covariance at the ID level:

cov.time.intercept <- conf[2, ] * conf[1, ] * conf[3, ]  ## cor * sd_(Intercept) * sd_Time
I have tried splitting standard deviations and correlations, creating "ID", "Time", "I(Time^2)" and "(Intercept)" variables and then matching by two columns, but I am not getting anywhere. The issue is that each time the model changes you might have different numbers of variances and covariances, and different triangular matrices.
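For what it's worth, here is the kind of name-matching I have been attempting, written out as a sketch (naive.cov is my own helper name, and as the answer below explains, the multiplication itself is not statistically justified):

## parse each cor_ row name, find the two matching sd_ rows, multiply
naive.cov <- function(conf) {
    cors <- grep("^cor_", rownames(conf), value = TRUE)
    out <- lapply(cors, function(nm) {
        x    <- sub("^cor_", "", nm)
        grp  <- sub("^.*\\|", "", x)                       ## grouping factor
        trms <- strsplit(sub("\\|.*$", "", x), ".", fixed = TRUE)[[1]]
        conf[nm, ] * conf[paste0("sd_", trms[1], "|", grp), ] *
                     conf[paste0("sd_", trms[2], "|", grp), ]
    })
    out <- do.call(rbind, out)
    rownames(out) <- sub("^cor_", "cov_", cors)
    out
}
naive.cov(conf)

This breaks down if a term name itself contains a ".", but it shows the matching problem.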
Thank you for any help,
k.
Your computation seems to give plausible answers, but it doesn't make sense (to me; I stand ready to be corrected/enlightened ...). Suppose cov = corr*sd1*sd2, and suppose that ci(.) is the (lower or upper) confidence limit for a quantity. It is by no means true that ci(cov) = ci(corr)*ci(sd1)*ci(sd2) (it is interesting that you get reasonable answers; I think this is most likely to happen when the quantities are approximately uncorrelated ...). If you had the variances of each component and the covariances among them (I don't mean the random-effect variances and covariances themselves, but their sampling variances/covariances), you could propagate them approximately using the delta method, but these are fairly hard to get (see here).
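For concreteness, here is a minimal delta-method sketch for cov = corr*sd1*sd2; the point estimates and the sampling variance-covariance matrix V are made up, because obtaining V is exactly the hard part:

## delta method: var(cov) is approximately g' V g, with g the gradient of
## corr*sd1*sd2; 'est' and 'V' are HYPOTHETICAL stand-ins
est <- c(sd1 = 0.39, sd2 = 0.11, corr = -0.90)
V   <- diag(c(0.02, 0.005, 0.05)^2)          ## pretend sampling vcov (diagonal)
g   <- c(est[["corr"]] * est[["sd2"]],       ## d(cov)/d(sd1)
         est[["corr"]] * est[["sd1"]],       ## d(cov)/d(sd2)
         est[["sd1"]]  * est[["sd2"]])       ## d(cov)/d(corr)
cov.hat <- est[["corr"]] * est[["sd1"]] * est[["sd2"]]
se.cov  <- sqrt(drop(t(g) %*% V %*% g))      ## delta-method standard error
cov.hat + c(-1, 1) * 1.96 * se.cov           ## approximate Wald interval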
The "right" way to do this, as far as I can tell, is do the likelihood profile computations on the variance-covariance scale instead of the standard deviation-correlation scale. This wasn't previously possible, but it is now (with the development version on Github).
Install latest version:
library(remotes) ## for install_github (or library(devtools))
install_github("lme4/lme4")
Preliminaries:
chap12 <- foreign::read.dta(file = "ch12.dta")
library(lme4)
snijders <- lmer(prox_pup ~ 1 + prox_sel + (1 + occ | teacher),
                 data = chap12)
as.data.frame(VarCorr(snijders))
## grp var1 var2 vcov sdcor
## 1 teacher (Intercept) <NA> 0.15617962 0.3951957
## 2 teacher occ <NA> 0.01205317 0.1097869
## 3 teacher (Intercept) occ -0.03883458 -0.8950676
## 4 Residual <NA> <NA> 0.04979762 0.2231538
We have to be a bit careful when comparing results because profile.merMod, which we'll use shortly, automatically (and silently!) converts fits from the default REML to maximum likelihood fits (because a profile based on REML might be statistically dicey); however, it doesn't look like this makes a huge difference.
s2 <- refitML(snijders)
as.data.frame(VarCorr(s2))
## grp var1 var2 vcov sdcor
## 1 teacher (Intercept) <NA> 0.15426049 0.3927601
## 2 teacher occ <NA> 0.01202631 0.1096645
## 3 teacher (Intercept) occ -0.03884427 -0.9018483
## 4 Residual <NA> <NA> 0.04955549 0.2226106
p.sd <- profile(s2, which = "theta_", signames = FALSE)
p.vcov <- profile(s2, which = "theta_", prof.scale = "varcov",
                  signames = FALSE)
We get some warnings about non-monotonic profiles ...
confint(p.vcov)
## 2.5 % 97.5 %
## var_(Intercept)|teacher 0.08888931 0.26131067
## cov_occ.(Intercept)|teacher -0.07553263 -0.01589043
## var_occ|teacher 0.00000000 0.02783863
## sigma 0.03463184 0.07258777
What if we check these against the squares of the relevant standard-deviation elements?
confint(p.sd)[c(1,3,4),]^2
## 2.5 % 97.5 %
## sd_(Intercept)|teacher 0.089089363 0.26130970
## sd_occ|teacher 0.002467408 0.02779329
## sigma 0.034631759 0.07263869
These match pretty well, except for the lower bound of the occ variance; they also match your results above. However, the covariance result (which is the one that I claim is difficult) gives (-0.0755, -0.0159) for me, vs. (-0.0588, -0.0148) for you, about a 20% difference. This might not be a big deal, depending on what you're trying to do.
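As an aside, if what you ultimately want is the automated report from your question, the profile intervals can be bolted onto the VarCorr output; a sketch using the objects above (both use the lower-triangle ordering, so the rows should line up, but that is worth checking):

vc.tab <- as.data.frame(VarCorr(s2), order = "lower.tri")  ## estimates, vcov scale
ci.tab <- confint(p.vcov)                                  ## profile CIs, same scale
data.frame(vc.tab[, c("grp", "var1", "var2")],
           est = vc.tab[, "vcov"],
           ci.tab, check.names = FALSE)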
Let's try brute force too:
sumfun <- function(x) {
    vv <- as.data.frame(VarCorr(x), order = "lower.tri")[, "vcov"]
    ## cheating a bit here, using internal lme4 naming functions ...
    return(setNames(vv,
                    c(lme4:::tnames(x, old = FALSE, prefix = c("var", "cov")),
                      "sigmasq")))
}
cc <- confint(s2, method = "boot", nsim = 1000, FUN = sumfun, seed = 101,
              .progress = "txt", PBargs = list(style = 3))
## .progress/PBargs just cosmetic ...
cc
## 2.5 % 97.5 %
## var_(Intercept)|teacher 0.079429623 0.24053633
## cov_occ.(Intercept)|teacher -0.067063911 -0.01479572
## var_occ|teacher 0.002733402 0.02378310
## sigmasq 0.031952508 0.06736664
The "gold standard" here seems to be partway between my profile result and your result: lower bound on covariance is -0.067 here vs. -0.0755 (profile) or -0.0588.