
glmulti runs indefinitely when using genetic algorithm with lme4

I'm using glmulti for model averaging in R. There are ~10 variables in my model, making exhaustive screening impractical - I therefore need to use the genetic algorithm (GA) (call: method = "g").

I need to include random effects, so I'm using glmulti as a wrapper for lme4. Methods for doing this are described at http://www.inside-r.org/packages/cran/glmulti/docs/glmulti, and a pdf included with the glmulti package goes into more detail. The problem is that when I tell glmulti to use the GA in this setting, it runs indefinitely, even after the best model has been found.

This is the example taken from the pdf included in the glmulti package:

library(lme4)
library(glmulti)

# create a function for glmulti to act as a wrapper for lmer:
lmer.glmulti <- function(formula, data, random = "", ...) {
  lmer(paste(deparse(formula), random), data = data, REML = FALSE, ...)
}

# set some random variables:
y = runif(30,0,10) # mock dependent variable
a = runif(30) # dummy covariate
b = runif(30) # another dummy covariate
c = runif(30) # and another one
x = as.factor(round(runif(30),1))# dummy grouping factor

# run exhaustive screening with lmer:
bab <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, random = "+(1|x)")

This works fine. The problem is when I tell it to use the genetic algorithm:

babs <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, random = "+(1|x)", method = "g")

It just keeps running indefinitely and the AIC does not change:

...

After 19550 generations:
Best model: y~1
Crit= 161.038899734164
Mean crit= 164.13629335762
Change in best IC: 0 / Change in mean IC: 0

After 19560 generations:
Best model: y~1
Crit= 161.038899734164
Mean crit= 164.13629335762
Change in best IC: 0 / Change in mean IC: 0

After 19570 generations:
Best model: y~1
Crit= 161.038899734164
Mean crit= 164.13629335762

... etc.

I have tried the arguments that tell glmulti when to stop (deltaB = 0, deltaM = 0.01, conseq = 6), but nothing seems to work. I think the problem must lie in how the wrapper function is set up (?). It may be something really obvious, but I'm new to R and can't work it out.

Any help with this would be much appreciated.

Thomas asked Sep 01 '12 14:09


2 Answers

I received the solution from the package maintainer. The issue is the confsetsize argument, which sets the number of models glmulti looks for. Its default value is 100.

According to ?glmulti, this argument is:

The number of models to be looked for, i.e. the size of the returned confidence set.

The solution is to set confsetsize so that it is less than or equal to the total number of candidate models. Otherwise the genetic algorithm apparently keeps looking for more models than exist.

Starting with the example from the OP that did not stop:

babs <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, 
                random = "+(1|x)", method = "g")

glmulti will report the total number of candidate models when run with method = "d":

babs <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, 
                random = "+(1|x)", method = "d")



Initialization...
TASK: Diagnostic of candidate set.
Sample size: 30
0 factor(s).
3 covariate(s).
...
Your candidate set contains 64 models.

Thus, setting confsetsize to less than or equal to 64 will result in the desired behavior.

babs <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, 
                random = "+(1|x)", method = "g", confsetsize = 64)
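
The figure of 64 can also be checked by hand. This is only a rough sketch, assuming the candidate set contains covariates only (no factors), an intercept that is always included, and no marginality rule, so that each main effect and each pairwise interaction can be switched on or off independently. The helper name n_candidates is made up for illustration and is not part of glmulti.

```r
# Hypothetical helper (not part of glmulti): size of the candidate set for
# n_vars covariates at level = 2, assuming every main effect and every
# pairwise interaction may be included or excluded independently.
n_candidates <- function(n_vars) {
  n_terms <- n_vars + choose(n_vars, 2)  # main effects + pairwise interactions
  2^n_terms                              # each term is either in or out
}

n_candidates(3)  # 64, matches the diagnostic output above
n_candidates(4)  # 1024
```

Under these assumptions three covariates give fewer models than the default confsetsize of 100, while four covariates already give far more.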

However, for small candidate sets like this one it may be sufficient to use the exhaustive search (method = "h"):

babs <- glmulti(y~a*b*c, level = 2, fitfunc = lmer.glmulti, 
                random = "+(1|x)", method = "h")

David LeBauer answered Nov 12 '22 16:11


Right, I've worked this one out - the problem is that the example (above) I was using to test run this package only contains 3 variables. When you add in a fourth it works fine:

d = runif(30)

And run again telling it to use GA:

babs <- glmulti(y~a*b*c*d, level = 2, fitfunc = lmer.glmulti, random = "+(1|x)", method = "g")

Returns:

...

After 190 generations:
Best model: y~1
Crit= 159.374382952181
Mean crit= 163.380382861026
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.

Using glmulti out-of-the-box with a GLM gives the same result if you try to use the GA with three or fewer variables. This is not really an issue, however, because with only three variables it is possible to do an exhaustive search. The problem was the example.

Thomas answered Nov 12 '22 14:11