
glm.nb with sqrt link

Tags: r, statistics

I'm trying to fit a negative binomial model with a sqrt link. Unfortunately, it seems that I have to specify starting values. Is anybody familiar with setting starting values when running the glm.nb command (package MASS)?

When I don't use starting values, I get an error message:

no valid set of coefficients has been found: please supply starting values

Looking at ?glm.nb, it seems to be possible to set starting values, but I have no idea how to do this. Some further information: (1) With the standard log link, the model can be estimated. (2) I cannot simply set the start value to an arbitrary number; for example

glm.nb(<model>,link=sqrt, start=1)

does not work!

user734124 Avatar asked Feb 25 '23 04:02



2 Answers

Finding suitable starting values can be difficult for sufficiently complex problems. However, to set starting values (the documentation for which is not great, but exists), it helps to read the error messages carefully. Here is a reproduction of your unsuccessful attempt with start=1, using the quine data set that ships with MASS:

> library(MASS)
> quine.nb1 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine, 
                    link=sqrt, start=1)
Error in glm.fitter(x = X, y = Y, w = w, start = start, etastart = etastart,  : 
  length of 'start' should equal 7 and correspond to initial coefs for 
  c("(Intercept)", "SexM", "AgeF1", "AgeF2", "AgeF3", "EthN", "LrnSL", )

It tells you exactly what it is expecting: a vector with one starting value for each coefficient to be estimated.

quine.nb1 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine, 
                    link=sqrt, start=rep(1,7))

works, because I gave a vector of length 7. You might have to play around with the actual values in it to get a model whose linear predictor is positive for every observation. It is likely that the default algorithm for generating starting values in glm.nb produces a negative prediction somewhere, and the sqrt link cannot tolerate that (unlike the log link). If you are having trouble finding valid starting values by hand, you can fit a simpler model first and pad its estimates with 0's for the additional parameters to get a good starting location.
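One way to see how long `start` must be, and whether a candidate vector is admissible, is to build the design matrix yourself. This is only a sketch, assuming the `quine` data from MASS; the names `X`, `start`, and `eta` are illustrative and not part of glm.nb.

```r
library(MASS)  # provides glm.nb and the quine data

# Design matrix of the full model: one column per coefficient
X <- model.matrix(Days ~ Sex + Age + Eth + Lrn, data = quine)
ncol(X)      # how many starting values are needed
colnames(X)  # the order in which they must be supplied

# Candidate starting values and the linear predictor they imply
start <- rep(1, ncol(X))
eta <- drop(X %*% start)

# With the sqrt link, every element of the linear predictor should be positive
all(eta > 0)
```

If the last line returns FALSE, the candidate vector puts at least one observation outside the valid range and is worth adjusting before passing it to glm.nb.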

EDIT: building up a model

Suppose you can't find valid starting values for your complicated model. Then start with a simple one, for example

> nb0 <- glm.nb(Days ~ Sex, data=quine, link=sqrt)
> coef(nb0)
(Intercept)        SexM 
  3.9019226   0.3353578 

Now let's add the next variable, using the previous estimates as starting values and adding 0's for the effect of the new variable (Age has four levels, so it needs 3 coefficients):

> nb1 <- glm.nb(Days ~ Sex+Age, data=quine, link=sqrt, start=c(coef(nb0), 0,0,0))
> coef(nb1)
(Intercept)        SexM       AgeF1       AgeF2       AgeF3 
  3.9127405  -0.1155013  -0.5551010   0.7475166   0.5933048 

You usually want to keep adding 0's and not, say, 100's, because a coefficient of 0 means that the new variable has no effect - which is exactly what the simpler model that you just fitted assumes.
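The padding step can be automated when the model is built up in several stages. The sketch below uses a hypothetical helper, `pad_start` (not part of MASS), which matches coefficients by name and fills everything new with 0. It assumes the terms shared by both models produce the same coefficient names, which holds for the additive models used here.

```r
library(MASS)  # provides glm.nb and the quine data

# Hypothetical helper: reuse the coefficients of a smaller fit and pad with
# zeros for every additional column of the larger model's design matrix
pad_start <- function(small_fit, big_formula, data) {
  big_names <- colnames(model.matrix(big_formula, data = data))
  start <- setNames(numeric(length(big_names)), big_names)  # all zeros
  old <- coef(small_fit)
  shared <- intersect(names(old), big_names)
  start[shared] <- old[shared]  # carry over what the smaller model estimated
  start
}

nb0 <- glm.nb(Days ~ Sex, data = quine, link = sqrt)

f1 <- Days ~ Sex + Age
nb1 <- glm.nb(f1, data = quine, link = sqrt,
              start = pad_start(nb0, f1, quine))

f2 <- Days ~ Sex + Age + Eth + Lrn
nb2 <- glm.nb(f2, data = quine, link = sqrt,
              start = pad_start(nb1, f2, quine))
coef(nb2)
```

Matching by name rather than by position means the helper still works if the new terms are not appended at the end of the formula.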

Aniko Avatar answered Feb 26 '23 20:02



I got a similar error when fitting a relative risk (RR) regression, i.e. a binomial model with a log link, as shown below:

adjrep <- glm(reptest ~ momagecat + paritycat + marstatcat + dept,
             family = binomial(link = "log"),
             data = hcm1)

Error: no valid set of coefficients has been found: please supply starting values

After following the instructions above for building up a model, I arrived at the code below and got coefficients for each of the variables. Here rep1 is a smaller model fitted in an earlier step.

rep3 <- glm(reptest ~ momagecat + paritycat + marstatcat + dept,
            family = binomial(link = "log"),
            data = hcm1,
            start = c(coef(rep1), 0, 0, 0))
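The same build-up idea can be sketched for a log-binomial model on simulated data, since hcm1 is not available here. The data frame `d` and the variables `x1`, `x2`, and `y` are made up for illustration. The sketch starts from the intercept-only model, whose estimate is log(mean(y)), and pads it with 0's, so every starting fitted probability equals the observed proportion and lies strictly between 0 and 1.

```r
set.seed(1)

# Simulated stand-in data (assumption: not the hcm1 data from the post)
n <- 500
d <- data.frame(x1 = rbinom(n, 1, 0.5), x2 = rbinom(n, 1, 0.3))
d$y <- rbinom(n, 1, exp(-1.5 + 0.3 * d$x1 + 0.4 * d$x2))

# Intercept-only log-binomial estimate, padded with zeros for x1 and x2
start0 <- c(log(mean(d$y)), 0, 0)

fit <- glm(y ~ x1 + x2, family = binomial(link = "log"), data = d,
           start = start0)
coef(fit)
```

With real data that has factor covariates, the number of 0's has to match the number of dummy columns each factor contributes, as in the Age example above.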
user11314548 Avatar answered Feb 26 '23 21:02
