I'm trying to fit a negative binomial model with a sqrt link. Unfortunately, it seems that I have to specify starting values. Is anybody familiar with setting starting values when running the glm.nb command (package MASS)?
When I don't use starting values, I get an error message:
no valid set of coefficients has been found: please supply starting values
Looking at ?glm.nb, it seems to be possible to set starting values, but I don't know how to do this. Some further information:
1. With the standard log link, the model can be estimated.
2. It is not possible to set the start value for the algorithm to an arbitrary single value, so for example
glm.nb(<model>,link=sqrt, start=1)
does not work!
Finding suitable starting values can be difficult for sufficiently complex problems. For setting the starting values, however (the documentation is not great, but it exists), you should learn to read the error messages. Here is a reproduction of your unsuccessful attempt using start=1 with a built-in data set:
> library(MASS)
> quine.nb1 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine,
link=sqrt, start=1)
Error in glm.fitter(x = X, y = Y, w = w, start = start, etastart = etastart, :
length of 'start' should equal 7 and correspond to initial coefs for
c("(Intercept)", "SexM", "AgeF1", "AgeF2", "AgeF3", "EthN", "LrnSL", )
It tells you exactly what it is expecting: a vector with one value for each coefficient to be estimated.
quine.nb1 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine,
link=sqrt, start=rep(1,7))
works, because I gave a vector of length 7. You might have to play around with the actual values in it to get a model that always predicts positive values. It is likely that the default algorithm for generating starting values in glm.nb gives a negative prediction (on the linear predictor scale) somewhere, and the sqrt link cannot tolerate that (unlike the log link). If you are having trouble finding valid starting values by hand, you can try fitting a simpler model and padding its estimates with 0s for the other parameters to get a good starting location.
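One way to pick values that are valid from the start, rather than guessing, is to begin at the intercept-only solution. This is only a sketch of that idea and not something glm.nb does for you: under the sqrt link the fitted mean is the square of the linear predictor, so an intercept of sqrt(mean(y)) with all other coefficients at 0 gives a positive linear predictor for every observation. The names n_coef, start_vals and quine.nb2 are made up for this example.

```r
library(MASS)  # provides glm.nb and the quine data set

# Number of coefficients the full model will estimate (7 for this formula)
n_coef <- ncol(model.matrix(Days ~ Sex + Age + Eth + Lrn, data = quine))

# Assumption: start at the intercept-only fit. With a sqrt link the mean is
# eta^2, so sqrt(mean(y)) reproduces the overall mean and keeps eta positive.
start_vals <- c(sqrt(mean(quine$Days)), rep(0, n_coef - 1))

quine.nb2 <- glm.nb(Days ~ Sex + Age + Eth + Lrn, data = quine,
                    link = sqrt, start = start_vals)
coef(quine.nb2)
```

A valid starting point does not guarantee convergence, so it is still worth checking the fitted model for warnings.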
EDIT: building up a model
Suppose you can't find valid starting values for your complicated model. Then start with a simple one, for example
> nb0 <- glm.nb(Days ~ Sex, data=quine, link=sqrt)
> coef(nb0)
(Intercept) SexM
3.9019226 0.3353578
Now let's add the next variable, reusing the previous estimates as starting values and adding 0s for the effects of the new variable (in this case Age has four levels, so it needs 3 coefficients):
> nb1 <- glm.nb(Days ~ Sex+Age, data=quine, link=sqrt, start=c(coef(nb0), 0,0,0))
> coef(nb1)
(Intercept) SexM AgeF1 AgeF2 AgeF3
3.9127405 -0.1155013 -0.5551010 0.7475166 0.5933048
You usually want to keep adding 0's and not, say, 100's, because a coefficient of 0 means that the new variable has no effect - which is exactly what the simpler model that you just fitted assumes.
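Appending zeros by hand relies on the new coefficients coming last and on counting them correctly. A slightly more defensive variant, sketched here under the assumption that the coefficient names of the smaller model also appear in the larger one, is to build a zero vector from the columns of the larger model's design matrix and fill in the old estimates by name. The object names X1, start1 and nb1_named are made up for this example.

```r
library(MASS)  # provides glm.nb and the quine data set

# Fit the simple model first
nb0 <- glm.nb(Days ~ Sex, data = quine, link = sqrt)

# Design matrix of the larger model: its column names are the coefficient names
X1 <- model.matrix(Days ~ Sex + Age, data = quine)

# Start every coefficient at 0, then copy over the estimates we already have
start1 <- setNames(numeric(ncol(X1)), colnames(X1))
start1[names(coef(nb0))] <- coef(nb0)

nb1_named <- glm.nb(Days ~ Sex + Age, data = quine, link = sqrt, start = start1)
coef(nb1_named)
```

For this formula start1 is the same vector as c(coef(nb0), 0, 0, 0), but the by-name version keeps working if the terms are reordered or a factor gains a level.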
I got a similar error while fitting a relative risk (RR) regression using a binomial model with a log link, as shown below:
adjrep <-glm(reptest ~ momagecat + paritycat + marstatcat + dept,
family = binomial(link = "log"),
data = hcm1)
Error: no valid set of coefficients has been found: please supply starting values
After following the instructions above for building up a model, I arrived at the code below and got coefficients for each of the variables.
rep3 <-glm (reptest ~ momagecat + paritycat + marstatcat + dept,
family = binomial(link = "log"),
data = hcm1,
start=c(coef(rep1),0,0,0))
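For the log-binomial case, a similar intercept-only start may also be worth trying: an intercept of log(mean(y)) with zeros elsewhere gives every observation a fitted probability equal to the overall proportion, which is strictly between 0 and 1. The sketch below uses simulated data, since hcm1 is not available here; the data frame sim and the variables y and grp are hypothetical stand-ins, not the variables from the model above.

```r
set.seed(1)

# Hypothetical simulated data standing in for hcm1
n   <- 500
grp <- factor(sample(c("a", "b"), n, replace = TRUE))
p   <- exp(-1.5 + 0.4 * (grp == "b"))  # true risks, both well below 1
sim <- data.frame(y = rbinom(n, size = 1, prob = p), grp = grp)

# Assumption: start at the intercept-only fit, i.e. log of the overall risk,
# with 0 for every other coefficient
X <- model.matrix(y ~ grp, data = sim)
start_vals <- c(log(mean(sim$y)), rep(0, ncol(X) - 1))

fit <- glm(y ~ grp, family = binomial(link = "log"),
           data = sim, start = start_vals)
exp(coef(fit))  # baseline risk and risk ratio for group "b"
```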