
Offset not working in binomial GLM

Tags: r, glm

I'm trying to fit a logistic regression using glm(family='binomial').

Here is the model:

model<-glm(f_ocur~altitud+UTM_X+UTM_Y+j_sin+j_cos+temp_res+pp, 
           offset=(log(1/off)), data=mydata, family='binomial')

mydata has 76820 observations. The response variable (f_ocur) is binary (0/1).
This data is a sample of a bigger dataset, so the idea of setting the offset is to account for the fact that the data used here is only a sample of the real data to be analysed.

For some reason the offset is not working. When I run this model I get a result, but when I run the same model without the offset I get exactly the same result. I was expecting the two fits to differ.

Am I doing something wrong? Should the offset go inside the formula with the linear predictors, like this:

model <- glm(f_ocur~altitud+UTM_X+UTM_Y+j_sin+j_cos+temp_res+pp+offset(log(1/off)), 
             data=mydata, family='binomial')
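The two ways of supplying the offset can be compared on simulated data. This is a minimal sketch, not the asker's model: the data frame `dat` and its columns `x`, `off` and `y` are made up for illustration, and `off` is deliberately drawn so that it varies from row to row.

```r
# Hypothetical simulated data; none of these names come from the question.
set.seed(1)
n <- 500
dat <- data.frame(x = rnorm(n), off = runif(n, 1, 5))
eta <- -0.5 + 0.8 * dat$x + log(1 / dat$off)   # linear predictor incl. offset
dat$y <- rbinom(n, size = 1, prob = plogis(eta))

# Offset passed through the `offset` argument of glm()
m_arg <- glm(y ~ x, offset = log(1 / off), data = dat, family = binomial)

# Offset written inside the formula with offset()
m_frm <- glm(y ~ x + offset(log(1 / off)), data = dat, family = binomial)

# Same model with no offset at all
m_none <- glm(y ~ x, data = dat, family = binomial)

# Compare coefficients: argument vs. formula, and offset vs. no offset
same_arg_frm  <- isTRUE(all.equal(coef(m_arg), coef(m_frm)))
same_arg_none <- isTRUE(all.equal(coef(m_arg), coef(m_none)))
print(c(arg_vs_formula = same_arg_frm, arg_vs_none = same_arg_none))
```

In this sketch both specifications fit the same model, while dropping the offset changes the coefficients. Note that if the offset is the same for every row, it is absorbed into the intercept: the slopes and fitted values are unchanged.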

Once the model is ready, I'd like to use it with new data. The new data would be used to validate this model and has the same columns. My idea is to use:

validate <- predict(model, newdata=data2, type='response')

And here comes my question: does the predict function take into account the offset used to fit the model? If not, what should I do to get the correct probabilities for the new data?
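One way to check how `predict()` treats the offset is to compare its output against a prediction computed by hand. The sketch below again uses made-up data (`train`, `new`, `x`, `off`, `y` are hypothetical names) and assumes the offset is written inside the formula and that the new data contains the `off` column.

```r
# Hypothetical training data with a row-varying offset variable `off`.
set.seed(2)
n <- 400
train <- data.frame(x = rnorm(n), off = runif(n, 1, 5))
train$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * train$x + log(1 / train$off)))

fit <- glm(y ~ x + offset(log(1 / off)), data = train, family = binomial)

# New data must carry the same columns, including `off`.
new <- data.frame(x = c(-1, 0, 1), off = c(1, 2, 4))
p_pred <- predict(fit, newdata = new, type = "response")

# Manual prediction: inverse logit of (intercept + slope * x + offset)
b <- coef(fit)
p_manual <- plogis(b[["(Intercept)"]] + b[["x"]] * new$x + log(1 / new$off))

# TRUE means predict() included the offset evaluated on the new data
print(isTRUE(all.equal(unname(p_pred), p_manual)))
```

Setting `off` to 1 in the new data makes the offset term `log(1/1) = 0`, which gives predictions without the adjustment. Whether that is the right thing to do depends on what `off` represents.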

asked Nov 05 '12 18:11 by lpchaparro



1 Answer

I think a log offset is normally used with the Poisson family. With the binomial family you should not use a log offset. See https://stats.stackexchange.com/questions/25415/using-offset-in-binomial-model-to-account-for-increased-numbers-of-patients
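The difference between the two families can be made concrete by writing out where the offset enters. This is a sketch under the usual default links (log for Poisson, logit for binomial); the symbols $t_i$ (exposure) and $o_i$ (offset) are notation introduced here, not taken from the question.

```latex
% Poisson, log link: a log-exposure offset turns the model into one for a rate
\log \mu_i = \log t_i + x_i^{\top}\beta
\quad\Longleftrightarrow\quad
\frac{\mu_i}{t_i} = \exp\!\left(x_i^{\top}\beta\right)

% Binomial, logit link: the offset is added on the log-odds scale
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = o_i + x_i^{\top}\beta
\quad\Longleftrightarrow\quad
\frac{p_i}{1 - p_i} = e^{o_i}\exp\!\left(x_i^{\top}\beta\right)
```

So with a logit link an offset of $o_i = \log(1/\mathrm{off}_i)$ rescales the odds by $1/\mathrm{off}_i$, not the probability itself, which is why it does not behave like an exposure term in a Poisson model.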

answered Nov 15 '22 04:11 by Aybek Khodiev