I'm trying to fit a logistic regression using glm(family='binomial').
Here is the model:
model <- glm(f_ocur ~ altitud + UTM_X + UTM_Y + j_sin + j_cos + temp_res + pp,
             offset = log(1/off), data = mydata, family = 'binomial')
mydata has 76820 observations.
The response variable (f_ocur) is 0-1.
This data is a sample of a bigger dataset, so the idea of setting the offset is to account for the fact that the data used here is only a sample of the real data to be analysed.
For some reason the offset is not working. When I run this model I get a result, but when I run the same model without the offset I get exactly the same result. I was expecting the two results to differ, but there is no difference.
Am I doing something wrong? Should the offset be included among the linear predictors, like this:
model <- glm(f_ocur~altitud+UTM_X+UTM_Y+j_sin+j_cos+temp_res+pp+offset(log(1/off)),
data=mydata, family='binomial')
Once the model is ready, I'd like to use it with new data. The new data would be used to validate this model and has the same columns. My idea is to use:
validate <- predict(model, newdata=data2, type='response')
And here comes my question: does the predict function take into account the offset used to fit the model? If not, what should I do to get the correct probabilities for the new data?
A helpful feature of the GLM framework is the “offset” option. An offset is a model variable with a known or pre-specified coefficient. This paper presents several sample applications of offsets in property-casualty modeling applications.
The option to use an offset means we no longer need to rarefy and can use our whole dataset. If we want to plot something very close to relative abundance data, we simply use an offset = 100. There are various packages out there handling 'many' GLMs.
The Binomial Regression model is part of the family of Generalized Linear Models. GLMs are used to model the relationship between the expected value of a response variable y and a linear combination of the explanatory variables vector X.
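To make the role of the offset concrete, the relationship described above can be written out for the binomial (logit) case. This is a sketch in notation that is not from the original post: $o_i$ stands for the offset of observation $i$ (for example $\log(1/\mathrm{off}_i)$), and $c$ is a hypothetical constant.

```latex
% Linear predictor with an offset: the offset enters with a coefficient fixed at 1
\log\frac{p_i}{1-p_i} \;=\; \beta_0 + \sum_{j} \beta_j x_{ij} + o_i,
\qquad p_i = \mathrm{E}[y_i \mid x_i]

% Special case: if the offset is the same constant c for every observation,
% it can be absorbed into the intercept and the slopes are unaffected
\log\frac{p_i}{1-p_i} \;=\; (\beta_0 + c) + \sum_{j} \beta_j x_{ij}
```

Under this reading, an offset that takes the same value in every row can only shift the estimated intercept. The slope estimates would be the same with or without it.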
I think the log offset is used with the Poisson family. In the binomial case you should not use log. Check the link https://stats.stackexchange.com/questions/25415/using-offset-in-binomial-model-to-account-for-increased-numbers-of-patients
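Whatever transformation turns out to be appropriate for the offset, a quick check on simulated data can show whether glm() is picking the offset up and whether predict() carries it over to new data. The following is only a minimal sketch: the data frame toy, its columns x, y and off, and the new data new_toy are hypothetical stand-ins for mydata and data2, and the offset log(1/off) is taken from the question as-is.

```r
set.seed(42)
n <- 1000

# Hypothetical stand-in for mydata: one predictor `x`, a 0/1 response `y`,
# and a sampling fraction `off` that varies from row to row.
toy <- data.frame(x = rnorm(n), off = runif(n, 0.1, 0.9))
toy$y <- rbinom(n, 1, plogis(-1 + 0.8 * toy$x))

# Offset given as an argument, as a formula term, and not at all
fit_arg  <- glm(y ~ x, offset = log(1 / off), data = toy, family = binomial)
fit_term <- glm(y ~ x + offset(log(1 / off)), data = toy, family = binomial)
fit_none <- glm(y ~ x, data = toy, family = binomial)

# The two ways of supplying the offset should give the same coefficients
print(all.equal(coef(fit_arg), coef(fit_term)))

# Side-by-side comparison with the fit that has no offset
print(cbind(with_offset = coef(fit_term), no_offset = coef(fit_none)))

# Prediction on new data: `new_toy` must contain the `off` column,
# because the offset term in the formula is evaluated in `newdata`.
new_toy <- data.frame(x = c(-1, 0, 1), off = c(0.2, 0.5, 0.8))
p_pred <- predict(fit_term, newdata = new_toy, type = "response")

# Manual reconstruction: inverse logit of (intercept + slope * x + offset)
eta <- coef(fit_term)[1] + coef(fit_term)[2] * new_toy$x + log(1 / new_toy$off)
p_manual <- plogis(eta)
print(all.equal(unname(p_pred), unname(p_manual)))
```

If the comparison table shows identical columns on the real data, it may be worth checking whether off actually varies between rows and whether it is a column of mydata rather than an object in the workspace. With the offset written as a formula term, the new data passed to predict() needs an off column as well.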