I am doing statistical analysis on a dataset using a GLM in R. The predictor variables are: "Probe" (type of probe used in the experiment, a factor with 4 levels), "Extraction" (type of extraction used in the experiment, a factor with 2 levels), "Tank" (the number of the tank the sample was collected from, integers from 1 to 9), and "Dilution" (the dilution of each sample, numeric: 3.125, 6.25, 12.5, 25, 50, 100). The response is the number of positive responses ("Positive") obtained from a number of repetitions of the experiment ("Rep"). I want to assess the effects of all predictor variables (and their interactions) on the number of positive responses, so I tried to fit a GLM like this:
# Two-column response: positives and non-positives per row
y <- cbind(mydata$Positive, mydata$Rep - mydata$Positive)
model1 <- glm(y ~ Probe * Extraction * Dilution * Tank, family = quasibinomial, data = mydata)
But I was later advised by my supervisor that the "Tank" predictor should not be treated as a level-based variable, i.e. it has values of 1 to 9, but that is just the tank label, so the difference between 1 and, say, 7 is not meaningful. Treating this variable as a factor would only produce a large model with poor results. So how can I treat "Tank" as a random factor and include it in the GLM?
Thanks
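It is called a "mixed-effects model". Check out the lme4 package, whose glmer function fits a GLM with random effects; (1|Tank) gives each tank its own random intercept. Note that glmer does not accept the quasi families, so binomial is used here instead of quasibinomial.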
library(lme4)
glmer(y ~ Probe + Extraction + Dilution + (1 | Tank), family = binomial, data = mydata)
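If you want to try the call before running it on your own data, here is a minimal sketch on simulated data. Everything about the simulation is an assumption made for illustration: the level names, the 10 repetitions per row, and the effect sizes are invented, and only the design (4 probes, 2 extractions, 9 tanks, 6 dilutions) mirrors the question.

```r
library(lme4)

set.seed(1)  # reproducible simulated data

# Hypothetical design: 4 probes x 2 extractions x 9 tanks x 6 dilutions
sim <- expand.grid(
  Probe      = factor(paste0("P", 1:4)),
  Extraction = factor(c("A", "B")),
  Tank       = factor(1:9),
  Dilution   = c(3.125, 6.25, 12.5, 25, 50, 100)
)
sim$Rep <- 10  # assumed number of repetitions per row

# Assumed "true" effects, chosen only to generate plausible counts
tank_shift <- rnorm(9, mean = 0, sd = 0.5)  # one intercept shift per tank
eta <- -1 + 0.02 * sim$Dilution + 0.3 * (sim$Extraction == "B") +
  tank_shift[as.integer(sim$Tank)]
sim$Positive <- rbinom(nrow(sim), size = sim$Rep, prob = plogis(eta))

# Binomial GLMM with a random intercept for Tank
fit <- glmer(
  cbind(Positive, Rep - Positive) ~ Probe + Extraction + Dilution + (1 | Tank),
  family = binomial, data = sim
)

summary(fit)  # fixed-effect estimates for Probe, Extraction, Dilution
VarCorr(fit)  # estimated between-tank standard deviation
```

Putting the cbind() directly in the formula, rather than building y separately, keeps the response tied to the data argument, which is usually safer if you later subset the data or call predict().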
Also, you should probably use + instead of * to combine the predictors. In an R formula, * expands to all main effects plus every interaction among them, up to the four-way interaction, which would likely give a large, overfitted model. If you have a specific reason to believe that an interaction exists, code that interaction explicitly with the : operator.
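To see what the two operators do to the model, you can ask R to expand the formula without fitting anything. This is only a sketch in base R; Probe:Extraction is a hypothetical interaction picked for illustration, not one the data are known to need.

```r
# Formula with `*`: main effects plus every interaction up to the four-way term
full <- attr(terms(~ Probe * Extraction * Dilution * Tank), "term.labels")
length(full)  # 15 terms in total

# Additive formula plus one explicitly coded interaction
chosen <- attr(terms(~ Probe + Extraction + Dilution + Probe:Extraction), "term.labels")
chosen
```

The second formula can be used in glmer by appending + (1 | Tank) to it.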