I've tryed both the approaches but I find difficulties with both.. I try to explain better which is my problem before I tell you my questions with the two approaches.
I have the dataset "acceptances" in which I have the number of acceptances for everyday in an hospital with the indipendent variables described before. The hospital has three places in which we do the visits..So in my dataset I have 3 rows per day one for every place. The dataset seems like:
Date Place NumerAccept weekday month NoConvention Rain
2008-01-02 Place1 203 wed Gen 0 1
2008-01-02 Place2 70 wed Gen 0 1
2008-01-02 Place3 9 wed Gen 0 1
2008-01-03 Place1 345 thu Gen 0 1
2008-01-03 Place2 24 thu Gen 0 1
2008-01-03 Place3 99 thu Gen 0 1
2008-01-04 Place1 339 fri Gen 0 0
2008-01-04 Place2 36 fri Gen 0 0
2008-01-04 Place3 101 fri Gen 0 0
.... and so on... I have the dataset until yesterday, so the last three rows are the acceptances of yesterday 29 of July 2013. Now I do my Poisson regression:
poisson_reg=glm(NumeberAccept ~ 1 + weekday + month + place + NoConvention + Rain,
family = poisson(link = log), data = acceptances)
Now for my predictions I create a new dataset acceptances_2 from which I want to calculate the prediction interval for the Number of Acceptances for the next 2 months!! So the first row will be the number of acceptations today, and the last row will be the acceptances on September 29.
I don't know if this question has already an answer, but I wasn't able to find it. I'm trying to do a Poisson regression in R and I want to obtain the prediction intervals. I saw that the predict function for lm
gives it writing 'interval="prediction"'
but it doesn't work with predict.glm
!
Does someone know if there is a way to have those prediction intervals?? Can you type the code if you have some examples?
So I have to count the number of everyday acceptances in an hospital and I have the following code:
poisson_reg=glm(NumeberAccept ~ 1 + weekday + month + place + NoConvention + Rain,
family = poisson(link = log), data = dataset)
summary(poisson_reg)
Now if I type in R predict(poisson_reg, newdata, type="responce")
I have the prediction for the number of acceptances for everyday but I need the prediction intervals too!
I saw that for an object of class "lm"
in the predict call you can write: predict(poisson_reg, newdata, interval="prediction")
and it gives the 95% prediction intervals. Is there a way to obtain the same with an object of class "glm"
?
This is perhaps more of a statistical question than a programming question, but:
Stealing example data from a previous question:
ex <- read.table(
header=TRUE, text='
Number.Accepted Weekday Month Place
20 6 8 1
16 7 8 1
12 4 8 2
11 7 1 1
12 1 4 1
12 7 10 2
13 5 6 2
')
ex.glm <- glm(Number.Accepted ~ Weekday + Month + Place,
family = poisson, data = ex)
Data frame for which we want prediction intervals:
newdata <- data.frame(Weekday=c(5,6),Month=c(9,9),Place=c(1,1))
Something like this:
bootSimFun <- function(preddata,fit,data) {
bdat <- data[sample(seq(nrow(data)),size=nrow(data),replace=TRUE),]
bfit <- update(fit,data=bdat)
bpred <- predict(bfit,type="response",newdata=preddata)
rpois(length(bpred),lambda=bpred)
}
You can also use replicate()
from base R, but plyr::raply()
is convenient:
library(plyr)
set.seed(101)
simvals <- raply(500,bootSimFun(preddata=newdata,fit=ex.glm,data=ex))
t(apply(simvals,2,quantile,c(0.025,0.975)))
## 2.5% 97.5%
## 1 7.000 40
## 2 7.475 36
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With