Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Prediction intervals for poisson regression on R

I've tryed both the approaches but I find difficulties with both.. I try to explain better which is my problem before I tell you my questions with the two approaches.

I have the dataset "acceptances" in which I have the number of acceptances for everyday in an hospital with the indipendent variables described before. The hospital has three places in which we do the visits..So in my dataset I have 3 rows per day one for every place. The dataset seems like:

Date        Place    NumerAccept    weekday month   NoConvention    Rain

2008-01-02  Place1        203       wed     Gen         0             1
2008-01-02  Place2         70       wed     Gen         0             1
2008-01-02  Place3          9       wed     Gen         0             1
2008-01-03  Place1        345       thu     Gen         0             1
2008-01-03  Place2         24       thu     Gen         0             1
2008-01-03  Place3         99       thu     Gen         0             1
2008-01-04  Place1        339       fri     Gen         0             0
2008-01-04  Place2         36       fri     Gen         0             0
2008-01-04  Place3        101       fri     Gen         0             0

.... and so on... I have the dataset until yesterday, so the last three rows are the acceptances of yesterday 29 of July 2013. Now I do my Poisson regression:

poisson_reg=glm(NumeberAccept ~ 1 + weekday + month + place + NoConvention + Rain, 
                    family = poisson(link = log), data = acceptances)

Now for my predictions I create a new dataset acceptances_2 from which I want to calculate the prediction interval for the Number of Acceptances for the next 2 months!! So the first row will be the number of acceptations today, and the last row will be the acceptances on September 29.


I don't know if this question has already an answer, but I wasn't able to find it. I'm trying to do a Poisson regression in R and I want to obtain the prediction intervals. I saw that the predict function for lm gives it writing 'interval="prediction"' but it doesn't work with predict.glm!

Does someone know if there is a way to have those prediction intervals?? Can you type the code if you have some examples?

So I have to count the number of everyday acceptances in an hospital and I have the following code:

poisson_reg=glm(NumeberAccept ~ 1 + weekday + month + place + NoConvention + Rain, 
                family = poisson(link = log), data = dataset)
summary(poisson_reg)

Now if I type in R predict(poisson_reg, newdata, type="responce") I have the prediction for the number of acceptances for everyday but I need the prediction intervals too! I saw that for an object of class "lm" in the predict call you can write: predict(poisson_reg, newdata, interval="prediction") and it gives the 95% prediction intervals. Is there a way to obtain the same with an object of class "glm"?

like image 688
klair Avatar asked Feb 16 '23 18:02

klair


1 Answers

This is perhaps more of a statistical question than a programming question, but:

Stealing example data from a previous question:

ex <- read.table(
  header=TRUE, text='
Number.Accepted  Weekday    Month   Place
  20    6   8   1
  16    7   8   1
  12    4   8   2
  11    7   1   1
  12    1   4   1
  12    7   10  2
  13    5   6   2
')
ex.glm <- glm(Number.Accepted ~ Weekday + Month + Place,
              family = poisson, data = ex)

Data frame for which we want prediction intervals:

newdata <- data.frame(Weekday=c(5,6),Month=c(9,9),Place=c(1,1))

Something like this:

bootSimFun <- function(preddata,fit,data) {
    bdat <- data[sample(seq(nrow(data)),size=nrow(data),replace=TRUE),]
    bfit <- update(fit,data=bdat)
    bpred <- predict(bfit,type="response",newdata=preddata)
    rpois(length(bpred),lambda=bpred)
}

You can also use replicate() from base R, but plyr::raply() is convenient:

library(plyr)
set.seed(101)
simvals <- raply(500,bootSimFun(preddata=newdata,fit=ex.glm,data=ex))
t(apply(simvals,2,quantile,c(0.025,0.975)))
##    2.5% 97.5%
## 1 7.000    40
## 2 7.475    36
like image 54
Ben Bolker Avatar answered Feb 18 '23 11:02

Ben Bolker