
Fitting a zero-inflated Poisson distribution in R

Tags: r, distribution

I have a vector of count data that is strongly overdispersed and zero-inflated.

The vector looks like this:

i.vec=c(0,63,1,4,1,44,2,2,1,0,1,0,0,0,0,1,0,0,3,0,0,2,0,0,0,0,0,2,0,0,0,0,
0,0,0,0,0,0,0,0,6,1,11,1,1,0,0,0,2)
m=mean(i.vec)
# 3.040816
sig=sd(i.vec)
# 10.86078
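
A quick way to check the overdispersion and zero-excess description is to compare the data against what a plain Poisson with the same mean would imply. This is only a rough base-R sketch; the variable names `dispersion`, `p0_observed`, and `p0_poisson` are illustrative and not part of the original post.

```r
# Count vector copied from the question
i.vec <- c(0,63,1,4,1,44,2,2,1,0,1,0,0,0,0,1,0,0,3,0,0,2,0,0,0,0,0,2,0,0,0,0,
           0,0,0,0,0,0,0,0,6,1,11,1,1,0,0,0,2)

m <- mean(i.vec)

# Dispersion index: variance / mean, which is about 1 for Poisson data
dispersion <- var(i.vec) / m

# Share of zeros observed vs. the share a Poisson with the same mean would give
p0_observed <- mean(i.vec == 0)
p0_poisson  <- dpois(0, lambda = m)

print(c(dispersion = dispersion,
        p0_observed = p0_observed,
        p0_poisson = p0_poisson))
```

For this vector the dispersion index is far above 1, and 30 of the 49 values are zero, many more than a Poisson with mean 3.04 would produce. Both points are consistent with the description above, although neither says which alternative distribution fits.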

I would like to fit a distribution to this, which I strongly suspect will be a zero-inflated Poisson (ZIP). But I need to perform a significance test to demonstrate that a ZIP distribution fits the data.

If I had a standard count distribution such as a Poisson, binomial, or negative binomial, I could do a chi-square goodness-of-fit test using the function goodfit() in the package vcd, but I don't know of any tests that I can perform for zero-inflated data.

Laura asked Aug 23 '11 06:08



1 Answer

Here is one approach:

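# LOAD LIBRARIES
library(fitdistrplus)    # fits distributions using maximum likelihood
library(gamlss)          # defines pdf, cdf of ZIP (dZIP, pZIP)


# FIT DISTRIBUTION
# mu = mean of the Poisson component, sigma = probability of a structural (extra) zero,
# so P(X = 0) = sigma + (1 - sigma) * exp(-mu)
fit_zip = fitdist(i.vec, 'ZIP', start = list(mu = 2, sigma = 0.5))

# VISUALIZE FIT AND COMPUTE GOODNESS OF FIT
plot(fit_zip)
# print.test comes from older fitdistrplus releases; if your version rejects it
# as an unused argument, call gofstat(fit_zip) instead
gofstat(fit_zip, print.test = TRUE)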

Based on the diagnostic plots and the goodness-of-fit statistics, ZIP does not look like a good fit for these data.
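
If those packages are not available, the same idea can be sketched in base R: fit the ZIP by maximum likelihood with `optim()`, then run a Pearson chi-square test on pooled cells. This is a rough illustration rather than the method used above. The helper names `zip_nll` and `zip_cdf` and the cell boundaries (0, 1-5, 6-10, 11+) are assumptions made for this sketch. Some expected cell counts are likely to fall below 5, so the chi-square approximation should be treated with caution.

```r
# Count vector copied from the question
i.vec <- c(0,63,1,4,1,44,2,2,1,0,1,0,0,0,0,1,0,0,3,0,0,2,0,0,0,0,0,2,0,0,0,0,
           0,0,0,0,0,0,0,0,6,1,11,1,1,0,0,0,2)

# Negative log-likelihood of a ZIP model.
# par[1] = log(mu), par[2] = logit(sigma), so the optimiser is unconstrained.
zip_nll <- function(par, x) {
  mu    <- exp(par[1])      # mean of the Poisson component
  sigma <- plogis(par[2])   # probability of a structural zero
  ll_zero <- log(sigma + (1 - sigma) * exp(-mu))
  ll_pos  <- log(1 - sigma) + dpois(x[x > 0], mu, log = TRUE)
  -(sum(x == 0) * ll_zero + sum(ll_pos))
}

# Maximum likelihood fit (Nelder-Mead, same starting values as above)
opt       <- optim(c(log(2), qlogis(0.5)), zip_nll, x = i.vec)
mu_hat    <- exp(opt$par[1])
sigma_hat <- plogis(opt$par[2])

# ZIP cumulative distribution function
zip_cdf <- function(q, mu, sigma) {
  ifelse(q < 0, 0, sigma + (1 - sigma) * ppois(q, mu))
}

# Pool counts into cells 0, 1-5, 6-10, 11+ (an arbitrary choice for this sketch)
breaks   <- c(-1, 0, 5, 10, Inf)
observed <- as.vector(table(cut(i.vec, breaks)))
expected <- length(i.vec) * diff(zip_cdf(breaks, mu_hat, sigma_hat))

# Pearson chi-square; df = cells - 1 - number of estimated parameters
chi_sq  <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1 - 2
p_value <- pchisq(chi_sq, df, lower.tail = FALSE)

print(c(mu = mu_hat, sigma = sigma_hat, chi_sq = chi_sq, df = df, p_value = p_value))
```

A small p-value here means the observed cell counts differ from what the fitted ZIP predicts, which agrees with the conclusion above. The fitted Poisson component has to account for both the many small counts and a few very large ones (44 and 63), and a single Poisson mean struggles to do both.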

Ramnath answered Oct 12 '22 23:10