My goal is approximate the distribution of a sum of binomial variables. I use the following paper The Distribution of a Sum of Binomial Random Variables by Ken Butler and Michael Stephens.
I want to write an R script to find Pearson approximation to the sum of binomials. There is an R-package PearsonDS that allows do this in a simple way.
So I take the first example from the paper and try to find density of the Pearson distribution for this case. Finally i get an error message "There are no probability distributions with these moments".
Could you please explain me what's wrong in the below code?
library(PearsonDS)
# define parameters for five binomial random varibles
n<-rep(5,5)
p<-seq(0.02,0.10,0.02)
# find the first four cumulants
k.1<-sum(n*p)
k.2<-sum(n*p*(1-p))
k.3<-sum(n*p*(1-p)*(1-2*p))
k.4<-sum(n*p*(1-p)*(1-6*p*(1-p)))
# find the skewness and kurtosis parameters
beta.1<-k.3^2/k.2^3
beta.2<-k.4/k.2^2
# define the moments and calculate
moments <- c(mean=k.1,variance=k.2,skewness=sqrt(beta.1),kurtosis=beta.2)
dpearson(1:7,moments=moments)
I get the error message "There are no probability distributions with these moments".
What you try to insert as kurtosis in your moments, is actually the excess kurtosis, which is just kurtosis - 3
. From the help-page of dpearson()
:
moments:
optional vector/list of mean, variance, skewness, kurtosis (not excess kurtosis).
So adding 3 to beta.2
will provide you with the real kurtosis:
beta.1 <- (k.3^2)/(k.2^3)
beta.2 <- k.4/(k.2^2)
kurt <- beta.2 + 3
moments <- c(mean = k.1, variance = k.2, skewness = beta.1, kurtosis = kurt)
dpearson(1:7, moments=moments)
# [1] 0.3438773545 0.2788412385 0.1295129534 0.0411140817 0.0099279576
# [6] 0.0019551512 0.0003294087
To get a result like the one in the paper, we should investigate the cumulative distribution function and add 0.5 to correct for the bias caused by approximating a discrete distribution by a continuous one:
ppearson(1:7+0.5, moments = moments)
# [1] 0.5348017 0.8104394 0.9430092 0.9865434 0.9973715 0.9995578 0.9999339
The function threw an error because the relationship between kurtosis and skewness wasn't invalid: kurtosis is lower-bounded by the skewness in the following way: kurtosis >= (skewness)^2 - 1
. The proof ain't pretty and is certainly beyond the scope of the question, but you can check out the references below if you like for different versions of this inequality.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With