Is it possible to generate distributions in R for which the Mean, SD, skew and kurtosis are known? So far it appears the best route would be to create random numbers and transform them accordingly. If there is a package tailored to generating specific distributions which could be adapted, I have not yet found it. Thanks
Base R does not include a function for skewness, so you need a package such as “moments” to get one. Skewness is a commonly used measure of the asymmetry of a statistical distribution.
In e1071, the skewness and kurtosis functions take a type argument; type=2 selects the bias-adjusted estimators used by SAS and SPSS (the versions in the moments package have no type argument). Here we can see that the skewness for the Growth variable is 1.59, indicating a positively skewed distribution.
The default algorithm of the function kurtosis in e1071 is based on the formula $g_2 = m_4 / s^4 - 3$, where $m_4$ and $s$ are the fourth central moment and the sample standard deviation, respectively.
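As a cross-check that does not depend on either package, the moment-based estimators can be written directly in base R. This is only a minimal sketch: the helper names `sample_skewness` and `sample_excess_kurtosis` are made up for illustration, and the code follows the m4 / s^4 - 3 form quoted above, with s taken to be the usual n - 1 standard deviation.

```r
# Hypothetical helpers (not part of moments or e1071): moment-based
# estimators that divide by powers of the sample standard deviation.
sample_skewness <- function(x) {
  m3 <- mean((x - mean(x))^3)  # third central moment
  m3 / sd(x)^3
}

sample_excess_kurtosis <- function(x) {
  m4 <- mean((x - mean(x))^4)  # fourth central moment
  m4 / sd(x)^4 - 3             # subtract 3 so a normal distribution scores 0
}

set.seed(1)
x <- rexp(10000)  # exponential: theoretical skewness 2, excess kurtosis 6
sample_skewness(x)
sample_excess_kurtosis(x)
```

The estimates for the exponential sample should land near the theoretical values, though the kurtosis estimate in particular is noisy even at this sample size.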
There is a Johnson distribution in the SuppDists package. Johnson will give you a distribution that matches either moments or quantiles. Other commenters are correct that 4 moments does not a distribution make. But Johnson will certainly try.
Here's an example of fitting a Johnson to some sample data:
    require(SuppDists)

    ## make a weird dist with Kurtosis and Skew
    a <- rnorm( 5000, 0, 2 )
    b <- rnorm( 1000, -2, 4 )
    c <- rnorm( 3000, 4, 4 )
    babyGotKurtosis <- c( a, b, c )
    hist( babyGotKurtosis , freq=FALSE)

    ## Fit a Johnson distribution to the data
    ## TODO: Insert Johnson joke here
    parms<-JohnsonFit(babyGotKurtosis, moment="find")

    ## Print out the parameters
    sJohnson(parms)

    ## add the Johnson function to the histogram
    plot(function(x)dJohnson(x,parms), -20, 20, add=TRUE, col="red")
The final plot looks like this:
You can see a bit of the issue that others point out about how 4 moments do not fully capture a distribution.
Good luck!
EDIT: As Hadley pointed out in the comments, the Johnson fit looks off. I did a quick test and fit the Johnson distribution using moment="quant", which fits the Johnson distribution using 5 quantiles instead of the 4 moments. The results look much better:
    parms<-JohnsonFit(babyGotKurtosis, moment="quant")
    plot(function(x)dJohnson(x,parms), -20, 20, add=TRUE, col="red")
Which produces the following:
Anyone have any ideas why Johnson seems biased when fit using moments?
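Coming back to the original question, where the four moments are known up front rather than estimated from data, the same package can be given the moments directly. The sketch below assumes that `JohnsonFit(t, moment="use")` takes `t` as c(mean, m2, m3, m4), with m2 to m4 being central moments, which is how the SuppDists documentation appears to describe it. The target values are made up for illustration. The code is wrapped in a `requireNamespace` check so it does nothing if SuppDists is not installed.

```r
# Target moments (illustrative values, not from the original question)
mu <- 10; sigma <- 2; skew <- 0.5; kurt <- 4   # kurt is non-excess (normal = 3)

# Convert standardised skew/kurtosis to central moments: c(mean, m2, m3, m4)
target <- c(mu, sigma^2, skew * sigma^3, kurt * sigma^4)

if (requireNamespace("SuppDists", quietly = TRUE)) {
  # Assumption: moment = "use" treats `target` as c(mean, m2, m3, m4)
  parms <- SuppDists::JohnsonFit(target, moment = "use")
  x <- SuppDists::rJohnson(10000, parms)   # random draws from the fitted curve
  print(SuppDists::sJohnson(parms))        # summary of the fitted distribution
  print(c(mean = mean(x), sd = sd(x)))     # sample values, for comparison
}
```

Comparing the `sJohnson` summary against the targets is a quick way to see whether the fit reproduced the requested moments before relying on the draws.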
This is an interesting question, which doesn't really have a good solution. I presume that even though you don't know the other moments, you have an idea of what the distribution should look like. For example, it's unimodal.
There are a few different ways of tackling this problem:
Assume an underlying distribution and match moments. There are many standard R packages for doing this. One downside is that the multivariate generalisation may be unclear.
Saddlepoint approximations. In this paper:
Gillespie, C.S. and Renshaw, E. An improved saddlepoint approximation. Mathematical Biosciences, 2007.
We look at recovering a pdf/pmf when given only the first few moments. We found that this approach works when the skewness isn't too large.
Laguerre expansions:
Mustapha, H. and Dimitrakopoulos, R. Generalized Laguerre expansions of multivariate probability densities with moments. Computers & Mathematics with Applications, 2010.
The results in this paper seem more promising, but I haven't coded them up.
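The first approach, assuming an underlying family and matching moments, can be sketched in base R with a shifted gamma distribution. The choice of family and the function name `r_shifted_gamma` are assumptions made for illustration, not something taken from the answer. A shifted gamma has only three free parameters, so this matches the mean, SD and skewness, but the kurtosis is then fixed at 3 + 1.5 * skew^2 rather than chosen freely.

```r
# Hypothetical helper: draw from a shifted (and possibly reflected) gamma
# whose mean, SD and skewness match the requested values.
r_shifted_gamma <- function(n, mu, sigma, skew) {
  stopifnot(sigma > 0, skew != 0)
  shape <- 4 / skew^2            # gamma skewness is 2 / sqrt(shape)
  scale <- sigma / sqrt(shape)   # gamma variance is shape * scale^2
  z <- rgamma(n, shape = shape, scale = scale) - shape * scale  # centred at 0
  mu + sign(skew) * z            # reflect for negative skew, then shift
}

set.seed(42)
x <- r_shifted_gamma(1e5, mu = 10, sigma = 2, skew = 1)
c(mean = mean(x), sd = sd(x),
  skew = mean((x - mean(x))^3) / sd(x)^3)
```

If the kurtosis also has to be matched, a four-parameter family such as the Johnson system is needed instead.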