 

How to generate a distribution with a given mean, variance, skew and kurtosis in Python?

random.gauss(mu, sigma)

The function above draws a random number from a normal distribution with a given mean and standard deviation. But how can we draw values from a distribution that is defined by more than just its first two moments?

Something like:

random.gauss(mu, sigma, skew, kurtosis)

Remi.b asked Oct 26 '13 07:10




2 Answers

How about using scipy? You can pick the distribution you want from continuous distributions in the scipy.stats library.

The generalized gamma distribution has non-zero skew and kurtosis, but it takes a little work to figure out which parameters give a particular mean, variance, skew and kurtosis. Here's some code to get you started.

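import scipy.stats
import matplotlib.pyplot as plt

# Frozen normal distribution: loc is the mean, scale is the standard deviation
distribution = scipy.stats.norm(loc=100, scale=5)
sample = distribution.rvs(size=10000)
plt.hist(sample)
plt.show()
# 'mvsk' requests mean, variance, skew and (excess) kurtosis
print(distribution.stats('mvsk'))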

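This displays a histogram of a 10,000-element sample from a normal distribution with mean 100 and variance 25, and prints the distribution's statistics. The kurtosis reported is the excess (Fisher) kurtosis, which is why it is 0 for a normal distribution: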

(array(100.0), array(25.0), array(0.0), array(0.0))

Replacing the normal distribution with the generalized gamma distribution,

distribution = scipy.stats.gengamma(100, 70, loc=50, scale=10)

you get the statistics [mean, variance, skew, kurtosis] (array(60.67925117494595), array(0.00023388203873597746), array(-0.09588807605341435), array(-0.028177799805207737)).
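The "little work" of finding parameters can be handed to a numerical solver. The sketch below is one possible approach, not the only one: it swaps in scipy.stats.johnsonsu (instead of the generalized gamma) because it has two shape parameters that control skew and kurtosis, solves for them with scipy.optimize.least_squares, and then sets loc and scale to hit the mean and variance. The helper name johnsonsu_from_moments, the starting guess, the bounds and the target values are illustrative assumptions. Not every skew/kurtosis pair is reachable with this family, so the result should be checked with stats('mvsk').

```python
import numpy as np
from scipy import optimize, stats


def johnsonsu_from_moments(mean, variance, skew, excess_kurtosis):
    """Hypothetical helper (not part of SciPy): frozen Johnson SU
    distribution whose first four moments match the targets."""

    def residuals(shape):
        a, b = shape
        # Skew and excess kurtosis depend only on the shape parameters.
        s, k = stats.johnsonsu.stats(a, b, moments="sk")
        return [float(s) - skew, float(k) - excess_kurtosis]

    # Starting guess and lower bound on b are assumptions for this sketch.
    fit = optimize.least_squares(
        residuals, x0=[0.0, 2.0], bounds=([-np.inf, 0.5], [np.inf, np.inf])
    )
    a, b = fit.x

    # Mean and variance of the standard form (loc=0, scale=1),
    # then rescale and shift to reach the requested mean and variance.
    m0, v0 = stats.johnsonsu.stats(a, b, moments="mv")
    scale = np.sqrt(variance / float(v0))
    loc = mean - scale * float(m0)
    return stats.johnsonsu(a, b, loc=loc, scale=scale)


dist = johnsonsu_from_moments(mean=100.0, variance=25.0,
                              skew=0.5, excess_kurtosis=1.0)
print(dist.stats(moments="mvsk"))  # should be close to the four targets
sample = dist.rvs(size=10000, random_state=0)
```

The returned object is an ordinary frozen distribution, so sample = dist.rvs(...) plays the role of the random.gauss(mu, sigma, skew, kurtosis) call asked for in the question.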

Bennett Brown answered Sep 28 '22 04:09


Try this:

http://statsmodels.sourceforge.net/devel/generated/statsmodels.sandbox.distributions.extras.pdf_mvsk.html#statsmodels.sandbox.distributions.extras.pdf_mvsk

Return the Gaussian expanded pdf function given the list of 1st, 2nd moment and skew and Fisher (excess) kurtosis.

Parameters : mvsk : list of mu, mc2, skew, kurt

Looks good to me. There's a link to the source on that page.

Oh, and here's the other StackOverflow question that pointed me there: Apply kurtosis to a distribution in python
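For readers who want to see what a "Gaussian expanded pdf" looks like, here is a minimal NumPy sketch of a Gram-Charlier (type A) expansion truncated after the kurtosis term. It is an illustration written for this page, not the statsmodels source; the function name gram_charlier_pdf and the example values are assumptions. Note that it returns a density, not samples, and that the expansion can go negative for large skew or kurtosis, so it is only a valid pdf for moderate values.

```python
import numpy as np


def gram_charlier_pdf(mean, variance, skew, excess_kurtosis):
    """Hypothetical helper: pdf(x) of a normal density multiplied by
    Hermite-polynomial corrections for skew and excess kurtosis."""
    sigma = np.sqrt(variance)

    def pdf(x):
        z = (np.asarray(x, dtype=float) - mean) / sigma
        he3 = z**3 - 3.0 * z             # probabilists' Hermite polynomial He3
        he4 = z**4 - 6.0 * z**2 + 3.0    # probabilists' Hermite polynomial He4
        normal = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)
        correction = 1.0 + skew / 6.0 * he3 + excess_kurtosis / 24.0 * he4
        return normal * correction

    return pdf


pdf = gram_charlier_pdf(mean=100.0, variance=25.0,
                        skew=0.3, excess_kurtosis=0.5)

# Check the four moments with a simple Riemann sum on a wide, fine grid.
x = np.linspace(40.0, 160.0, 200001)
w = pdf(x) * (x[1] - x[0])
m = np.sum(x * w)
v = np.sum((x - m) ** 2 * w)
s = np.sum((x - m) ** 3 * w) / v**1.5
k = np.sum((x - m) ** 4 * w) / v**2 - 3.0
print(m, v, s, k)  # should reproduce the four inputs
```

Because the Hermite polynomials are orthogonal under the normal density, the truncated expansion reproduces the requested mean, variance, skew and excess kurtosis exactly, which is what the numerical check confirms.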

sdanzig answered Sep 28 '22 05:09