random.gauss(mu, sigma)
Above is a function that draws a random number from a normal distribution with a given mean and standard deviation. But how can we draw values from a distribution specified by more than just its first two moments?
Something like:
random.gauss(mu, sigma, skew, kurtosis)
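To calculate the sample skewness and sample kurtosis of a dataset, we can use the skew() and kurtosis() functions from the SciPy stats library (scipy.stats) with the following syntax: skew(array of values, bias=False) and kurtosis(array of values, bias=False). Note that kurtosis() returns excess (Fisher) kurtosis by default, so a normal distribution scores 0 rather than 3.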
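As a minimal sketch of those two calls (the exponential sample is made up purely for illustration, and the variable names are my own):

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Hypothetical data: a right-skewed sample, used only for illustration.
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=5000)

# bias=False applies the small-sample bias correction to both statistics.
sample_skew = skew(data, bias=False)

# kurtosis() reports excess (Fisher) kurtosis by default,
# so a normal sample would give a value near 0, not 3.
sample_kurt = kurtosis(data, bias=False)

print(sample_skew, sample_kurt)
```

For an exponential sample both numbers should come out clearly positive, since that distribution is right-skewed and heavy-tailed relative to the normal.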
The normal distribution has a skewness of zero and a kurtosis of three. A moment-based normality test (the description here matches the Jarque-Bera test) is built on how far the data's skewness is from zero and its kurtosis is from three. At the conventional 5% significance level, the test rejects the hypothesis of normality when the p-value is less than or equal to 0.05.
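The text does not name the test, so the sketch below assumes it is the Jarque-Bera test, which SciPy exposes as scipy.stats.jarque_bera. The two samples are invented for illustration.

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(1)

# Hypothetical samples: one drawn from a normal, one clearly skewed.
samples = {
    "normal": rng.normal(loc=100, scale=5, size=5000),
    "exponential": rng.exponential(scale=5, size=5000),
}

p_values = {}
for name, sample in samples.items():
    # The result unpacks into the test statistic and its p-value.
    statistic, p_value = jarque_bera(sample)
    p_values[name] = p_value
    verdict = "reject normality" if p_value <= 0.05 else "cannot reject normality"
    print(f"{name}: statistic={statistic:.2f}, p={p_value:.4f} -> {verdict}")
```

The exponential sample should be rejected decisively. The normal sample will usually pass, although any single draw has a 5% chance of a false rejection at this threshold.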
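A general guideline for skewness is that a value greater than +1 or lower than –1 indicates a substantially skewed distribution. For kurtosis, the general guideline is that an excess kurtosis (kurtosis minus three) greater than +1 means the distribution is too peaked.
A simple approximation given in many textbooks is Pearson's second skewness coefficient: Skew = 3 * (Mean – Median) / Standard Deviation. It is not the same quantity as the moment-based skewness that skew() computes, so the two numbers will generally differ.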
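The textbook formula is easy to compute directly. Here is a small sketch; the function name and the data are hypothetical, and I am assuming the sample standard deviation (ddof=1) is the one intended.

```python
import numpy as np

def pearson_median_skewness(values):
    """Skew = 3 * (mean - median) / standard deviation."""
    values = np.asarray(values, dtype=float)
    # ddof=1 gives the sample standard deviation (an assumption on my part).
    return 3.0 * (values.mean() - np.median(values)) / values.std(ddof=1)

# Made-up data with a long right tail: the mean is pulled above the median.
right_tailed = [2, 3, 3, 4, 4, 4, 5, 9, 15]
# Symmetric data: mean equals median, so the coefficient is zero.
symmetric = [1, 2, 3, 4, 5]

print(pearson_median_skewness(right_tailed))
print(pearson_median_skewness(symmetric))
```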
How about using scipy? You can pick the distribution you want from continuous distributions in the scipy.stats library.
The generalized gamma distribution has non-zero skew and kurtosis, but you'll have a little work to do to figure out which parameters give a particular mean, variance, skew and kurtosis. Here's some code to get you started.
import scipy.stats
import matplotlib.pyplot as plt
distribution = scipy.stats.norm(loc=100,scale=5)
sample = distribution.rvs(size=10000)
plt.hist(sample)
plt.show()
print(distribution.stats(moments='mvsk'))
This displays a histogram of a 10,000-element sample from a normal distribution with mean 100 and variance 25, and prints the distribution's statistics (mean, variance, skew, excess kurtosis):
(array(100.0), array(25.0), array(0.0), array(0.0))
Replacing the normal distribution with the generalized gamma distribution,
distribution = scipy.stats.gengamma(100, 70, loc=50, scale=10)
you get the statistics [mean, variance, skew, kurtosis]:
(array(60.67925117494595), array(0.00023388203873597746), array(-0.09588807605341435), array(-0.028177799805207737))
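If you only need to hit a given mean, standard deviation and skew, one shortcut is a different family from the same library: scipy.stats.pearson3 is parameterized directly by those three quantities. This is a swapped-in alternative, not the generalized gamma approach above, and it does not let you choose the kurtosis, which is fixed by the skew (1.5 * skew**2 in excess terms). The target values are just examples.

```python
import scipy.stats

# Example targets (arbitrary values chosen for illustration).
target_mean, target_std, target_skew = 100.0, 5.0, 0.8

# pearson3 takes the skew as its shape parameter;
# loc is the mean and scale is the standard deviation.
distribution = scipy.stats.pearson3(target_skew, loc=target_mean, scale=target_std)

# Draw a reproducible sample.
sample = distribution.rvs(size=10000, random_state=0)

mean, variance, skewness, excess_kurtosis = distribution.stats(moments='mvsk')
print(mean, variance, skewness, excess_kurtosis)
```

Getting all four moments at once needs a family with two shape parameters, or an expansion-based density.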
Try this:
http://statsmodels.sourceforge.net/devel/generated/statsmodels.sandbox.distributions.extras.pdf_mvsk.html#statsmodels.sandbox.distributions.extras.pdf_mvsk
Return the Gaussian expanded pdf function given the list of 1st, 2nd moment and skew and Fisher (excess) kurtosis.
Parameters : mvsk : list of mu, mc2, skew, kurt
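Note that this gives you a density, not a sampler. Below is a rough sketch of how one might draw values from such a density by numerically inverting its CDF on a grid. To keep it self-contained it does not call statsmodels; instead it writes out a Gram-Charlier type A expansion by hand, which I am assuming is close to what pdf_mvsk does but may differ in detail. The function names, grid settings and target moments are all my own choices. The expansion can go negative for large skew or kurtosis, so the sketch clips it at zero, and in that case the requested moments are only matched approximately.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def gram_charlier_pdf(x, mu, sigma, skewness, excess_kurtosis):
    """Gaussian density times a correction built from Hermite polynomials."""
    z = (x - mu) / sigma
    gaussian = np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))
    he3 = z**3 - 3.0 * z               # third probabilists' Hermite polynomial
    he4 = z**4 - 6.0 * z**2 + 3.0      # fourth probabilists' Hermite polynomial
    return gaussian * (1.0 + skewness / 6.0 * he3 + excess_kurtosis / 24.0 * he4)

def sample_gram_charlier(mu, sigma, skewness, excess_kurtosis, size, rng):
    """Inverse-CDF sampling on a fixed grid covering mu +/- 6 sigma."""
    grid = np.linspace(mu - 6.0 * sigma, mu + 6.0 * sigma, 4001)
    density = gram_charlier_pdf(grid, mu, sigma, skewness, excess_kurtosis)
    density = np.clip(density, 0.0, None)   # guard against negative lobes
    cdf = np.cumsum(density)
    cdf /= cdf[-1]                           # normalize so the CDF ends at 1
    # Map uniform draws through the tabulated inverse CDF.
    return np.interp(rng.uniform(size=size), cdf, grid)

rng = np.random.default_rng(0)
draws = sample_gram_charlier(100.0, 5.0, 0.3, 0.5, 200_000, rng)

# Sample moments should land near the requested 100, 25, 0.3 and 0.5.
print(draws.mean(), draws.var(), skew(draws), kurtosis(draws))
```

A grid-based inverse CDF is simple and fast for one-dimensional densities. Rejection sampling would work too, but needs an envelope that bounds the expansion.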
Looks good to me. There's a link to the source on that page.
Oh, and here's the other StackOverflow question that pointed me there: Apply kurtosis to a distribution in python