Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using python scipy to fit gamma distribution to data

I want to fit a gamma distribution to my data, which I do using this

import scipy.stats as ss
import scipy as sp
import numpy as np
import os
import matplotlib.pyplot as plt

alpha = []
beta = []
loc = []

data = np.loadtxt(data)
fit_alpha, fit_loc, fit_beta = ss.gamma.fit(data, floc=0, fscale=1)

I want to keep one of the parameters to the gamma distribution as a variable (say the shape), and fix one of the parameters (say scale=1). However, if I keep the loc variable as zero, I am not able to fix the scale at one. Is there some workaround for this? Can I not parametrize the gamma distribution using only the shape and scale?

like image 375
Sahil M Avatar asked Sep 09 '13 16:09

Sahil M


People also ask

How do you fit gamma distribution to data?

To fit the gamma distribution to data and find parameter estimates, use gamfit , fitdist , or mle . Unlike gamfit and mle , which return parameter estimates, fitdist returns the fitted probability distribution object GammaDistribution . The object properties a and b store the parameter estimates.

What is gamma in Scipy?

A gamma continuous random variable. As an instance of the rv_continuous class, gamma object inherits from it a collection of generic methods (see below for the full list), and completes them with details specific for this particular distribution. See also erlang , expon. Notes.


2 Answers

In a comment I said you have run into a bug in the gamma distribution--it does not let you fix both the location and the scale. The bug was fixed in scipy 0.13, but if you can't upgrade, you can work around the bug by using the fit method of the class rv_continuous, which is the parent class of gamma:

In [22]: from scipy.stats import rv_continuous, gamma

In [23]: x = gamma.rvs(2.5, loc=0, scale=4, size=1000)  # A test sample.

In [24]: rv_continuous.fit(gamma, x, floc=0, fscale=4)
Out[24]: (2.5335837650122608, 0, 4)
like image 173
Warren Weckesser Avatar answered Oct 20 '22 09:10

Warren Weckesser


Looking at the implementation of gamma.fit:

def fit(self, data, *args, **kwds):
    floc = kwds.get('floc', None)
    if floc == 0:
        xbar = ravel(data).mean()
        logx_bar = ravel(log(data)).mean()
        s = log(xbar) - logx_bar
        def func(a):
            return log(a) - special.digamma(a) - s
        aest = (3-s + math.sqrt((s-3)**2 + 24*s)) / (12*s)
        xa = aest*(1-0.4)
        xb = aest*(1+0.4)
        a = optimize.brentq(func, xa, xb, disp=0)
        scale = xbar / a
        return a, floc, scale
    else:
        return super(gamma_gen, self).fit(data, *args, **kwds)

If you put floc=None, it will call the fit function of the parent class (which is rv_continuous) and you can fix the scale.

like image 27
jr-be Avatar answered Oct 20 '22 10:10

jr-be