
Fit data to all possible distributions and return the best fit [closed]

I have sample data and I want to find the best-fitting distribution. I have found a couple of links suggesting that I can import the distributions from scipy.stats, but I do not know the type of distribution beforehand. I want something similar to allfitdist() in MATLAB, which tries to fit the data to around 20 distributions and returns the best fit.

Link for allfitdist(): http://www.mathworks.in/matlabcentral/fileexchange/34943-fit-all-valid-parametric-probability-distributions-to-data

Any help is highly appreciated. Thanks.

mvsrs asked Feb 07 '14 09:02



1 Answer

You can build a list of the distributions available in scipy.stats, fit each one to the data, and compare the fits by their likelihood. An example with two distributions and random data:

import numpy as np
import scipy.stats as st

data = np.random.random(10000)
distributions = [st.laplace, st.norm]
nlls = []

for distribution in distributions:
    # fit() returns the maximum-likelihood parameter estimates
    pars = distribution.fit(data)
    # nnlf() is the negative log-likelihood at those parameters (lower is better)
    nll = distribution.nnlf(pars, data)
    nlls.append(nll)

results = [(distribution.name, nll) for distribution, nll in zip(distributions, nlls)]
best_fit = min(results, key=lambda d: d[1])
print('Best fit reached using {}, negative log-likelihood: {}'.format(best_fit[0], best_fit[1]))
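
To get closer to what allfitdist() does, the same loop can be run over every continuous distribution that scipy.stats exposes. The following is only a sketch under a few assumptions of mine, not part of the original answer: the helper names `continuous_distributions` and `rank_fits` are made up for illustration, distributions that raise or return a non-finite likelihood are simply skipped, and candidates are ranked by AIC (2k + 2 * negative log-likelihood, with k the number of fitted parameters) rather than by raw likelihood, so that distributions with more parameters are not favoured automatically.

```python
import warnings

import numpy as np
import scipy.stats as st


def continuous_distributions():
    """Collect every rv_continuous instance found in the scipy.stats namespace."""
    found = []
    for name in dir(st):
        obj = getattr(st, name)
        if isinstance(obj, st.rv_continuous):
            found.append(obj)
    return found


def rank_fits(data, distributions):
    """Fit each distribution by maximum likelihood.

    Returns a list of (name, negative log-likelihood, AIC), sorted by AIC
    so the first entry is the preferred candidate.
    """
    ranked = []
    for distribution in distributions:
        try:
            with warnings.catch_warnings():
                warnings.simplefilter("ignore")  # fits can be noisy
                params = distribution.fit(data)
                nll = distribution.nnlf(params, data)
        except Exception:
            continue  # assumption: skip anything that cannot be fitted
        if not np.isfinite(nll):
            continue  # data falls outside the fitted support
        aic = 2 * len(params) + 2 * nll
        ranked.append((distribution.name, nll, aic))
    return sorted(ranked, key=lambda row: row[2])


# Small demonstration on synthetic normal data with a short candidate list.
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=2000)
candidates = [st.norm, st.laplace, st.gumbel_r, st.expon]

for name, nll, aic in rank_fits(sample, candidates):
    print("{:10s} NLL={:.1f} AIC={:.1f}".format(name, nll, aic))
```

Passing `continuous_distributions()` instead of `candidates` scans everything, but expect it to be slow: some distributions take a long time to fit, so trimming the list to plausible candidates is usually worthwhile. A low AIC only says a distribution fits better than the others tried, so it is still worth checking the winner against a histogram or a goodness-of-fit test such as `st.kstest`.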
Martin answered Oct 22 '22 10:10
