I have a sample data and I want to get the best fit distribution. I have got couple of links which suggest that I can import the distributions from scipy.stats
, but then I am not aware of the type of data before hand. I want something similar to allfitdist()
in MATLAB
which tries to fit data to around 20 distributions and returns the best fit.
Link for allfitdist()
: http://www.mathworks.in/matlabcentral/fileexchange/34943-fit-all-valid-parametric-probability-distributions-to-data
Any help is highly appreciable. Thanks.
You can just create a list of all available distributions in scipy. An example with two distributions and random data:
import numpy as np
import scipy.stats as st
data = np.random.random(10000)
distributions = [st.laplace, st.norm]
mles = []
for distribution in distributions:
pars = distribution.fit(data)
mle = distribution.nnlf(pars, data)
mles.append(mle)
results = [(distribution.name, mle) for distribution, mle in zip(distributions, mles)]
best_fit = sorted(zip(distributions, mles), key=lambda d: d[1])[0]
print 'Best fit reached using {}, MLE value: {}'.format(best_fit[0].name, best_fit[1])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With