I've been surfing but haven't found the correct method to do the following.
I have a histogram done with matplotlib:
hist, bins, patches = plt.hist(distance, bins=100, normed='True')
From the plot, I can see that the distribution is more or less an exponential (Poisson distribution). How can I do the best fitting, taking into account my hist and bins arrays?
UPDATE
I am using the following approach:
x = np.float64(bins) # Had some troubles with data types float128 and float64
hist = np.float64(hist)
myexp=lambda x,l,A:A*np.exp(-l*x)
popt,pcov=opt.curve_fit(myexp,(x[1:]+x[:-1])/2,hist)
But I get
---> 41 plt.plot(stats.expon.pdf(np.arange(len(hist)),popt),'-')
ValueError: operands could not be broadcast together with shapes (100,) (2,)
To fit a histogram with a predefined function, simply pass the name of the function in the first parameter of TH1::Fit . For example, this line fits histogram object hist with a Gaussian.
plt. hist() method is used multiple times to create a figure of three overlapping histograms. we adjust opacity, color, and number of bins as needed. Three different columns from the data frame are taken as data for the histograms.
What you described is a form of exponential distribution, and you want to estimate the parameters of the exponential distribution, given the probability density observed in your data. Instead of using non-linear regression method (which assumes the residue errors are Gaussian distributed), one correct way is arguably a MLE (maximum likelihood estimation).
scipy
provides a large number of continuous distributions in its stats
library, and the MLE is implemented with the .fit()
method. Of course, exponential distribution is there:
In [1]:
import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
#generate data
X = ss.expon.rvs(loc=0.5, scale=1.2, size=1000)
#MLE
P = ss.expon.fit(X)
print P
(0.50046056920696858, 1.1442947648425439)
#not exactly 0.5 and 1.2, due to being a finite sample
In [3]:
#plotting
rX = np.linspace(0,10, 100)
rP = ss.expon.pdf(rX, *P)
#Yup, just unpack P with *P, instead of scale=XX and shape=XX, etc.
In [4]:
#need to plot the normalized histogram with `normed=True`
plt.hist(X, normed=True)
plt.plot(rX, rP)
Out[4]:
Your distance
will replace X
here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With