I am having some trouble to fit a gaussian to data. I think the problem is that most of the elements are close to zero, and there not many points to actually be fitted. But in any case, I think they make a good dataset to fit, and I don't get what is confussing python. Here is the program, I have also added a line to plot the data so you can see what I am trying to fit
#Gaussian function
def gauss_function(x, a, x0, sigma):
return a*np.exp(-(x-x0)**2/(2*sigma**2))
# program
from scipy.optimize import curve_fit
x = np.arange(0,21.,0.2)
# sorry about these data!
y = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.2888599818864958e-275, 1.0099964933708256e-225, 4.9869496866403137e-184, 4.4182929795060327e-149, 7.2953754336628778e-120, 1.6214815763354974e-95, 2.5845990267696154e-75, 1.2195550372375896e-58, 5.6756631456872126e-45, 7.2520963306599953e-34, 6.0926453402093181e-25, 7.1075523112494745e-18, 2.1895584709541657e-12, 3.1040093615952226e-08, 3.2818874974043519e-05, 0.0039462011337049593, 0.077653596114448178, 0.33645159419151383, 0.40139213808285212, 0.15616093582013874, 0.0228751827752081, 0.0014423440677009125, 4.4400754532288282e-05, 7.4939123408714068e-07, 7.698340466102054e-09, 5.2805658851032628e-11, 2.6233358880470556e-13, 1.0131613609937094e-15, 3.234727006243684e-18, 9.0031014316344088e-21, 2.2867065482392331e-23, 5.5126221075296919e-26, 1.3045106781768978e-28, 3.1185031969890313e-31, 7.7170036365830092e-34, 2.0179753504732056e-36, 5.6739187799428708e-39, 1.7403776988666581e-41, 5.8939645426573027e-44, 2.2255784749636281e-46, 9.4448944519959299e-49, 4.5331936383388069e-51, 2.4727435506007072e-53, 1.5385048936078214e-55, 1.094651071873419e-57, 8.9211199390945735e-60, 8.3347561634783632e-62, 8.928140776588251e-64, 1.0960564546383266e-65, 1.5406342485015278e-67, 2.4760905399114866e-69, 4.5423744881977258e-71, 9.4921949220625905e-73, 2.2543765002199549e-74, 6.0698995872666723e-76, 1.8478996852922248e-77, 6.3431644488676084e-79, 0.0, 0.0, 0.0, 0.0]
plot(x,y) #Plot the curve, the gaussian is quite clear
plot(x,y,'ok') #Overplot the dots
# Try to fit the result
popt, pcov = curve_fit(gauss_function, x, y)
The problem is that the results for popt is
print popt
array([ 7.39717176e-10, 1.00000000e+00, 1.00000000e+00])
Any hint on why this could be happening?
Thanks!
Returns poptarray. Optimal values for the parameters so that the sum of the squared residuals of f(xdata, *popt) - ydata is minimized. pcov2-D array. The estimated covariance of popt. The diagonals provide the variance of the parameter estimate.
The SciPy open source library provides the curve_fit() function for curve fitting via nonlinear least squares. The function takes the same input and output data as arguments, as well as the name of the mapping function to use. The mapping function must take examples of input data and some number of arguments.
Your problem is with the initial parameters of the curve_fit. By default, if no other information is given, it will start with an array of 1, but this obviously lead to a radically wrong result. This can be corrected simply by giving a reasonable starting vector. To do this, I start from the estimated mean and standard deviation of your dataset
#estimate mean and standard deviation
meam = sum(x * y)
sigma = sum(y * (x - m)**2)
#do the fit!
popt, pcov = curve_fit(gauss_function, x, y, p0 = [1, mean, sigma])
#plot the fit results
plot(x,gauss_function(x, *popt))
#confront with the given data
plot(x,y,'ok')
This will perfectly approximate your results. Remember that curve fitting in general cannot work unless you start from a good point (inside the convergence basin, to be clear), and this doesn't depend on the implementation. Never do blind fit when you can use your knowledge!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With