Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

gaussian fit with scipy.optimize.curve_fit in python with wrong results

I am having some trouble to fit a gaussian to data. I think the problem is that most of the elements are close to zero, and there not many points to actually be fitted. But in any case, I think they make a good dataset to fit, and I don't get what is confussing python. Here is the program, I have also added a line to plot the data so you can see what I am trying to fit

#Gaussian function
def gauss_function(x, a, x0, sigma):
    return a*np.exp(-(x-x0)**2/(2*sigma**2))

# program
from scipy.optimize import curve_fit
x = np.arange(0,21.,0.2)
# sorry about these data!
y = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.2888599818864958e-275, 1.0099964933708256e-225, 4.9869496866403137e-184, 4.4182929795060327e-149, 7.2953754336628778e-120, 1.6214815763354974e-95, 2.5845990267696154e-75, 1.2195550372375896e-58, 5.6756631456872126e-45, 7.2520963306599953e-34, 6.0926453402093181e-25, 7.1075523112494745e-18, 2.1895584709541657e-12, 3.1040093615952226e-08, 3.2818874974043519e-05, 0.0039462011337049593, 0.077653596114448178, 0.33645159419151383, 0.40139213808285212, 0.15616093582013874, 0.0228751827752081, 0.0014423440677009125, 4.4400754532288282e-05, 7.4939123408714068e-07, 7.698340466102054e-09, 5.2805658851032628e-11, 2.6233358880470556e-13, 1.0131613609937094e-15, 3.234727006243684e-18, 9.0031014316344088e-21, 2.2867065482392331e-23, 5.5126221075296919e-26, 1.3045106781768978e-28, 3.1185031969890313e-31, 7.7170036365830092e-34, 2.0179753504732056e-36, 5.6739187799428708e-39, 1.7403776988666581e-41, 5.8939645426573027e-44, 2.2255784749636281e-46, 9.4448944519959299e-49, 4.5331936383388069e-51, 2.4727435506007072e-53, 1.5385048936078214e-55, 1.094651071873419e-57, 8.9211199390945735e-60, 8.3347561634783632e-62, 8.928140776588251e-64, 1.0960564546383266e-65, 1.5406342485015278e-67, 2.4760905399114866e-69, 4.5423744881977258e-71, 9.4921949220625905e-73, 2.2543765002199549e-74, 6.0698995872666723e-76, 1.8478996852922248e-77, 6.3431644488676084e-79, 0.0, 0.0, 0.0, 0.0]

plot(x,y) #Plot the curve, the gaussian is quite clear
plot(x,y,'ok') #Overplot the dots

# Try to fit the result
popt, pcov = curve_fit(gauss_function, x, y)

The problem is that the results for popt is

print popt
array([  7.39717176e-10,   1.00000000e+00,   1.00000000e+00])

Any hint on why this could be happening?

Thanks!

like image 561
Álvaro Avatar asked Jan 22 '13 13:01

Álvaro


People also ask

What does Scipy optimize curve_fit return?

Returns poptarray. Optimal values for the parameters so that the sum of the squared residuals of f(xdata, *popt) - ydata is minimized. pcov2-D array. The estimated covariance of popt. The diagonals provide the variance of the parameter estimate.

How does Scipy curve_fit work?

The SciPy open source library provides the curve_fit() function for curve fitting via nonlinear least squares. The function takes the same input and output data as arguments, as well as the name of the mapping function to use. The mapping function must take examples of input data and some number of arguments.


1 Answers

Your problem is with the initial parameters of the curve_fit. By default, if no other information is given, it will start with an array of 1, but this obviously lead to a radically wrong result. This can be corrected simply by giving a reasonable starting vector. To do this, I start from the estimated mean and standard deviation of your dataset

#estimate mean and standard deviation
meam = sum(x * y)
sigma = sum(y * (x - m)**2)
#do the fit!
popt, pcov = curve_fit(gauss_function, x, y, p0 = [1, mean, sigma])
#plot the fit results
plot(x,gauss_function(x, *popt))
#confront with the given data
plot(x,y,'ok')

This will perfectly approximate your results. Remember that curve fitting in general cannot work unless you start from a good point (inside the convergence basin, to be clear), and this doesn't depend on the implementation. Never do blind fit when you can use your knowledge!

like image 137
EnricoGiampieri Avatar answered Sep 28 '22 07:09

EnricoGiampieri