I have a data surface that I'm fitting using SciPy's leastsq
function.
I would like to have some estimate of the quality of the fit after leastsq
returns. I'd expected that this would be included as a return from the function, but, if so, it doesn't seem to be clearly documented.
Is there such a return or, barring that, some function I can pass my data and the returned parameter values and fit function to that will give me an estimate of fit quality (R^2 or some such)?
Thanks!
leastsq. Minimize the sum of squares of a set of equations. Should take at least one (possibly length N vector) argument and returns M floating point numbers.
The SciPy open source library provides the curve_fit() function for curve fitting via nonlinear least squares. The function takes the same input and output data as arguments, as well as the name of the mapping function to use. The mapping function must take examples of input data and some number of arguments.
What does popt and pcov mean? popt- An array of optimal values for the parameters which minimizes the sum of squares of residuals. pcov-2d array which contains the estimated covariance of popt. The diagonals provide the variance of the parameter estimate.
The SciPy API provides a 'curve_fit' function in its optimization library to fit the data with a given function. This method applies non-linear least squares to fit the data and extract the optimal parameters out of it.
If you call leastsq
like this:
import scipy.optimize
p,cov,infodict,mesg,ier = optimize.leastsq(
residuals,a_guess,args=(x,y),full_output=True)
where
def residuals(a,x,y):
return y-f(x,a)
then, using the definition of R^2
given here,
ss_err=(infodict['fvec']**2).sum()
ss_tot=((y-y.mean())**2).sum()
rsquared=1-(ss_err/ss_tot)
What is infodict['fvec']
you ask? It's the array of residuals:
In [48]: optimize.leastsq?
...
infodict -- a dictionary of optional outputs with the keys:
'fvec' : the function evaluated at the output
For example:
import scipy.optimize as optimize
import numpy as np
import collections
import matplotlib.pyplot as plt
x = np.array([821,576,473,377,326])
y = np.array([255,235,208,166,157])
def sigmoid(p,x):
x0,y0,c,k=p
y = c / (1 + np.exp(-k*(x-x0))) + y0
return y
def residuals(p,x,y):
return y - sigmoid(p,x)
Param=collections.namedtuple('Param','x0 y0 c k')
p_guess=Param(x0=600,y0=200,c=100,k=0.01)
p,cov,infodict,mesg,ier = optimize.leastsq(
residuals,p_guess,args=(x,y),full_output=True)
p=Param(*p)
xp = np.linspace(100, 1600, 1500)
print('''\
x0 = {p.x0}
y0 = {p.y0}
c = {p.c}
k = {p.k}
'''.format(p=p))
pxp=sigmoid(p,xp)
# You could compute the residuals this way:
resid=residuals(p,x,y)
print(resid)
# [ 0.76205302 -2.010142 2.60265297 -3.02849144 1.6739274 ]
# But you don't have to compute `resid` -- `infodict['fvec']` already
# contains the info.
print(infodict['fvec'])
# [ 0.76205302 -2.010142 2.60265297 -3.02849144 1.6739274 ]
ss_err=(infodict['fvec']**2).sum()
ss_tot=((y-y.mean())**2).sum()
rsquared=1-(ss_err/ss_tot)
print(rsquared)
# 0.996768131959
plt.plot(x, y, '.', xp, pxp, '-')
plt.xlim(100,1000)
plt.ylim(130,270)
plt.xlabel('x')
plt.ylabel('y',rotation='horizontal')
plt.grid(True)
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With