I have a question about the fit algorithms used in scipy. In my program, I have a set of x and y data points with y errors only, and want to fit a function
f(x) = (a[0] - a[1])/(1+np.exp(x-a[2])/a[3]) + a[1]
to it.
The problem is that I get absurdly high errors on the parameters, and the two scipy fit routines, scipy.odr.ODR (with the least squares algorithm) and scipy.optimize.leastsq, give different values and errors for the fit parameters. Here is my example:
Fit with scipy.odr.ODR, fit_type=2
Beta: [ 11.96765963 68.98892582 100.20926023 0.60793377]
Beta Std Error: [ 4.67560801e-01 3.37133614e+00 8.06031988e+04 4.90014367e+04]
Beta Covariance: [[ 3.49790629e-02 1.14441187e-02 -1.92963671e+02 1.17312104e+02]
[ 1.14441187e-02 1.81859542e+00 -5.93424196e+03 3.60765567e+03]
[ -1.92963671e+02 -5.93424196e+03 1.03952883e+09 -6.31965068e+08]
[ 1.17312104e+02 3.60765567e+03 -6.31965068e+08 3.84193143e+08]]
Residual Variance: 6.24982731975
Inverse Condition #: 1.61472215874e-08
Reason(s) for Halting:
Sum of squares convergence
and then the fit with scipy.optimize.leastsq:
Fit with scipy.optimize.leastsq
beta: [ 11.9671859 68.98445306 99.43252045 1.32131099]
Beta Std Error: [0.195503 1.384838 34.891521 45.950556]
Beta Covariance: [[ 3.82214235e-02 -1.05423284e-02 -1.99742825e+00 2.63681933e+00]
[ -1.05423284e-02 1.91777505e+00 1.27300761e+01 -1.67054172e+01]
[ -1.99742825e+00 1.27300761e+01 1.21741826e+03 -1.60328181e+03]
[ 2.63681933e+00 -1.67054172e+01 -1.60328181e+03 2.11145361e+03]]
Residual Variance: 6.24982904455 (calculated by me)
My point is the third fit parameter. The results are:
scipy.odr.ODR, fit_type=2:
C = 100.209 +/- 80600
scipy.optimize.leastsq:
C = 99.432 +/- 34.892
I don't know why the first error is so much larger. Even better: if I put exactly the same data points with errors into Origin 9, I get C = x0 = 99.41849 +/- 0.20283,
and with exactly the same data in CERN ROOT (C++) I get C = 99.85 +/- 1.373,
even though I used exactly the same initial values for ROOT and Python. Origin doesn't need any.
Do you have any clue why this happens and which is the best result?
I added the code for you at pastebin:
http://pastebin.com/jZVyzMkS
Thank you for helping!
EDIT: here's the plot related to SirJohnFranklin's post:
Did you actually try plotting the ODR and leastsq fits side by side? They look basically identical:
Consider what the parameters correspond to - the step function described by beta[0] and beta[1], the initial and final values, explains by far the majority of the variance in your data. By contrast, small changes in beta[2] and beta[3], the inflexion point and slope, will have comparatively little effect on the overall shape of the curve and therefore on the residual variance for the fit. It's therefore no surprise that these parameters have high standard errors, and are fitted slightly differently by the two algorithms.
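One way to see this from the numbers already in the question is to convert the two quoted covariance matrices into correlation matrices. The sketch below does that with NumPy only. The helper name correlation_from_covariance is my own, and the matrices are copied from the printed output in the question, so treat it as a rough check rather than a full analysis.

```python
import numpy as np


def correlation_from_covariance(cov):
    """Scale a covariance matrix so that its diagonal becomes 1."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Covariance matrix printed by scipy.odr.ODR in the question.
cov_odr = np.array([
    [3.49790629e-02, 1.14441187e-02, -1.92963671e+02, 1.17312104e+02],
    [1.14441187e-02, 1.81859542e+00, -5.93424196e+03, 3.60765567e+03],
    [-1.92963671e+02, -5.93424196e+03, 1.03952883e+09, -6.31965068e+08],
    [1.17312104e+02, 3.60765567e+03, -6.31965068e+08, 3.84193143e+08],
])

# Covariance matrix printed for scipy.optimize.leastsq in the question.
cov_leastsq = np.array([
    [3.82214235e-02, -1.05423284e-02, -1.99742825e+00, 2.63681933e+00],
    [-1.05423284e-02, 1.91777505e+00, 1.27300761e+01, -1.67054172e+01],
    [-1.99742825e+00, 1.27300761e+01, 1.21741826e+03, -1.60328181e+03],
    [2.63681933e+00, -1.67054172e+01, -1.60328181e+03, 2.11145361e+03],
])

corr_odr = correlation_from_covariance(cov_odr)
corr_leastsq = correlation_from_covariance(cov_leastsq)

np.set_printoptions(precision=4, suppress=True)
print("ODR correlation matrix:")
print(corr_odr)
print("leastsq correlation matrix:")
print(corr_leastsq)
```

In both matrices the beta[2]/beta[3] entry comes out very close to -1, while the beta[0]/beta[1] entry is small. That is what you would expect if the data pin down the two plateau values well but only constrain a combination of the inflexion point and the slope.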
The overall greater standard errors reported by ODR are due to the fact that this model incorporates errors in the y-values whereas the ordinary least squares fit does not - errors in the measured y-values ought to reduce our confidence in the estimated fit parameters.
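It is also worth checking how each set of reported standard errors relates to its own covariance matrix, since the two outputs need not follow the same convention. The sketch below uses only the values printed in the question (the variable names are mine) and tests two candidate relations: standard error = sqrt(diag(cov) * residual variance) for the ODR output, and standard error = sqrt(diag(cov)) for the leastsq output. It is a consistency check on the printed numbers, not a statement about how either routine works internally.

```python
import numpy as np

# Values copied from the scipy.odr.ODR output in the question.
res_var_odr = 6.24982731975
sd_odr_reported = np.array(
    [4.67560801e-01, 3.37133614e+00, 8.06031988e+04, 4.90014367e+04]
)
diag_cov_odr = np.array(
    [3.49790629e-02, 1.81859542e+00, 1.03952883e+09, 3.84193143e+08]
)

# Values copied from the scipy.optimize.leastsq output in the question.
sd_leastsq_reported = np.array([0.195503, 1.384838, 34.891521, 45.950556])
diag_cov_leastsq = np.array(
    [3.82214235e-02, 1.91777505e+00, 1.21741826e+03, 2.11145361e+03]
)

# Candidate relation for ODR: covariance diagonal scaled by residual variance.
sd_odr_from_cov = np.sqrt(diag_cov_odr * res_var_odr)

# Candidate relation for leastsq: plain square root of the diagonal.
sd_leastsq_from_cov = np.sqrt(diag_cov_leastsq)

print("ODR errors match sqrt(diag(cov) * res_var):",
      np.allclose(sd_odr_from_cov, sd_odr_reported, rtol=1e-5))
print("leastsq errors match sqrt(diag(cov)):",
      np.allclose(sd_leastsq_from_cov, sd_leastsq_reported, rtol=1e-5))
```

Both relations hold for the quoted values, so the two sets of standard errors are not computed from their covariance matrices in the same way. For beta[0] and beta[1], where the covariance diagonals are similar, the factor sqrt(6.25) = 2.5 accounts for most of the difference; for beta[2] and beta[3] the covariance entries themselves differ by several orders of magnitude.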