 

Errors to fit parameters of scipy.optimize

I use the scipy.optimize.minimize function ( https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html ) with method='L-BFGS-B'.

An example of what it returns is shown below:

      fun: 32.372210618549758
 hess_inv: <6x6 LbfgsInvHessProduct with dtype=float64>
     jac: array([ -2.14583906e-04,   4.09272616e-04,  -2.55795385e-05,
         3.76587650e-05,   1.49213975e-04,  -8.38440428e-05])
  message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
     nfev: 420
      nit: 51
   status: 0
  success: True
        x: array([ 0.75739412, -0.0927572 ,  0.11986434,  1.19911266,  0.27866406,
       -0.03825225])

The x value correctly contains the fitted parameters. How do I compute the errors associated with those parameters?
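(For context, a result like the one above typically comes from a call along these lines. This is a minimal sketch with a hypothetical chi-squared objective and synthetic data, not the asker's actual code.)

import numpy as np
from scipy.optimize import minimize

# Hypothetical straight-line model fitted by minimizing a chi-squared value.
def chi2(params, x, y, yerr):
    model = params[0] * x + params[1]
    return np.sum(((y - model) / yerr) ** 2)

x = np.linspace(0.0, 1.0, 50)
y = 0.8 * x - 0.1 + 0.05 * np.random.randn(50)
yerr = np.full_like(y, 0.05)

res = minimize(chi2, x0=[1.0, 0.0], args=(x, y, yerr), method='L-BFGS-B')
print(res.x)         # fitted parameters
print(res.hess_inv)  # LbfgsInvHessProduct, used in the answer below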

Asked Apr 24 '17 by Antonio Ragagnin




2 Answers

TL;DR: You can actually place an upper bound on how precisely the minimization routine has found the optimal values of your parameters. See the snippet at the end of this answer that shows how to do it directly, without resorting to calling additional minimization routines.


The documentation for this method says

The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol.

Roughly speaking, the minimization stops when the value of the function f that you're minimizing is minimized to within ftol of the optimum. (This is a relative error if f is greater than 1, and absolute otherwise; for simplicity I'll assume it's an absolute error.) In more standard language, you'll probably think of your function f as a chi-squared value. So this roughly suggests that you would expect

\chi^2_{\mathrm{found}} - \chi^2_{\mathrm{optimal}} \lesssim \mathtt{ftol}
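For reference, ftol is passed to minimize through the options dict; its default for 'L-BFGS-B' is about 2.22e-09. A minimal runnable sketch with a hypothetical quadratic objective:

import numpy as np
from scipy.optimize import minimize

# Hypothetical quadratic objective, just to have something runnable.
def f(p):
    return np.sum((p - np.array([1.0, 2.0])) ** 2)

# Tighten ftol so the termination criterion quoted above is stricter.
res = minimize(f, x0=[0.0, 0.0], method='L-BFGS-B', options={'ftol': 1e-12})
print(res.message)  # the convergence message reflects this criterion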

Of course, just the fact that you're applying a minimization routine like this assumes that your function is well behaved, in the sense that it's reasonably smooth and the optimum being found is well approximated near the optimum by a quadratic function of the parameters xi:

\chi^2_{\mathrm{found}} - \chi^2_{\mathrm{optimal}} \approx \sum_{i,j} H_{ij} \Delta x^i \Delta x^j

where Δxi is the difference between the found value of parameter xi and its optimal value, and Hij is the Hessian matrix. A little (surprisingly nontrivial) linear algebra gets you to a pretty standard result for an estimate of the uncertainty in any quantity X that's a function of your parameters xi:

(\Delta X)^2 = \left(\chi^2_{\mathrm{found}} - \chi^2_{\mathrm{optimal}}\right) \sum_{i,j} \left(H^{-1}\right)_{ij} \frac{\partial X}{\partial x^i} \frac{\partial X}{\partial x^j}

which lets us write

(\Delta X)^2 \lesssim \mathtt{ftol} \sum_{i,j} \left(H^{-1}\right)_{ij} \frac{\partial X}{\partial x^i} \frac{\partial X}{\partial x^j}

That's the most useful formula in general, but for the specific question here, we just have X = xi, so this simplifies to

\Delta x^i \lesssim \sqrt{\mathtt{ftol} \left(H^{-1}\right)_{ii}}

Finally, to be totally explicit, let's say you've stored the optimization result in a variable called res. The inverse Hessian is available as res.hess_inv, which is a function that takes a vector and returns the product of the inverse Hessian with that vector. So, for example, we can display the optimized parameters along with the uncertainty estimates with a snippet like this:

import numpy as np

ftol = 2.220446049250313e-09  # default ftol for method='L-BFGS-B'
tmp_i = np.zeros(len(res.x))
for i in range(len(res.x)):
    # Apply the inverse Hessian to the i-th unit vector to pick out its i-th diagonal element
    tmp_i[i] = 1.0
    hess_inv_i = res.hess_inv(tmp_i)[i]
    uncertainty_i = np.sqrt(max(1, abs(res.fun)) * ftol * hess_inv_i)
    tmp_i[i] = 0.0
    print('x^{0} = {1:12.4e} ± {2:.1e}'.format(i, res.x[i], uncertainty_i))

Note that I've incorporated the max behavior from the documentation, assuming that f^k and f^{k+1} are basically just the same as the final output value, res.fun, which really ought to be a good approximation. Also, for small problems, you can just use np.diag(res.hess_inv.todense()) to get the full inverse and extract the diagonal all at once. But for large numbers of variables, I've found that to be a much slower option. Finally, I've added the default value of ftol, but if you change it in an argument to minimize, you would obviously need to change it here.
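For completeness, the dense-diagonal variant mentioned above would look roughly like this (a sketch; res is assumed to be the same L-BFGS-B result object as in the snippet above):

import numpy as np

ftol = 2.220446049250313e-09  # same default as above
# res.hess_inv is an LbfgsInvHessProduct; todense() builds the full matrix,
# which is convenient for a handful of parameters but slow for many.
hess_inv_diag = np.diag(res.hess_inv.todense())
uncertainties = np.sqrt(max(1, abs(res.fun)) * ftol * hess_inv_diag)
for i, (xi, dxi) in enumerate(zip(res.x, uncertainties)):
    print('x^{0} = {1:12.4e} ± {2:.1e}'.format(i, xi, dxi))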

Answered by Mike


One approach to this common problem is to run scipy.optimize.leastsq, starting from the solution found by minimize with 'L-BFGS-B'. leastsq will (normally) include an estimate of the 1-sigma errors as well as the solution.

Of course, that approach makes several assumptions, including that leastsq can be used and is appropriate for solving the problem. From a practical view, it requires that the objective function return an array of residual values with at least as many elements as variables, not a single cost value.
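As a rough sketch of that approach (the residual function, data, and starting values here are hypothetical, and the covariance returned by leastsq is scaled by the reduced chi-square in the usual way):

import numpy as np
from scipy.optimize import leastsq

# Hypothetical data and a residual function that returns one value per data point.
x = np.linspace(0.0, 10.0, 50)
y = 2.5 * x + 1.0 + 0.3 * np.random.randn(50)

def residuals(p, x, y):
    return y - (p[0] * x + p[1])

p0 = [1.0, 0.0]  # e.g. the solution previously found with 'L-BFGS-B'
popt, cov_x, infodict, mesg, ier = leastsq(residuals, p0, args=(x, y), full_output=True)

# Scale cov_x by the reduced chi-square, then take the square root of the
# diagonal to get 1-sigma uncertainties for each parameter.
dof = len(x) - len(popt)
s_sq = np.sum(residuals(popt, x, y) ** 2) / dof
perr = np.sqrt(np.diag(cov_x) * s_sq)
print(popt, perr)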

You may find lmfit (https://lmfit.github.io/lmfit-py/) useful here: it supports both 'L-BFGS-B' and 'leastsq' and gives a uniform wrapper around these and other minimization methods, so that you can use the same objective function for both (and specify how to convert the residual array into the cost function). In addition, parameter bounds can be used with both methods. This makes it very easy to first do a fit with 'L-BFGS-B' and then with 'leastsq', using the values from 'L-BFGS-B' as starting values.

Lmfit also provides methods to explore confidence limits on parameter values in more detail, in case you suspect the simple but fast approach used by leastsq might be insufficient.
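A hedged sketch of that lmfit workflow (the model, data, and parameter names are hypothetical; method names and calls follow the lmfit documentation):

import numpy as np
import lmfit

# Hypothetical data and an objective that returns the residual array.
x = np.linspace(0.0, 10.0, 50)
data = 2.5 * x + 1.0 + 0.3 * np.random.randn(50)

def residual(params, x, data):
    v = params.valuesdict()
    return data - (v['slope'] * x + v['offset'])

params = lmfit.Parameters()
params.add('slope', value=1.0)
params.add('offset', value=0.0, min=-5, max=5)  # bounds work with both methods

# First fit with L-BFGS-B, then refine with leastsq starting from those values.
out1 = lmfit.minimize(residual, params, method='lbfgsb', args=(x, data))
out2 = lmfit.minimize(residual, out1.params, method='leastsq', args=(x, data))

lmfit.report_fit(out2)  # reports values with estimated 1-sigma standard errors

For the more detailed confidence-limit exploration mentioned above, lmfit also exposes a conf_interval function that can be applied after a leastsq fit.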

Answered by M Newville