
Wrong fit using scipy curve_fit

I am trying to fit some data to a power law function with an exponential cutoff. I generate the data with NumPy and then try to fit it with scipy.optimize. Here is my code:

import numpy as np
from scipy.optimize import curve_fit

def func(x, A, B, alpha):
    return A * x**alpha * np.exp(B * x)

xdata = np.linspace(1, 10**8, 1000)
ydata = func(xdata, 0.004, -2*10**-8, -0.75)
popt, pcov = curve_fit(func, xdata, ydata)
print(popt)

The result I am getting is [1, 1, 1], which does not correspond to the data. Am I doing something wrong?

asked Nov 17 '15 by ivangtorre


People also ask

How does curve_fit work Scipy?

The open source SciPy library provides the curve_fit() function for curve fitting via nonlinear least squares. It takes the input and output data as arguments, along with the mapping function to fit. The mapping function must take the input data as its first argument, followed by the parameters to be fitted.

What does Scipy curve_fit return?

The curve_fit() function returns the optimal parameter values and their estimated covariance matrix. Fitting the data is then a matter of passing the target function and the x and y data to curve_fit() and reading off the returned parameter values (for example a, b and c) together with their covariance.
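As a minimal sketch of those two return values (the model, data and noise level below are made up for illustration), the standard deviations of the fitted parameters can be read off the diagonal of the covariance matrix:

import numpy as np
from scipy.optimize import curve_fit

# Illustrative mapping function: first argument is x, the rest are parameters
def model(x, a, b, c):
    return a * np.exp(-b * x) + c

# Synthetic noisy data (assumed values, for demonstration only)
rng = np.random.default_rng(1)
x = np.linspace(0, 4, 50)
y = model(x, 2.5, 1.3, 0.5) + rng.normal(scale=0.05, size=x.size)

popt, pcov = curve_fit(model, x, y)
perr = np.sqrt(np.diag(pcov))  # one-sigma uncertainties on a, b, c
print(popt)  # best-fit a, b, c
print(perr)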

How can I improve my curve fit?

To judge how good a fit is, compute an error measure such as the RMS (root mean square) error of the residuals, or compare against a simple linear regression. You can change the order of the polynomial to improve the fit, and a more reliable check is to split the data into two or more folds, fit on one fold and evaluate the error on the others, as sketched below.
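A minimal sketch of that fold idea (the noisy quadratic data below is made up for illustration): fit on one half of the points, score each candidate polynomial order by its RMS error on the held-out half, and keep the order with the lowest validation error:

import numpy as np

# Assumed noisy quadratic data, for demonstration only
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 1.5 * x**2 - 3.0 * x + 2.0 + rng.normal(scale=5.0, size=x.size)

# Hold out every other point as a validation fold
x_fit, y_fit = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

# Compare polynomial orders by RMS error on the held-out fold
for order in (1, 2, 3, 5):
    coeffs = np.polyfit(x_fit, y_fit, order)
    rms = np.sqrt(np.mean((np.polyval(coeffs, x_val) - y_val) ** 2))
    print("order", order, "validation RMS:", round(rms, 2))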

What is p0 in curve_fit?

The p0 parameter is a length-N initial guess for the parameters. It is optional; if it is not provided, every parameter starts from an initial value of 1. The curve_fit() function then returns the two values popt and pcov.
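For the question above, that default of 1 for every parameter is exactly what goes wrong; passing rough order-of-magnitude guesses via p0 lets the fit converge. A minimal sketch, reusing the question's function and the same initial guess that appears in the timing comparison further down:

import numpy as np
from scipy.optimize import curve_fit

def func(x, A, B, alpha):
    return A * x**alpha * np.exp(B * x)

xdata = np.linspace(1, 10**8, 1000)
ydata = func(xdata, 0.004, -2e-8, -0.75)

# Rough order-of-magnitude starting values instead of the default [1, 1, 1]
popt, pcov = curve_fit(func, xdata, ydata, p0=[0.01, -5e-7, -0.4])
print(popt)  # should come back close to [0.004, -2e-8, -0.75]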


1 Answer

Whilst xnx gave you the answer as to why curve_fit failed here, I thought I'd suggest a different way of approaching the problem of fitting your functional form, one which doesn't rely on a gradient descent (and therefore on a reasonable initial guess).

Note that if you take the log of the function that you are fitting, you get the form

\log f = \log A + \alpha \log x + B x

which is linear in each of the unknown parameters (\log A, \alpha and B).

We can therefore use the machinery of linear algebra to solve this, writing the equation in matrix form as

\log y = M p

where \log y is a column vector of the logs of your ydata points, p is the column vector of unknown parameters (\log A, \alpha, B) and M is the design matrix whose columns are 1, \log x and x.

Or explicitly,

\begin{pmatrix} \log y_1 \\ \log y_2 \\ \vdots \\ \log y_n \end{pmatrix}
=
\begin{pmatrix}
1 & \log x_1 & x_1 \\
1 & \log x_2 & x_2 \\
\vdots & \vdots & \vdots \\
1 & \log x_n & x_n
\end{pmatrix}
\begin{pmatrix} \log A \\ \alpha \\ B \end{pmatrix}

The best-fitting parameter vector can then be found efficiently using np.linalg.lstsq.

Your example problem in code could then be written as

import numpy as np

def func(x, A, B, alpha):
    return A * x**alpha * np.exp(B * x)

A_true = 0.004
alpha_true = -0.75
B_true = -2*10**-8

xdata = np.linspace(1, 10**8, 1000)
ydata = func(xdata, A_true, B_true, alpha_true)

# Design matrix with columns [1, log(x), x]
M = np.vstack([np.ones(len(xdata)), np.log(xdata), xdata]).T

# Linear least-squares solution of  log(y) = M @ [log(A), alpha, B]
logA, alpha, B = np.linalg.lstsq(M, np.log(ydata))[0]

print("A =", np.exp(logA))
print("alpha =", alpha)
print("B =", B)

This recovers the initial parameters nicely:

A = 0.00400000003736
alpha = -0.750000000928
B = -1.9999999934e-08

Also note that this method is around 20x faster than using curve_fit for the problem at hand:

In [8]: %timeit np.linalg.lstsq(np.vstack([np.ones(len(xdata)), np.log(xdata), xdata]).T, np.log(ydata))
10000 loops, best of 3: 169 µs per loop


In [2]: %timeit curve_fit(func, xdata, ydata, [0.01, -5e-7, -0.4])
100 loops, best of 3: 4.44 ms per loop
answered Sep 17 '22 by Simon Gibbons