Suppose I have a hypothetical function I'd like to approximate:
def f(x):
    return a * x ** 2 + b * x + c
Where a, b and c are the values I don't know.
And I have certain points where the function output is known, i.e.
x = [-1, 2, 5, 100]
y = [123, 456, 789, 1255]
(actually there are way more values)
I'd like to get a, b and c while minimizing the squared error (and additionally get that squared error).
What is the way to do that in Python?
There should be existing solutions in scipy, numpy or anywhere like that.
Since the function you're trying to fit is a polynomial, you can use numpy.polyfit
>>> numpy.polyfit(x, y, 2) # The 2 signifies a polynomial of degree 2
array([ -1.04978546, 115.16698544, 236.16191491])
This means that the best fit was y ~ -1.05x² + 115.167x + 236.16.
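Since you also want the squared error itself, polyfit can give it to you: passing full=True makes it return the sum of squared residuals alongside the coefficients. A minimal sketch, using the sample x and y from the question as placeholder data:

import numpy

x = [-1, 2, 5, 100]
y = [123, 456, 789, 1255]

coeffs, residuals, rank, singular_values, rcond = numpy.polyfit(x, y, 2, full=True)
print(coeffs)     # [a, b, c]
print(residuals)  # sum of squared residuals of the fit

# Equivalently, evaluate the fitted polynomial and compute the error yourself:
y_fit = numpy.polyval(coeffs, x)
squared_error = numpy.sum((numpy.asarray(y) - y_fit) ** 2)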
For a general function, the more you know about it (e.g., whether it is convex, differentiable, twice differentiable, etc.), the better you can do with scipy.optimize.minimize. For example, if you know hardly anything about it, you can tell it to use the Nelder-Mead method. Other methods there (refer to the documentation) can make use of the Jacobian and the Hessian, if they are defined and you can compute them.
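For instance, if you can write down the gradient, you can pass it through the jac argument and use a gradient-based method such as BFGS. A minimal sketch on a toy function (the function and its gradient here are purely illustrative, not your data):

import numpy
from scipy.optimize import minimize

def f(p):
    # a simple convex toy function: (p0 - 3)^2 + (p1 + 1)^2
    return (p[0] - 3) ** 2 + (p[1] + 1) ** 2

def grad_f(p):
    # its analytic gradient
    return numpy.array([2 * (p[0] - 3), 2 * (p[1] + 1)])

result = minimize(f, x0=[0.0, 0.0], method='BFGS', jac=grad_f)
print(result.x)  # should be close to [3, -1]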
Personally, I find that using it with Nelder-Mead (requiring almost no parameters) gives me adequate results for my needs.
Example
Suppose you're trying to fit y = kx with k as the parameter to optimize. You'd write a function
x = ...  # your known x values, as a numpy array
y = ...  # the corresponding known y values

def ss(k):
    # use numpy.linalg.norm to find the sum-of-squares error between y and k * x
    return numpy.linalg.norm(y - k * x) ** 2
Then you'd use scipy.optimize.minimize on the function ss.
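Putting it together, a minimal self-contained sketch, again using the question's sample x and y as placeholder data (the starting guess x0=1.0 is arbitrary):

import numpy
from scipy.optimize import minimize

x = numpy.array([-1, 2, 5, 100])
y = numpy.array([123, 456, 789, 1255])

def ss(k):
    # sum-of-squares error between the observed y and the model k * x
    return numpy.linalg.norm(y - k * x) ** 2

result = minimize(ss, x0=1.0, method='Nelder-Mead')
print(result.x[0])  # best-fit k
print(result.fun)   # squared error at that k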