Suppose I have a hypothetical function I'd like to approximate:

def f(x):
    return a * x ** 2 + b * x + c

where a, b and c are the values I don't know.
And I have certain points where the function output is known, i.e.
x = [-1, 2, 5, 100]
y = [123, 456, 789, 1255]
(actually there are way more values)
I'd like to get a, b and c while minimizing the squared error (and additionally get that squared error).
What is the way to do that in Python?
There should be existing solutions in scipy, numpy or similar libraries.
Since the function you're trying to fit is a polynomial, you can use numpy.polyfit:

>>> import numpy
>>> numpy.polyfit(x, y, 2)  # the 2 signifies a polynomial of degree 2
array([ -1.04978546, 115.16698544, 236.16191491])
This means that the best fit was y ≈ -1.05x² + 115.17x + 236.16.
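Since you also want the squared error itself, note that numpy.polyfit can return it directly when called with full=True (the second return value is the residual sum of squares). A short sketch using the sample points from the question:

```python
import numpy

x = [-1, 2, 5, 100]
y = [123, 456, 789, 1255]

# full=True makes polyfit also return the residual sum of squares,
# which is exactly the squared error being minimized
coeffs, residuals, rank, singular_values, rcond = numpy.polyfit(x, y, 2, full=True)

a, b, c = coeffs
print(a, b, c)       # fitted coefficients, highest degree first
print(residuals[0])  # sum of squared errors of the fit
```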
For a general function, the more you know about it (e.g., is it convex, differentiable, twice differentiable, etc.), the better you can do with scipy.optimize.minimize. E.g., if you hardly know anything about it, you can pass method='Nelder-Mead'. Other methods (refer to the documentation) can make use of the Jacobian and the Hessian, if they are defined and you can calculate them.
Personally, I find that using it with Nelder-Mead (requiring almost no parameters) gives me adequate results for my needs.
Example
Suppose you're trying to fit y = kx with k as the parameter to optimize. You'd write a function:

x = ...
y = ...

def ss(k):
    # use numpy.linalg.norm to find the sum-of-squares error between y and k * x
    return numpy.linalg.norm(y - k * x) ** 2

Then you'd use scipy.optimize.minimize on the function ss.
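A minimal end-to-end sketch of that approach; the data points here are made up for illustration (roughly following y = 3x):

```python
import numpy
from scipy.optimize import minimize

# hypothetical data, roughly following y = 3x
x = numpy.array([1.0, 2.0, 3.0, 4.0])
y = numpy.array([3.1, 5.9, 9.2, 11.8])

def ss(k):
    # sum-of-squares error between y and k * x
    return numpy.linalg.norm(y - k * x) ** 2

result = minimize(ss, x0=1.0, method='Nelder-Mead')
print(result.x[0])  # best-fit k
print(result.fun)   # squared error at the optimum
```

result.fun gives you the minimized squared error directly, which answers the second part of the question.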