Fitting polynomials to data

Tags:

statistics

Is there a way, given a set of values (x,f(x)), to find the polynomial of a given degree that best fits the data?

I know polynomial interpolation, which is for finding a polynomial of degree n given n+1 data points, but here there are a large number of values and we want to find a low-degree polynomial (find best linear fit, best quadratic, best cubic, etc.). It might be related to least squares...

More generally, I would like to know the answer when we have a multivariate function -- points like (x,y,f(x,y)), say -- and want to find the best polynomial (p(x,y)) of a given degree in the variables. (Specifically a polynomial, not splines or Fourier series.)

Both theory and code/libraries (preferably in Python, but any language is okay) would be useful.

asked Dec 19 '08 20:12

ShreevatsaR

2 Answers

Thanks for everyone's replies. Here is another attempt at summarizing them. Pardon if I say too many "obvious" things: I knew nothing about least squares before, so everything was new to me.

NOT polynomial interpolation

Polynomial interpolation is fitting a polynomial of degree n given n+1 data points, e.g. finding a cubic that passes exactly through four given points. As said in the question, this was not what I wanted: I had a lot of points and wanted a low-degree polynomial (which will only approximately fit, unless we've been lucky). But since some of the answers insisted on talking about it, I should mention the relevant terms :) Lagrange polynomial, Vandermonde matrix, etc.
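
For contrast with what follows, here is a minimal sketch of interpolation through a Vandermonde matrix: a cubic through exactly four points. The four points are made up for illustration (they lie on x³ + x + 1), and the names V and coeffs are my own.

```python
import numpy as np

# Four made-up points; a cubic has four coefficients, so the fit is exact.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 11.0, 31.0])

V = np.vander(xs, 4)               # columns x^3, x^2, x, 1
coeffs = np.linalg.solve(V, ys)    # coefficients, highest power first
print(coeffs)
```

With more points than coefficients, V is no longer square and this exact solve does not apply, which is where least squares comes in.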

What is least-squares?

"Least squares" is a particular definition/criterion/"metric" of "how well" a polynomial fits. (There are others, but this is simplest.) Say you are trying to fit a polynomial p(x,y) = a + bx + cy + dx² + ey² + fxy to some given data points (x_i,y_i,Z_i) (where "Z_i" was "f(x_i,y_i)" in the question). With least-squares the problem is to find the "best" coefficients (a,b,c,d,e,f), such that what is minimized (kept "least") is the "sum of squared residuals", namely

S = ∑_i (a + bx_i + cy_i + dx_i² + ey_i² + fx_iy_i - Z_i)²

Theory

The important idea is that if you look at S as a function of (a,b,c,d,e,f), then S is minimized at a point at which its gradient is 0. This means that for example ∂S/∂f=0, i.e. that

∑_i 2(a + … + fx_iy_i - Z_i) x_iy_i = 0

and similar equations for a, b, c, d, e. Note that these are just linear equations in a…f. So we can solve them with Gaussian elimination or any of the usual methods.
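
As a rough sketch of this step: if row i of a matrix A holds the monomials evaluated at (x_i, y_i), the six linear equations can be written together as (A^T A) c = A^T Z and solved directly. The data below are made up (an exact quadratic, no noise), and the names A, c, true_c and gradient are my own.

```python
import numpy as np

# Made-up data: Z is an exact quadratic in x and y, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 30)
y = rng.uniform(-1, 1, 30)
true_c = np.array([1.0, 2.0, -3.0, 0.5, -0.25, 4.0])  # a, b, c, d, e, f

# Row i of A holds 1, x_i, y_i, x_i^2, y_i^2, x_i*y_i, so A @ coeffs gives p(x_i, y_i).
A = np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])
Z = A @ true_c

# Setting the gradient of S to zero gives the linear system (A^T A) c = A^T Z.
c = np.linalg.solve(A.T @ A, A.T @ Z)

# The gradient of S, 2 A^T (A c - Z), should vanish at the solution.
gradient = 2 * A.T @ (A @ c - Z)
print(np.round(c, 6))
```

Solving these equations directly is fine for small, well-conditioned problems. Forming A^T A squares the condition number of A, which is one reason library least-squares routines typically work on A itself.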

This is still called "linear least squares", because although the function we wanted was a quadratic polynomial, it is still linear in the parameters (a,b,c,d,e,f). Note that the same thing works when we want p(x,y) to be any "linear combination" of arbitrary functions f_j, instead of just a polynomial (= "linear combination of monomials").
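
To illustrate the "linear combination of arbitrary functions" point, here is a small sketch that fits weights for the functions 1, sin x and exp x using NumPy's lstsq. The choice of basis functions and the data are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical basis functions f_j; any fixed functions of x would do.
basis = [np.ones_like, np.sin, np.exp]

xs = np.linspace(0, 2, 50)
# Made-up data generated from known weights so the result can be checked.
true_w = np.array([0.5, 2.0, -1.0])
A = np.column_stack([f(xs) for f in basis])   # A[i, j] = f_j(x_i)
zs = A @ true_w

# Least-squares solution of A w ~ zs; rank is the numerical rank of A.
w, residuals, rank, sv = np.linalg.lstsq(A, zs, rcond=None)
print(np.round(w, 6))
```

The only thing that changes between this and a polynomial fit is how the columns of A are filled in.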

Code

For the univariate case (when there is only one variable x, so the f_j are the monomials x^j), there is NumPy's polyfit:

    >>> import numpy
    >>> xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> ys = [1.1, 3.9, 11.2, 21.5, 34.8, 51, 70.2, 92.3, 117.4, 145.5]
    >>> p = numpy.poly1d(numpy.polyfit(xs, ys, deg=2))
    >>> print(p)
           2
    1.517 x + 2.483 x + 0.4927
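
To compare the best linear, quadratic and cubic fits, as asked in the question, one possible sketch reuses the same data and reports the sum of squared residuals S for each degree. The dictionary name sse is my own.

```python
import numpy as np

xs = np.arange(10)
ys = np.array([1.1, 3.9, 11.2, 21.5, 34.8, 51, 70.2, 92.3, 117.4, 145.5])

sse = {}
for deg in (1, 2, 3):
    coeffs = np.polyfit(xs, ys, deg)        # coefficients, highest power first
    fitted = np.polyval(coeffs, xs)         # the fitted polynomial at each x
    sse[deg] = float(np.sum((fitted - ys) ** 2))
    print(deg, sse[deg])
```

The sum of squared residuals cannot increase as the degree goes up, so a large drop from one degree to the next, followed by almost no change, is a hint about which degree the data support.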

For the multivariate case, or linear least squares in general, there is SciPy's linalg.lstsq. As explained in its documentation, it takes a matrix A of the values f_j(x_i). (The theory is that it finds the Moore-Penrose pseudoinverse of A.) With our above example involving (x_i,y_i,Z_i), fitting a polynomial means the f_j are the monomials x^p y^q with p + q at most the degree. The following finds the best quadratic (or the best polynomial of any other degree, if you change the "degree = 2" line):

    from scipy import linalg
    import random

    # Random sample points and noisy values of (x + y)**2
    n = 20
    x = [100*random.random() for i in range(n)]
    y = [100*random.random() for i in range(n)]
    Z = [(x[i]+y[i])**2 + 0.01*random.random() for i in range(n)]

    # Row i of A holds every monomial x^xd * y^yd with xd + yd <= degree
    degree = 2
    A = []
    for i in range(n):
        A.append([])
        for xd in range(degree+1):
            for yd in range(degree+1-xd):
                A[i].append((x[i]**xd)*(y[i]**yd))  # f_j(x_i)

    # Least-squares coefficients, in the same order as the columns of A
    c, _, _, _ = linalg.lstsq(A, Z)
    j = 0
    for xd in range(0, degree+1):
        for yd in range(0, degree+1-xd):
            print(" + (%.2f)x^%dy^%d" % (c[j], xd, yd), end=" ")
            j += 1
    print()

prints

 + (0.01)x^0y^0  + (-0.00)x^0y^1  + (1.00)x^0y^2  + (-0.00)x^1y^0  + (2.00)x^1y^1  + (1.00)x^2y^0

so it has discovered that the polynomial is x²+2xy+y²+0.01. [The constant term is sometimes -0.01 and sometimes 0, which is to be expected because of the random noise we added.]
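
Once the coefficients are known, the same monomials can be used to evaluate the fitted polynomial at new points. The sketch below is one way to do that; the helper name monomials is hypothetical, the data are noise-free and made up, and NumPy's lstsq is used in place of SciPy's to keep the example to a single dependency.

```python
import numpy as np

def monomials(x, y, degree):
    """Columns x**xd * y**yd for all xd + yd <= degree, in the same order as the loops above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack([x**xd * y**yd
                            for xd in range(degree + 1)
                            for yd in range(degree + 1 - xd)])

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 20)
y = rng.uniform(0, 1, 20)
Z = (x + y) ** 2                     # noise-free, made-up data

c, _, _, _ = np.linalg.lstsq(monomials(x, y, 2), Z, rcond=None)

# Evaluate the fitted polynomial at a new point (0.3, 0.4); (0.3 + 0.4)**2 is 0.49.
p_new = (monomials([0.3], [0.4], 2) @ c)[0]
print(p_new)
```

Building the matrix from one helper keeps the column order used for fitting and for evaluation in sync.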

Alternatives to Python+Numpy/Scipy are R and Computer Algebra Systems: Sage, Mathematica, Matlab, Maple. Even Excel might be able to do it. Numerical Recipes discusses methods to implement it ourselves (in C, Fortran).

Concerns

The result is strongly influenced by how the points are chosen. When I had x=y=range(20) instead of the random points, it always produced 1.33x²+1.33xy+1.33y², which was puzzling... until I realised that because I always had x[i]=y[i], the polynomials agree on every data point: x²+2xy+y² = 4x² = (4/3)(x²+xy+y²). So the moral is that it is important to choose the points carefully to get the "right" polynomial. (If you can choose, you should choose Chebyshev nodes for polynomial interpolation; not sure if the same is true for least squares as well.)
Overfitting: a higher-degree polynomial can always fit the data at least as well. If you change the degree to 3 or 4 or 5, it still mostly recognizes the same quadratic polynomial (coefficients are 0 for higher-degree terms), but for larger degrees it starts fitting higher-degree polynomials. Even with degree 6, though, taking larger n (more data points, say 200 instead of 20) still recovers the quadratic polynomial. So the moral is to avoid overfitting, for which it might help to take as many data points as possible.
There might be issues of numerical stability I don't fully understand.
If you don't need a polynomial, you can obtain better fits with other kinds of functions, e.g. splines (piecewise polynomials).
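
The first concern can be checked numerically. In the sketch below (an illustration using NumPy rather than the original script), every point lies on the line x = y, so several columns of A coincide and the matrix has fewer independent columns than coefficients. In that case lstsq returns the minimum-norm solution among the many that fit equally well, which spreads the weight evenly over x², xy and y².

```python
import numpy as np

t = np.arange(20, dtype=float)
x = y = t                              # every point lies on the line x = y
Z = (x + y) ** 2

# Columns: 1, y, y^2, x, xy, x^2 (same order as the fitting code above).
A = np.column_stack([x**xd * y**yd
                     for xd in range(3)
                     for yd in range(3 - xd)])

rank = np.linalg.matrix_rank(A)        # fewer than 6 means the fit is not unique
c, _, _, _ = np.linalg.lstsq(A, Z, rcond=None)
print(rank, np.round(c, 2))
```

Checking the rank (or the singular values that lstsq returns) before trusting the coefficients is a cheap way to notice that the sample points cannot distinguish between different polynomials.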

answered Oct 02 '22 23:10

7 revs, 2 users 99%

Yes, the way this is typically done is by using least squares. There are other ways of specifying how well a polynomial fits, but the theory is simplest for least squares. The general theory is called linear regression.

Your best bet is probably to start with Numerical Recipes.

R is free and will do everything you want and more, but it has a steep learning curve.

If you have access to Mathematica, you can use the Fit function to do a least squares fit. I imagine Matlab and its open source counterpart Octave have a similar function.

answered Oct 02 '22 21:10

John D. Cook
