What's the error of numpy.polyfit?

2 Answers

If you specify full=True in your call to polyfit, it will include extra information:

>>> x = np.arange(100) >>> y = x**2 + 3*x + 5 + np.random.rand(100) >>> np.polyfit(x, y, 2) array([ 0.99995888,  3.00221219,  5.56776641]) >>> np.polyfit(x, y, 2, full=True) (array([ 0.99995888,  3.00221219,  5.56776641]), # coefficients  array([ 7.19260721]), # residuals  3, # rank  array([ 11.87708199,   3.5299267 ,   0.52876389]), # singular values  2.2204460492503131e-14) # conditioning threshold

The residual value returned is the sum of the squares of the fit errors, not sure if this is what you are after:

>>> np.sum((np.polyval(np.polyfit(x, y, 2), x) - y)**2) 7.1926072073491056

In version 1.7 there is also a cov keyword that will return the covariance matrix for your coefficients, which you could use to calculate the uncertainty of the fit coefficients themselves.

129

answered Oct 09 '22 02:10

Jaime

As you can see in the documentation:

Returns ------- p : ndarray, shape (M,) or (M, K)     Polynomial coefficients, highest power first.     If `y` was 2-D, the coefficients for `k`-th data set are in ``p[:,k]``.  residuals, rank, singular_values, rcond : present only if `full` = True     Residuals of the least-squares fit, the effective rank of the scaled     Vandermonde coefficient matrix, its singular values, and the specified     value of `rcond`. For more details, see `linalg.lstsq`.

Which means that if you can do a fit and get the residuals as:

 import numpy as np  x = np.arange(10)  y = x**2 -3*x + np.random.random(10)   p, res, _, _, _ = numpy.polyfit(x, y, deg, full=True)

Then, the p are your fit parameters, and the res will be the residuals, as described above. The _'s are because you don't need to save the last three parameters, so you can just save them in the variable _ which you won't use. This is a convention and is not required.

@Jaime's answer explains what the residual means. Another thing you can do is look at those squared deviations as a function (the sum of which is res). This is particularly helpful to see a trend that didn't fit sufficiently. res can be large because of statistical noise, or possibly systematic poor fitting, for example:

x = np.arange(100) y = 1000*np.sqrt(x) + x**2 - 10*x + 500*np.random.random(100) - 250  p = np.polyfit(x,y,2) # insufficient degree to include sqrt  yfit = np.polyval(p,x)  figure() plot(x,y, label='data') plot(x,yfit, label='fit') plot(x,yfit-y, label='var')

So in the figure, note the bad fit near x = 0:
polyfit

answered Oct 09 '22 02:10

askewchan

Related questions
                            
                                Selenium gives "selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary" on Mac
                            
                                How to treat NULL as a normal string with pandas?
                            
                                Python OpenCV cv2 drawing rectangle with text
                            
                                Equivalent of j in NumPy
                            
                                Testing socket connection in Python
                            
                                How to compute the nth root of a very big integer
                            
                                Best way to determine if a sequence is in another sequence?
                            
                                Python urllib2 Progress Hook
                            
                                Python: Why does os.getcwd() sometimes crash with OSError?
                            
                                Convert image to a matrix in python
                            
                                Move an email in GMail with Python and imaplib
                            
                                `goto` in Python
                            
                                Matplotlib table formatting
                            
                                What does "\r" do in the following script?
                            
                                Python- how do I use re to match a whole string [duplicate]
                            
                                How to check for mock calls with wildcards?
                            
                                performing outer addition with numpy
                            
                                python refresh/reload
                            
                                Linewidth is added to the length of a line
                            
                                Compare only time part in datetime - Python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

What's the error of numpy.polyfit?

Tags:

python

numpy

curve-fitting

varantir

People also ask

2 Answers

Jaime

askewchan

Recent Activity

Donate For Us