What is the difference between https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html and https://docs.scipy.org/doc/numpy/reference/generated/numpy.polynomial.polynomial.polyfit.html and which one should I use when? I checked the source: both call numpy.linalg.linalg.lstsq internally, but they differ otherwise. The documentation of numpy.polyfit also suggests using https://docs.scipy.org/doc/numpy/reference/generated/numpy.polynomial.polynomial.Polynomial.fit.html instead. What is the right choice? (Bonus: How would I use the class when the first thing I want to do is fit it to my data?)

From what I can tell there's a lot of legacy baggage here: we should not be using <code>numpy.polyfit</code>, and we should prefer <code>numpy.polynomial.polynomial.Polynomial.fit</code>.

Consider comments on this GitHub issue from 2016: <blockquote> While the documentation is reasonably clear in noting that coefficients are returned [from <code>numpy.polyfit</code> -- ed.] with highest-order first, this is fairly easy to miss, and is inconsistent with, e.g. <code>numpy.polynomial.polynomial.polyfit()</code>. </blockquote>

And a bit later: <blockquote> Having the zero-degree coefficient first, as done in <code>numpy.polynomial.polynomial.polyfit</code> is definitely more logical. I was under the impression that the only reason <code>numpy.polyfit</code> deviates from this is historical accident, which of course is nearly impossible to correct now since many programmes may depend on this behaviour. Maybe the easiest solution would be to point people to the "preferred" solution in <code>numpy.polyfit</code>? </blockquote>

From an earlier comment it's evident that the "historical accident" is the behaviour of MATLAB's <code>polyfit</code>, which puts the highest order first. Early NumPy kept this confusing convention (which it may have even inherited from a predecessor of the project), but later <code>numpy.polynomial.polynomial.polyfit</code> was implemented to Do It Right™. The crucial difference is that (unlike MATLAB) Python uses 0-based indices, in which case it's perfectly natural to have the zeroth order first. With this convention there's the beautiful property that item <code>k</code> corresponds to the term <code>x**k</code>.

Then there's a newer account in another issue from this year that tries to give a more coherent picture.
Quoting the historical recollection from the issue:
<blockquote>
<h3>History</h3>
(not necessarily in chronological order)
<ol>
<li>MATLAB had a function, <code>polyfit</code>, for fitting polynomials which made some weird design choices, like returning the coefficients highest-degree first.</li>
<li>numpy, in an attempt to support fugitives from MATLAB, created the function <code>numpy.polyfit</code> which aped that design choice</li>
<li>numpy implemented <code>numpy.ma.polyfit</code> for masked arrays, using <code>numpy.polyfit</code></li>
<li>In an attempt to fix the mistakes of history, numpy created the function <code>numpy.polynomial.polynomial.polyfit</code> with almost exactly the same signature, but with a more sensible coefficient ordering, and quietly preferred that people use that instead</li>
<li>People were confused by these two very similar functions (#7478); also the new function could not return a covariance matrix and it did not have a masked array counterpart</li>
<li>Powering on towards both API nirvana and worn-out keyboards, numpy introduced the <code>numpy.polynomial.polynomial.Polynomial</code> class, and documented in <code>numpy.polyfit</code> that that was the preferred way of fitting polynomials, although that also had no masked implementation and also did not return a covariance matrix</li>
</ol>
</blockquote>

The responses from devs on the two issues make it clear that <code>numpy.polyfit</code> is technical debt and that, as its documentation says, new code should use the <code>Polynomial</code> class. The documentation has improved a lot since 2016, in that there are now pointers from <code>numpy.polyfit</code> to <code>Polynomial</code>, but there's still a lot of ambiguity. Ideally both <code>polyfit</code> functions should explain their situation with respect to the other, and point users to the <code>Polynomial</code> class as the one obvious way to write new code.
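As for the bonus question, here is a minimal sketch of how the three APIs compare in practice. The sample data (`x`, `y`) and all variable names are made up for illustration; the data are exact values of 1 + 2x + 3x², so the expected coefficients are known in advance. One thing to be aware of, as far as I understand the class: `Polynomial.fit` works in a scaled domain internally, so you need `convert()` to get coefficients in terms of the unscaled `x`.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial import polynomial as P

# Hypothetical sample data: exact values of y = 1 + 2*x + 3*x**2
x = np.linspace(-1.0, 2.0, 20)
y = 1.0 + 2.0 * x + 3.0 * x**2

# Legacy function: coefficients come back highest degree first
legacy = np.polyfit(x, y, 2)        # approximately [3, 2, 1]

# Newer function: lowest degree first, so newer[k] multiplies x**k
newer = P.polyfit(x, y, 2)          # approximately [1, 2, 3]

# Class-based API: fit() is a classmethod that returns a Polynomial instance
fitted = Polynomial.fit(x, y, 2)

# The instance is callable, so evaluating the fit needs no coefficient juggling
y_hat = fitted(x)

# fitted.coef refers to the scaled domain used internally;
# convert() maps back to the standard domain, giving coefficients of x**k
coef = fitted.convert().coef        # approximately [1, 2, 3]
```

If you only need to evaluate the fitted polynomial, you can use `fitted` directly and never look at the coefficients. If you need the covariance matrix, note that per the history quoted above the newer APIs did not provide one, so `numpy.polyfit` may still be needed for that case.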

Should I use numpy.polyfit or numpy.polynomial.polyfit or numpy.polynomial.polynomial.Polynomial?

answered Sep 20 '22 08:09

Andras Deak -- Слава Україні

