I want to calculate the least squares estimate for given data.
There are a few ways to do this, one is to use numpy's least squares:
import numpy as np
np.linalg.lstsq(X, y, rcond=None)[0]
where X is a matrix and y is a vector of compatible dimensions (both float64). The second way is to compute the result directly from the normal-equations formula:
import numpy
numpy.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
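As a quick sanity check, here is a minimal runnable sketch of both approaches on hypothetical, well-conditioned toy data (the values and the intercept-plus-slope design matrix are my own example, not from the question); on data like this the two methods agree:

```python
import numpy as np

# Hypothetical toy data: fit y = 1 + 2*x with an intercept column.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]
y = 2.0 * x + 1.0

# Method 1: numpy's least-squares solver.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

# Method 2: the normal-equations formula (X^T X)^(-1) X^T y.
beta_normal = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(beta_lstsq)   # both should be close to [1, 2]
print(beta_normal)
```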
My problem: there are cases where the two formulas give radically different results (in other cases they agree). With one formula the coefficients sometimes grow extremely large, while the other stays well behaved. The formulas are mathematically the same, so why do the results diverge so much? Is this some kind of rounding error, and if so, how do I minimize it?
While those two formulas are mathematically equivalent, they are not numerically equivalent! There are better ways to solve a system of linear equations Ax = b than multiplying both sides by A^(-1), such as Gaussian elimination. numpy.linalg.lstsq uses these (and more sophisticated) methods to solve the underlying linear system, and it also handles many corner cases. So use it when you can.
Matrix inversion is very numerically unstable. Don't do it unless you have to.