Suppose we want to find the optimal parameters theta for a linear regression model using the normal equation:
theta = inv(X^T * X) * X^T * y
One step is to calculate inv(X^T*X). For this, numpy provides np.linalg.inv() and np.linalg.pinv().
However, the two lead to different results:
import numpy as np
X=np.matrix([[1,2104,5,1,45],[1,1416,3,2,40],[1,1534,3,2,30],[1,852,2,1,36]])
y=np.matrix([[460],[232],[315],[178]])
XT=X.T
XTX=XT@X
pinv=np.linalg.pinv(XTX)
theta_pinv=(pinv@XT)@y
print(theta_pinv)
[[188.40031946]
[ 0.3866255 ]
[-56.13824955]
[-92.9672536 ]
[ -3.73781915]]
inv=np.linalg.inv(XTX)
theta_inv=(inv@XT)@y
print(theta_inv)
[[-648.7890625 ]
[ 0.79418945]
[-110.09375 ]
[ -74.0703125 ]
[ -3.69091797]]
The first output, i.e. the output of pinv, is the correct one, and pinv is additionally recommended in the numpy.linalg.pinv() docs. But why is that, and what are the differences, pros and cons between inv() and pinv()?
The inv() function returns the inverse of the matrix. The pinv() function is useful when your matrix is non-invertible (a singular matrix, i.e. its determinant is 0). The inv() function is not useful in that case.
pinv computes the (Moore-Penrose) pseudo-inverse of a matrix. It calculates the generalized inverse of a matrix using its singular-value decomposition (SVD), including all large singular values.
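As a rough illustration of the SVD-based description above, the following sketch builds a pseudo-inverse by hand and compares it with np.linalg.pinv. The helper name pinv_via_svd and the cutoff 1e-15 are assumptions made for this example; this is not NumPy's actual implementation.

```python
import numpy as np

def pinv_via_svd(a, rcond=1e-15):
    """Hypothetical helper: pseudo-inverse from the SVD, dropping tiny singular values."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    cutoff = rcond * s.max()          # assumed relative cutoff for "large" singular values
    keep = s > cutoff
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]       # invert only the singular values that are kept
    # pinv(a) = V * diag(s_inv) * U^T
    return vt.T @ (s_inv[:, None] * u.T)

# Rank-1 matrix: every row is a multiple of [1, 2]
a = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
p = pinv_via_svd(a)

print(np.allclose(p, np.linalg.pinv(a)))   # agreement with NumPy's pinv
print(np.allclose(a @ p @ a, a))           # one of the Moore-Penrose conditions
```

Singular values below the cutoff are treated as zero instead of being inverted, which is why the pseudo-inverse stays well behaved on rank-deficient input.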
We use the numpy.linalg.inv() function to calculate the inverse of a matrix. The inverse of a matrix is such that if it is multiplied by the original matrix, the result is the identity matrix.
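A minimal check of that property, using a small invertible matrix chosen only for illustration:

```python
import numpy as np

# Invertible 2x2 matrix (determinant is 4*6 - 7*2 = 10)
A = np.array([[4.0, 7.0], [2.0, 6.0]])
A_inv = np.linalg.inv(A)

# Up to floating point rounding, A times its inverse is the identity matrix
print(np.allclose(A @ A_inv, np.eye(2)))
print(np.allclose(A_inv @ A, np.eye(2)))
```

The comparison uses np.allclose rather than exact equality because the product is only the identity up to rounding.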
The numpy.linalg.solve() function is used to solve a linear matrix equation, or a system of linear scalar equations. It calculates the exact x of the matrix equation ax=b, where a and b are given matrices; a must be square and of full rank.
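For example, a small system with made-up coefficients (3x + y = 9 and x + 2y = 8) can be solved like this:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system
#   3x +  y = 9
#    x + 2y = 8
a = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(a, b)
print(x)                        # solution vector [x, y]
print(np.allclose(a @ x, b))    # the solution reproduces b
```

Note that solve() never forms the inverse of a explicitly.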
If the determinant of the matrix is zero, it does not have an inverse and the inv function will not work. Such a matrix is called singular.
But pinv will. This is because pinv returns the inverse of your matrix when it is available and the pseudo inverse when it isn't.
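A small sketch of this difference, using a deliberately singular 2x2 matrix picked for illustration (its second row is twice the first):

```python
import numpy as np

singular = np.array([[1.0, 2.0], [2.0, 4.0]])  # second row = 2 * first row

# inv() raises LinAlgError when the factorization hits an exactly zero pivot
try:
    np.linalg.inv(singular)
    inv_failed = False
except np.linalg.LinAlgError:
    inv_failed = True

# pinv() still returns a matrix: the Moore-Penrose pseudo-inverse
pseudo = np.linalg.pinv(singular)

print(inv_failed)
print(np.allclose(singular @ pseudo @ singular, singular))
```

For matrices that are singular only up to rounding, inv() may not raise at all, so the exception should not be relied on as a singularity test.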
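The different results of the functions are because of rounding errors in floating point arithmetic. X has 4 rows and 5 columns, so the 5x5 matrix X^T*X has rank at most 4 and is singular. Rounding can keep inv() from detecting this, in which case it returns a numerically meaningless result instead of raising an error.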
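One way to check this for the data from the question is to look at the shape and rank involved. This is only a diagnostic sketch; plain arrays are used here instead of np.matrix.

```python
import numpy as np

X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
XTX = X.T @ X

n_params = X.shape[1]                 # 5 parameters to estimate
rank_X = np.linalg.matrix_rank(X)     # at most 4, since X has only 4 rows

print(XTX.shape)            # the matrix that inv() is asked to invert
print(rank_X, n_params)     # rank below the number of columns means X^T X is singular
print(np.linalg.cond(XTX))  # very large (or infinite) condition number
```

With fewer samples than parameters, X^T*X cannot have full rank, so any result from inv() on it is an artifact of rounding.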
You can read more about how pseudo inverse works here
inv and pinv are used to compute the (pseudo)-inverse as a standalone matrix, not to actually use them in the computations.
For such linear system solutions the proper tool to use is numpy.linalg.lstsq (or from scipy) if you have a non-invertible coefficient matrix, or numpy.linalg.solve (or from scipy) for invertible matrices.
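Applied to the data from the question, a sketch of the lstsq approach could look like this. Plain arrays replace np.matrix, and rcond=None is passed to select NumPy's default cutoff; both are choices made for this example.

```python
import numpy as np

X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])

# Least-squares solution of X @ theta = y, without forming X^T X or any inverse.
# For an underdetermined system, lstsq returns the minimum-norm solution.
theta, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(theta)
print(np.allclose(X @ theta, y))                   # the fit reproduces y
print(np.allclose(theta, np.linalg.pinv(X) @ y))   # same solution as pinv(X) @ y
```

Working on X directly avoids squaring the condition number, which is what happens when X^T*X is formed first.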