Efficient computation of the least-squares algorithm in NumPy

I need to solve a large set of linear systems in the least-squares sense. I am having trouble understanding the difference in computational efficiency between np.linalg.lstsq(a, b), np.dot(np.linalg.pinv(a), b), and the direct mathematical implementation (the normal equations).

I use the following matrices:

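import numpy as np

h = np.random.random((50000, 100))
a = h[:, :-1].copy()   # design matrix: first 99 columns
b = -h[:, -1].copy()   # right-hand side: negated last column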

and the timings of the three approaches are:


%%timeit
# mathematical implementation (normal equations)
np.dot(np.dot(np.linalg.inv(np.dot(a.T, a)), a.T), b)

10 loops, best of 3: 36.3 ms per loop


%%timeit
# numpy.linalg.lstsq implementation
np.linalg.lstsq(a, b)[0]

10 loops, best of 3: 103 ms per loop


%%timeit
# pseudoinverse implementation
np.dot(np.linalg.pinv(a), b)

1 loop, best of 3: 216 ms per loop
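
As a sanity check before comparing timings, the following is a minimal sketch confirming that the three expressions return the same solution. The smaller matrix shape, the fixed seed, and the explicit rcond=None argument are assumptions made to keep the check quick and reproducible; they are not part of the timings above.

```python
import numpy as np

# Assumption: a smaller problem and a fixed seed, chosen only for a quick check.
rng = np.random.default_rng(0)
h = rng.random((2000, 20))
a = h[:, :-1].copy()
b = -h[:, -1].copy()

# Normal equations with an explicit inverse, as in the timed expression.
x_normal = np.dot(np.dot(np.linalg.inv(np.dot(a.T, a)), a.T), b)

# Library least-squares solver; rcond=None is passed explicitly (an assumption,
# the timed call used the default).
x_lstsq = np.linalg.lstsq(a, b, rcond=None)[0]

# Pseudoinverse applied to the right-hand side.
x_pinv = np.dot(np.linalg.pinv(a), b)

# Report whether the three solutions agree to floating-point tolerance.
print(np.allclose(x_normal, x_lstsq), np.allclose(x_normal, x_pinv))
```

If the solutions agree, the timing differences come from how each route computes the answer rather than from what it computes.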


Why is there such a large difference in run time between these three approaches?
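
One angle on this, offered as a sketch rather than a definitive answer: the normal-equations route only has to factorize the small square matrix a.T a (99 x 99 for the matrices above), while np.linalg.lstsq and np.linalg.pinv are documented as SVD-based and work on the full tall matrix. The trade-off is that forming a.T a squares the condition number. The example below assumes a smaller random problem and a fixed seed, and swaps np.linalg.solve in for the explicit inverse; none of that is in the original timings.

```python
import numpy as np

# Assumption: small random problem with a fixed seed, for illustration only.
rng = np.random.default_rng(0)
a = rng.random((2000, 19))
b = -rng.random(2000)

# The Gram matrix is only n x n, however many rows `a` has.
gram = a.T @ a

# Solve (a.T a) x = a.T b directly instead of forming an explicit inverse.
x_solve = np.linalg.solve(gram, a.T @ b)

# In the 2-norm, cond(a.T a) equals cond(a) squared.
cond_a = np.linalg.cond(a)
cond_gram = np.linalg.cond(gram)
print(gram.shape, cond_a, cond_gram)
```

For a well-conditioned random matrix like this one the squared condition number does no harm, but for nearly collinear columns the normal-equations result can lose accuracy where the SVD-based routines do not.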