I have always assumed scipy.linalg.norm() and numpy.linalg.norm() to be equivalent (the SciPy version used not to accept an axis argument, but now it does). However, the following simple example yields significantly different performance: what is the reason behind that?
In [1]: from scipy.linalg import norm as normsp
In [2]: from numpy.linalg import norm as normnp
In [3]: import numpy as np
In [4]: a = np.random.random(size=(1000, 2000))
In [5]: %timeit normsp(a)
The slowest run took 5.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 2.85 ms per loop
In [6]: %timeit normnp(a)
The slowest run took 6.39 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 558 µs per loop
scipy version is 0.18.1, numpy is 1.11.1
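For completeness, here is a standalone version of the benchmark above using the timeit module instead of IPython magics (the repeat count is an arbitrary choice, not taken from the original timings):

import timeit
import numpy as np
from scipy.linalg import norm as normsp
from numpy.linalg import norm as normnp

a = np.random.random(size=(1000, 2000))

# Time each implementation over a fixed number of calls.
n_calls = 100
t_scipy = timeit.timeit(lambda: normsp(a), number=n_calls) / n_calls
t_numpy = timeit.timeit(lambda: normnp(a), number=n_calls) / n_calls

print("scipy.linalg.norm: %.3f ms per call" % (t_scipy * 1e3))
print("numpy.linalg.norm: %.3f ms per call" % (t_numpy * 1e3))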
Looking at the source code reveals that SciPy has its own norm function, which wraps either numpy.linalg.norm or a BLAS function that is slower but handles floating point overflows better (see the discussion on this PR). However, in the example you give it doesn't look like SciPy uses a BLAS function, so I don't think that is responsible for the time difference you see. SciPy does, however, perform some other checks before calling the NumPy version of norm. In particular, the finiteness check a = np.asarray_chkfinite(a) is a prime suspect for the performance difference:
In [103]: %timeit normsp(a)
100 loops, best of 3: 5.1 ms per loop
In [104]: %timeit normnp(a)
1000 loops, best of 3: 744 µs per loop
In [105]: %timeit np.asarray_chkfinite(a)
100 loops, best of 3: 4.13 ms per loop
So it looks like np.asarray_chkfinite roughly accounts for the difference in time taken to evaluate the norms.
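To see why that check is costly, note that np.asarray_chkfinite scans every element of the input to verify it is finite (raising ValueError otherwise) before the norm is even computed, so its cost grows with the array size. A minimal sketch of that behaviour:

import numpy as np

a = np.random.random(size=(1000, 2000))

# asarray_chkfinite makes a full pass over the data; for a well-formed
# float array it simply returns the array.
np.asarray_chkfinite(a)

# With a single NaN it raises instead:
a_bad = a.copy()
a_bad[0, 0] = np.nan
try:
    np.asarray_chkfinite(a_bad)
except ValueError as err:
    print("rejected non-finite input:", err)

If you already know your data is finite, calling numpy.linalg.norm directly avoids the extra pass; newer SciPy releases also expose a check_finite argument on scipy.linalg.norm to skip it, though you should verify that your version supports it.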