Performance difference between scipy and numpy norm

I have always assumed scipy.linalg.norm() and numpy.linalg.norm() to be equivalent (the scipy version used not to accept an axis argument, but now it does). However, the following simple example shows significantly different performance. What is the reason for that?

In [1]: from scipy.linalg import norm as normsp
In [2]: from numpy.linalg import norm as normnp 
In [3]: import numpy as np
In [4]: a = np.random.random(size=(1000, 2000))

In [5]: %timeit normsp(a)
The slowest run took 5.69 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 2.85 ms per loop

In [6]: %timeit normnp(a)
The slowest run took 6.39 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 558 µs per loop

scipy version is 0.18.1, numpy is 1.11.1

P. Camilleri asked Oct 20 '16 11:10



1 Answer

Looking at the source code reveals that scipy has its own norm function, which wraps either numpy.linalg.norm or a BLAS function that is slower but handles floating point overflows better (see discussion on this PR).
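To illustrate what "handles floating point overflows better" can mean, here is a minimal sketch in plain NumPy. It is not the SciPy or BLAS code: naive_norm and scaled_norm are hypothetical helpers, and rescaling by the largest magnitude is only the usual idea behind overflow-safe norm routines, assumed here for illustration.

```python
import numpy as np

def naive_norm(v):
    # Squares every element first, so large entries overflow to inf.
    with np.errstate(over="ignore"):
        return np.sqrt(np.sum(v * v))

def scaled_norm(v):
    # Hypothetical overflow-safe variant: divide by the largest magnitude
    # before squaring, then multiply the scale back in at the end.
    scale = np.max(np.abs(v))
    if scale == 0:
        return 0.0
    return scale * np.sqrt(np.sum((v / scale) ** 2))

x = np.full(4, 1e200)      # 1e200 squared exceeds the float64 range
print(naive_norm(x))       # overflows
print(scaled_norm(x))      # stays finite
```

The extra pass over the data for rescaling is one reason such a routine can be slower than the direct computation.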

However, in the example that you give, it doesn't look like SciPy uses a BLAS function, so I do not think that is responsible for the time difference you see. But scipy does do some other checks before calling the numpy version of norm. In particular, the finiteness check a = np.asarray_chkfinite(a) is a likely suspect for the performance difference:

In [103]: %timeit normsp(a)
100 loops, best of 3: 5.1 ms per loop

In [104]: %timeit normnp(a)
1000 loops, best of 3: 744 µs per loop

In [105]: %timeit np.asarray_chkfinite(a)
100 loops, best of 3: 4.13 ms per loop

So it looks like np.asarray_chkfinite roughly accounts for the difference in time taken to evaluate the norms.
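The same comparison can be reproduced outside IPython with the timeit module. This is a rough sketch, assuming only NumPy is installed: checked_norm is a hypothetical stand-in for "validate, then delegate to numpy", not SciPy's actual function, and the timings will vary by machine.

```python
import timeit
import numpy as np

def checked_norm(a):
    # Hypothetical stand-in for the wrapper: check finiteness, then delegate.
    a = np.asarray_chkfinite(a)  # raises ValueError if a contains NaN or inf
    return np.linalg.norm(a)

a = np.random.random(size=(1000, 2000))

candidates = [
    ("np.linalg.norm", lambda: np.linalg.norm(a)),
    ("np.asarray_chkfinite", lambda: np.asarray_chkfinite(a)),
    ("checked_norm", lambda: checked_norm(a)),
]

for label, fn in candidates:
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{label}: {best * 1e3:.3f} ms per call")
```

If the installed SciPy version's norm accepts a check_finite argument (an assumption to verify against its documentation), passing check_finite=False should skip this check, at the cost of undefined behaviour on inputs containing NaN or inf.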

Vlas Sokolov answered Oct 02 '22 07:10