
Sum of Squares - np.inner vs squaring first, then summing

Tags: python, numpy

I was surprised that calling np.inner to compute a sum of squares was about 5x faster than calling np.sum on a pre-computed array of squares:

[image: sum of squares code]

Any insights into this behavior? I'm actually interested in a very fast implementation of a sum of squares, so those thoughts are welcome, too.

bcf asked Jun 16 '16 15:06



1 Answer

To check in which modules np.inner and np.sum are implemented, I type:

>>> np.inner.__module__
'numpy.core.multiarray'
>>> np.sum.__module__
'numpy.core.fromnumeric'
>>> np.__file__
'/Users/uweschmitt/venv_so/lib/python3.5/site-packages/numpy/__init__.py'

If you inspect the actual files, you can see that numpy.core.multiarray is a pure C module, whereas np.sum in numpy.core.fromnumeric first does some checks and conversions in Python, then calls a second Python function, and only then reaches the pure C implementation that does the actual summation.

I suspect that this overhead from the Python interpreter explains the observed timing differences.
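One way to look at this per-call overhead is to time both functions on a very small array, where the arithmetic itself costs almost nothing. The following is only a minimal sketch: the helper name time_call, the array size of 10, and the repeat counts are my own choices for illustration, and the absolute numbers will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np


def time_call(func, *args, repeat=5, number=10000):
    """Return the best observed time per call, in seconds, for func(*args).

    time_call is a hypothetical helper for this sketch, not a NumPy API.
    """
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# A tiny array, so that fixed per-call costs dominate the measurement.
small = np.random.random(10)
squares = small * small  # pre-computed squares, as in the question

t_inner = time_call(np.inner, small, small)
t_sum = time_call(np.sum, squares)

print("np.inner: {:.2e} s per call".format(t_inner))
print("np.sum:   {:.2e} s per call".format(t_sum))
```

If the suspicion is right, np.sum should come out noticeably slower per call here, even though both compute the same value.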

To test my assumption, I run the timing with a larger array and get:

In [8]: a = np.random.random(1000000)
In [9]: %timeit np.inner(a, a)
1000 loops, best of 3: 673 µs per loop
In [10]: %timeit np.sum(a)
1000 loops, best of 3: 584 µs per loop

Now the run times are quite similar and change a little if you repeat the statements: sometimes np.sum wins, sometimes np.inner.

For the big array the actual work of np.sum is done in C and the constant time overhead from the Python interpreter is negligible.
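As for a fast sum of squares: there are several equivalent ways to write it in NumPy, and the sketch below just collects them so they can be compared on your own data. The function name sum_of_squares_variants is made up for this example. I make no claim about which variant is fastest; that likely depends on the array size, the dtype and the NumPy build, so it is worth timing them yourself.

```python
import numpy as np


def sum_of_squares_variants(a):
    """Compute the sum of squares of a 1-D array in several equivalent ways.

    Hypothetical helper for illustration; returns a dict of plain floats.
    """
    a = np.asarray(a, dtype=float)
    return {
        # Inner product of the vector with itself.
        "inner": float(np.inner(a, a)),
        # For 1-D inputs np.dot computes the same inner product.
        "dot": float(np.dot(a, a)),
        # Explicit index notation: multiply elementwise, sum over i.
        "einsum": float(np.einsum("i,i->", a, a)),
        # Builds a temporary array of squares first, then sums it.
        "sum": float(np.sum(a * a)),
    }


results = sum_of_squares_variants([1.0, 2.0, 3.0])
print(results)
```

Note that the last variant allocates a temporary array for a * a, while the first three do not need to.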

rocksportrocker answered Oct 16 '22 04:10