
What is the difference between np.sum and np.add.reduce?

Tags: python, numpy

The docs are quite explicit:

For example, add.reduce() is equivalent to sum().

Yet the performance of the two seems to be quite different: for relatively small array sizes, add.reduce is about twice as fast.

$ python -mtimeit -s"import numpy as np; a = np.random.rand(100); summ=np.sum" "summ(a)"
100000 loops, best of 3: 2.11 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(100); summ=np.add.reduce" "summ(a)"
1000000 loops, best of 3: 0.81 usec per loop

$ python -mtimeit -s"import numpy as np; a = np.random.rand(1000); summ=np.sum" "summ(a)"
100000 loops, best of 3: 2.78 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(1000); summ=np.add.reduce" "summ(a)"
1000000 loops, best of 3: 1.5 usec per loop

For larger array sizes, the difference seems to go away:

$ python -mtimeit -s"import numpy as np; a = np.random.rand(10000); summ=np.sum" "summ(a)"
100000 loops, best of 3: 10.7 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(10000); summ=np.add.reduce" "summ(a)"
100000 loops, best of 3: 9.2 usec per loop

asked May 07 '13 13:05 by ev-br



3 Answers

Short answer: when the argument is a numpy array, np.sum ultimately calls add.reduce to do the work. The overhead of handling its argument and dispatching to add.reduce is why np.sum is slower.

Longer answer: np.sum is defined in numpy/core/fromnumeric.py. In the definition of np.sum, you'll see that the work is passed on to _methods._sum. That function, in _methods.py, is simply:

def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
    return um.add.reduce(a, axis=axis, dtype=dtype,
                            out=out, keepdims=keepdims)

um is the module where the add ufunc is defined.
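
The dispatch overhead described above can be measured from within Python as well as from the shell. The following is a minimal sketch, not taken from the original answer: the array size, the loop count `n`, and the repeat count are arbitrary choices, and the absolute timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.random.rand(100)

# For a 1-D array both calls reduce over the only axis, so the results agree.
assert np.isclose(np.sum(a), np.add.reduce(a))

# n and repeat are arbitrary values chosen for this sketch.
n = 20_000
t_sum = min(timeit.repeat(lambda: np.sum(a), number=n, repeat=3)) / n
t_reduce = min(timeit.repeat(lambda: np.add.reduce(a), number=n, repeat=3)) / n

print(f"np.sum:        {t_sum * 1e6:.2f} usec per call")
print(f"np.add.reduce: {t_reduce * 1e6:.2f} usec per call")
```

Both measurements include the cost of calling the lambda, so only the gap between the two numbers is meaningful. That gap should stay roughly constant as the array grows, which would explain why it stops mattering for large arrays.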

answered Oct 01 '22 04:10 by Warren Weckesser


There is actually one difference that might bite you if you were to blindly refactor from one to the other:

>>> import numpy as np
>>> a = np.arange(4).reshape(2, 2)
>>> np.sum(a)
6
>>> np.add.reduce(a)
array([2, 4])

The axis default values are different: np.sum defaults to axis=None (reduce over all axes), while np.add.reduce defaults to axis=0.
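
If you do refactor from one to the other, passing the axis explicitly should avoid the surprise. A small sketch using the same 2x2 array; the variable names are only illustrative:

```python
import numpy as np

a = np.arange(4).reshape(2, 2)

# np.sum reduces over every axis unless told otherwise.
total = np.sum(a)

# np.add.reduce reduces over the first axis only unless told otherwise.
columns = np.add.reduce(a)

# Passing axis=None explicitly makes add.reduce sum over all axes too.
total_again = np.add.reduce(a, axis=None)

print(total, columns, total_again)
```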

answered Oct 01 '22 02:10 by Paul Panzer


To answer the question in the title simply: when working with matrices, there is an important distinction between the two functions.

np.sum (without specifying the axis) returns the sum of all elements in the matrix.

np.add.reduce (without specifying the axis) returns the sum along axis=0. That is, add.reduce(a) is equivalent to sum(a, axis=0).

However, both return the same result if you specify the axis. I'm posting this as an answer because I don't have enough rep to comment.
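
The claim that the two agree once the axis is given can be checked directly. This is a quick sketch on an arbitrary 2x3 array, looping over each axis value:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# With the same explicit axis, both functions should give identical results.
for axis in (None, 0, 1):
    via_sum = np.sum(a, axis=axis)
    via_reduce = np.add.reduce(a, axis=axis)
    assert np.array_equal(via_sum, via_reduce)
    print(axis, via_sum)
```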

answered Oct 01 '22 04:10 by Shrinidhi H R