What is the difference between np.sum and np.add.reduce?
The docs are quite explicit:
For example, add.reduce() is equivalent to sum().
Yet the performance of the two seems to be quite different: for relatively small array sizes, add.reduce is about twice as fast.
$ python -mtimeit -s"import numpy as np; a = np.random.rand(100); summ=np.sum" "summ(a)"
100000 loops, best of 3: 2.11 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(100); summ=np.add.reduce" "summ(a)"
1000000 loops, best of 3: 0.81 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(1000); summ=np.sum" "summ(a)"
100000 loops, best of 3: 2.78 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(1000); summ=np.add.reduce" "summ(a)"
1000000 loops, best of 3: 1.5 usec per loop
For larger array sizes, the difference seems to go away:
$ python -mtimeit -s"import numpy as np; a = np.random.rand(10000); summ=np.sum" "summ(a)"
100000 loops, best of 3: 10.7 usec per loop
$ python -mtimeit -s"import numpy as np; a = np.random.rand(10000); summ=np.add.reduce" "summ(a)"
100000 loops, best of 3: 9.2 usec per loop
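The same comparison can also be run from inside Python. Below is a minimal sketch using the standard-library timeit module; the helper name time_both and the loop counts are illustrative choices, not part of the measurements above, and the absolute numbers will vary with machine and NumPy version.

```python
import timeit

import numpy as np


def time_both(n, number=10000, repeat=3):
    """Return the best per-call time in microseconds for each function.

    `time_both` is a hypothetical helper written for this illustration.
    """
    a = np.random.rand(n)
    results = {}
    for name, func in (("np.sum", np.sum), ("np.add.reduce", np.add.reduce)):
        # The timer is used inside the same loop iteration, so `func`
        # refers to the function currently being measured.
        timer = timeit.Timer(lambda: func(a))
        best = min(timer.repeat(repeat=repeat, number=number))
        results[name] = best / number * 1e6
    return results


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        timings = time_both(n)
        print(n, {k: round(v, 2) for k, v in timings.items()})
```

For a 1-D array both calls reduce over the only axis, so the comparison measures call overhead rather than different arithmetic.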
This is an extension to the answer posted above by Akavall. From that answer you can see that np.sum performs faster for np.array objects, whereas Python's built-in sum performs faster for list objects.
The numpy.sum() function is available in the NumPy package of Python. It computes the sum of all elements, the sum of each row, or the sum of each column of a given array. Essentially, it takes the elements of an ndarray and adds them together.
For example, add.reduce() is equivalent to sum(). Its parameters include the array to act on and the axis or axes along which the reduction is performed. The default (axis = 0) is to perform a reduction over the first dimension of the input array.
When np.sum receives an array of booleans as its argument, it sums the elements (counting True as 1 and False as 0) and returns the result. For instance, np.sum([True, True, False]) returns 2. Hope this helps.
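As a small illustration of the boolean case, here is a sketch; the array contents and variable names are made up for the example.

```python
import numpy as np

flags = np.array([True, True, False])

# np.sum treats True as 1 and False as 0, so it counts the True entries.
true_count = np.sum(flags)
print(true_count)  # 2

# The same idea counts the elements that satisfy a condition.
values = np.array([3, 8, 1, 9])
above_two = np.sum(values > 2)
print(above_two)  # 3
```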
Short answer: when the argument is a numpy array, np.sum ultimately calls add.reduce to do the work. The overhead of handling its argument and dispatching to add.reduce is why np.sum is slower.
Longer answer:
np.sum is defined in numpy/core/fromnumeric.py. In the definition of np.sum, you'll see that the work is passed on to _methods._sum. That function, in _methods.py, is simply:
def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
    return um.add.reduce(a, axis=axis, dtype=dtype,
                         out=out, keepdims=keepdims)
um is the module where the add ufunc is defined.
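To make the dispatch concrete, here is a rough sketch of a wrapper that forwards to the ufunc in the same way. The name sum_via_reduce is hypothetical and this is not NumPy's actual code; it only mirrors the argument forwarding shown above, with axis defaulting to None.

```python
import numpy as np


def sum_via_reduce(a, axis=None, dtype=None, out=None, keepdims=False):
    # Hypothetical stand-in for the wrapper described above: convert the
    # input to an array, then forward every argument to np.add.reduce.
    return np.add.reduce(np.asarray(a), axis=axis, dtype=dtype,
                         out=out, keepdims=keepdims)


a = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]
print(sum_via_reduce(a))          # 15
print(sum_via_reduce(a, axis=1))  # [ 3 12]
```

Each extra Python-level call and argument check costs a fixed amount of time per call, which matters for small arrays and is swamped by the reduction itself for large ones.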
There is actually one difference that might bite you if you were to blindly refactor from one to the other:
>>> import numpy as np
>>> a = np.arange(4).reshape(2, 2)
>>>
>>> np.sum(a)
6
>>> np.add.reduce(a)
array([2, 4])
>>>
The axis default values are different!
To answer the question in the title simply: when working with matrices, you will find an important distinction between the two functions:
np.sum (without specifying the axis) will return the sum of all elements in the matrix.
np.add.reduce (without specifying the axis) will return the sum along axis=0.
That is, add.reduce(a) is equivalent to sum(a, axis=0).
However, both will return the same result if you specify the axis. I'm posting this as an answer because I don't have enough rep to comment.
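A short check of that claim, as a sketch; the 2x2 array is the same one used in the earlier interactive session.

```python
import numpy as np

a = np.arange(4).reshape(2, 2)  # [[0, 1], [2, 3]]

# Defaults differ: np.sum reduces over all axes, add.reduce over axis 0.
total = np.sum(a)
column_sums = np.add.reduce(a)
print(total)        # 6
print(column_sums)  # [2 4]

# With an explicit axis (including None), the two agree.
for axis in (None, 0, 1):
    assert np.array_equal(np.sum(a, axis=axis), np.add.reduce(a, axis=axis))
```

When refactoring from one to the other, passing axis explicitly avoids depending on either default.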