Let's say I want to do an element-wise sum of a list of numpy arrays:
tosum = [rand(100,100) for n in range(10)]
I've been looking for the best way to do this. It seems like numpy.sum is awful:
timeit.timeit('sum(array(tosum), axis=0)',
              setup='from numpy import sum, array; from __main__ import tosum',
              number=10000)
75.02289700508118
timeit.timeit('sum(tosum, axis=0)',
              setup='from numpy import sum; from __main__ import tosum',
              number=10000)
78.99106407165527
Reduce is much faster (to the tune of nearly two orders of magnitude):
timeit.timeit('reduce(add, tosum)',
              setup='from numpy import add; from functools import reduce; from __main__ import tosum',
              number=10000)
1.131795883178711
It looks like reduce even has a meaningful lead over the non-numpy sum (note that these are for 1e6 runs rather than 1e4 for the above times):
timeit.timeit('reduce(add, tosum)',
              setup='from numpy import add; from functools import reduce; from __main__ import tosum',
              number=1000000)
109.98814797401428
timeit.timeit('sum(tosum)',
              setup='from __main__ import tosum',
              number=1000000)
125.52461504936218
Are there other methods I should try? Can anyone explain the rankings?
Edit
numpy.sum is definitely faster if the list is turned into a numpy array first:
tosum2 = array(tosum)
timeit.timeit('sum(tosum2, axis=0)',
              setup='from numpy import sum; from __main__ import tosum2',
              number=10000)
1.1545608043670654
However, I'm only interested in doing the sum once, so converting the list to a numpy array first would still incur a real performance penalty.
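For a one-shot sum, the trade-off can be checked directly. A minimal sketch, assuming numpy and Python 3 (where reduce lives in functools, and rand is numpy.random.rand):

```python
import numpy as np
from functools import reduce  # reduce is not a builtin in Python 3
from operator import add

rng = np.random.default_rng(0)
tosum = [rng.random((100, 100)) for _ in range(10)]

# Pairwise reduction over the Python list: no conversion overhead
via_reduce = reduce(add, tosum)

# numpy path: stack the list into one (10, 100, 100) array, then sum axis 0;
# np.asarray(tosum) is where the one-time conversion cost is paid
via_numpy = np.asarray(tosum).sum(axis=0)

print(np.allclose(via_reduce, via_numpy))  # the two methods agree
```

Whichever is faster for a single call depends on the conversion cost, which is exactly the point at issue here.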
Numpy sum is not awful; you are simply using numpy in the wrong way. You won't be able to make use of numpy's speed advantage if you mix plain Python functions (including reduce!), loops, and lists with numpy arrays. If you want your code to be fast, you must use only numpy.
Since you did not specify any imports in your code snippet, I am not sure what the function rand is doing or where it comes from, so I just assumed that tosum should represent a list of 10 matrices of random numbers. The following code snippet shows that numpy is definitely not as slow as you claim it to be:
import numpy as np
import timeit

def test_np_sum(n=10):
    # n is the number of matrices to sum up element-wise
    tosum = np.random.randint(0, 100, size=(n, 10, 10))  # n 10x10 matrices, shape = (n, 10, 10)
    summed = np.sum(tosum, axis=0)                       # shape = (10, 10)
    return summed
And then testing it:
timeit.timeit('test_np_sum()', number=10000, setup='from __main__ import test_np_sum')
0.8418250999999941
The following is competitive with reduce, and is faster if the tosum list is long enough. However, it's not a lot faster, and it is more code. (reduce(add, tosum) sure is pretty.)
def loop_inplace_sum(arrlist):
    # assumes len(arrlist) > 0
    sum = arrlist[0].copy()
    for a in arrlist[1:]:
        sum += a
    return sum
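As a quick sanity check, the in-place loop gives the same result as reduce(add, ...). A self-contained sketch, assuming numpy and Python 3 (reduce imported from functools):

```python
import numpy as np
from functools import reduce
from operator import add

def loop_inplace_sum(arrlist):
    # assumes len(arrlist) > 0; accumulate into a copy of the first array
    total = arrlist[0].copy()
    for a in arrlist[1:]:
        total += a
    return total

tosum = [np.random.rand(100, 100) for _ in range(10)]
assert np.allclose(loop_inplace_sum(tosum), reduce(add, tosum))
```

The .copy() on the first element matters: without it, `total += a` would mutate arrlist[0] in place.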
Timing for the original tosum. reduce(add, tosum) is faster:
In [128]: tosum = [rand(100,100) for n in range(10)]
In [129]: %timeit reduce(add, tosum)
10000 loops, best of 3: 73.5 µs per loop
In [130]: %timeit loop_inplace_sum(tosum)
10000 loops, best of 3: 78 µs per loop
Timing for a much longer list of arrays. Now loop_inplace_sum is faster.
In [131]: tosum = [rand(100,100) for n in range(500)]
In [132]: %timeit reduce(add, tosum)
100 loops, best of 3: 5.09 ms per loop
In [133]: %timeit loop_inplace_sum(tosum)
100 loops, best of 3: 4.4 ms per loop