Python/numpy: Most efficient way to sum n elements of an array, so that each output element is the sum of the previous n input elements?

I want to write a function that takes a flattened array as input and returns an array of equal length containing the sums of the previous n elements from the input array, with the initial n - 1 elements of the output array set to NaN.

For example, if the input array is [2, 4, 3, 7, 6, 1, 9, 4, 6, 5] (ten elements) and n = 3, then the resulting array should be [NaN, NaN, 9, 14, 16, 14, 16, 14, 19, 15].

One way I've come up with to do this:

import numpy as np

def sum_n_values(flat_array, n):
    sums = np.full(flat_array.shape, np.NaN)
    for i in range(n - 1, flat_array.shape[0]):
        sums[i] = np.sum(flat_array[i - n + 1:i + 1])
    return sums

Is there a better/more efficient/more "Pythonic" way to do this?

Thanks in advance for your help.

James Adams asked Dec 30 '15



3 Answers

You can make use of np.cumsum and take the difference between the cumulative-sum array and a shifted version of it:

n = 3
arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
sum_arr = arr.cumsum()
shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0],  sum_arr[:-n]])
sum_arr
=> array([ 2,  6,  9, 16, 22, 23, 32, 36, 42, 47])
shifted_sum_arr
=> array([ nan,  nan,   0.,   2.,   6.,   9.,  16.,  22.,  23.,  32.])
sum_arr - shifted_sum_arr
=> array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

IMO, this is a more NumPy-ish way to do it, mainly because it avoids the Python-level loop.
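The same steps can be packaged as a function and checked against the example from the question (a minimal sketch; the function name is illustrative):

```python
import numpy as np

def rolling_sum_cumsum(flat_array, n):
    # Cumulative sum; subtracting a shifted copy gives each window's sum.
    sum_arr = flat_array.cumsum()
    shifted = np.concatenate([[np.nan] * (n - 1), [0], sum_arr[:-n]])
    return sum_arr - shifted

arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
out = rolling_sum_cumsum(arr, 3)
```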


Timings

def cumsum_app(flat_array, n):
    sum_arr = flat_array.cumsum()
    shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0],  sum_arr[:-n]])
    return sum_arr - shifted_sum_arr

flat_array = np.random.randint(0,9,(100000))
%timeit cumsum_app(flat_array,10)
1000 loops, best of 3: 985 us per loop
%timeit cumsum_app(flat_array,100)
1000 loops, best of 3: 963 us per loop
shx2 answered Sep 28 '22


You are basically performing 1D convolution there, so you can use np.convolve, like so -

# Get the valid sliding summations with 1D convolution
vals = np.convolve(flat_array,np.ones(n),mode='valid')

# Pad with NaNs at the start if needed  
out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan))

Sample run -

In [110]: flat_array
Out[110]: array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])

In [111]: n = 3

In [112]: vals = np.convolve(flat_array,np.ones(n),mode='valid')
     ...: out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan))
     ...: 

In [113]: vals
Out[113]: array([  9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

In [114]: out
Out[114]: array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

For 1D convolution, one can also use Scipy's implementation. The runtimes with the Scipy version seemed better for large window sizes, as the runtime tests listed next investigate. The Scipy version for getting vals would be -

from scipy import signal
vals = signal.convolve(flat_array,np.ones(n),mode='valid')

The NaN-padding operation could be replaced by np.hstack: np.hstack(([np.nan]*(n-1), vals)), for better performance.
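Putting the convolution and the hstack padding together, a self-contained sketch (the function name is illustrative) would be:

```python
import numpy as np

def rolling_sum_convolve(flat_array, n):
    # mode='valid' keeps only windows that fit fully inside the array.
    vals = np.convolve(flat_array, np.ones(n), mode='valid')
    # Prepend n-1 NaNs so the output length matches the input.
    return np.hstack(([np.nan] * (n - 1), vals))

arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
out = rolling_sum_convolve(arr, 3)
```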


Runtime tests -

In [238]: def original_app(flat_array,n):
     ...:     sums = np.full(flat_array.shape, np.NaN)
     ...:     for i in range(n - 1, flat_array.shape[0]):
     ...:         sums[i] = np.sum(flat_array[i - n + 1:i + 1])
     ...:     return sums
     ...: 
     ...: def vectorized_app1(flat_array,n):
     ...:     vals = np.convolve(flat_array,np.ones(n),mode='valid')
     ...:     return np.hstack(([np.nan]*(n-1),vals))
     ...: 
     ...: def vectorized_app2(flat_array,n):
     ...:     vals = signal.convolve(flat_array,np.ones(n),mode='valid')
     ...:     return np.hstack(([np.nan]*(n-1),vals))
     ...: 

In [239]: flat_array = np.random.randint(0,9,(100000))

In [240]: %timeit original_app(flat_array,10)
1 loops, best of 3: 833 ms per loop

In [241]: %timeit vectorized_app1(flat_array,10)
1000 loops, best of 3: 1.96 ms per loop

In [242]: %timeit vectorized_app2(flat_array,10)
100 loops, best of 3: 13.1 ms per loop

In [243]: %timeit original_app(flat_array,100)
1 loops, best of 3: 836 ms per loop

In [244]: %timeit vectorized_app1(flat_array,100)
100 loops, best of 3: 16.5 ms per loop

In [245]: %timeit vectorized_app2(flat_array,100)
100 loops, best of 3: 13.1 ms per loop
Divakar answered Sep 28 '22


The other answers here are probably closer to what you're looking for in terms of speed and memory, but for completeness you can also use a list comprehension to build your array:

import numpy as np

a = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
N, n = a.shape[0], 3
np.array([np.NaN]*(n-1) + [np.sum(a[j:j+n]) for j in range(N-n+1)])

returns:

array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])
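For completeness, on newer NumPy versions (1.20+) np.lib.stride_tricks.sliding_window_view offers another loop-free route; a sketch, assuming that version is available (the function name is illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_sum_windows(flat_array, n):
    # Each row of the view is one length-n window; sum along axis 1.
    vals = sliding_window_view(flat_array, n).sum(axis=1)
    return np.concatenate([np.full(n - 1, np.nan), vals])

arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
out = rolling_sum_windows(arr, 3)
```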
xnx answered Sep 28 '22