I want to write a function that takes a flattened array as input and returns an array of equal length containing the sums of the previous n elements from the input array, with the initial <code>n - 1</code> elements of the output array set to <code>NaN</code>. For example if the array has ten <code>elements = [2, 4, 3, 7, 6, 1, 9, 4, 6, 5]</code> and <code>n = 3</code> then the resulting array should be <code>[NaN, NaN, 9, 14, 16, 14, 16, 14, 19, 15]</code>. One way I've come up with to do this: <pre class="prettyprint"><code>def sum_n_values(flat_array, n): sums = np.full(flat_array.shape, np.NaN) for i in range(n - 1, flat_array.shape[0]): sums[i] = np.sum(flat_array[i - n + 1:i + 1]) return sums </code></pre> Is there a better/more efficient/more "Pythonic" way to do this? Thanks in advance for your help.

You can make use of <code>np.cumsum</code>, and take the difference of the <code>cumsum</code>ed array and a shifted version of it: <pre class="prettyprint"><code>n = 3 arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5]) sum_arr = arr.cumsum() shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0], sum_arr[:-n]]) sum_arr => array([ 2, 6, 9, 16, 22, 23, 32, 36, 42, 47]) shifted_sum_arr => array([ nan, nan, 0., 2., 6., 9., 16., 22., 23., 32.]) sum_arr - shifted_sum_arr => array([ nan, nan, 9., 14., 16., 14., 16., 14., 19., 15.]) </code></pre> IMO, this is a more numpyish way to do this, mainly because it avoids the loop. <hr> Timings <pre class="prettyprint"><code>def cumsum_app(flat_array, n): sum_arr = flat_array.cumsum() shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0], sum_arr[:-n]]) return sum_arr - shifted_sum_arr flat_array = np.random.randint(0,9,(100000)) %timeit cumsum_app(flat_array,10) 1000 loops, best of 3: 985 us per loop %timeit cumsum_app(flat_array,100) 1000 loops, best of 3: 963 us per loop </code></pre>

You are basically performing <code>1D convolution</code> there, so you can use <code>np.convolve</code>, like so - <pre class="prettyprint"><code># Get the valid sliding summations with 1D convolution vals = np.convolve(flat_array,np.ones(n),mode='valid') # Pad with NaNs at the start if needed out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan)) </code></pre> Sample run - <pre class="prettyprint"><code>In [110]: flat_array Out[110]: array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5]) In [111]: n = 3 In [112]: vals = np.convolve(flat_array,np.ones(n),mode='valid') ...: out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan)) ...: In [113]: vals Out[113]: array([ 9., 14., 16., 14., 16., 14., 19., 15.]) In [114]: out Out[114]: array([ nan, nan, 9., 14., 16., 14., 16., 14., 19., 15.]) </code></pre> For 1D convolution, one can also use <code>Scipy's implementation</code>. The runtimes with Scipy version seemed better for a large window size, as also the runtime tests listed next would try to investigate. The Scipy version for getting <code>vals</code> would be - <pre class="prettyprint"><code>from scipy import signal vals = signal.convolve(flat_array,np.ones(n),mode='valid') </code></pre> The <code>NaNs</code> padding operation could be replaced by <code>np.hstack</code> : <code>np.hstack(([np.nan]*(n-1),vals))</code> for better performance. <hr> Runtime tests - <pre class="prettyprint"><code>In [238]: def original_app(flat_array,n): ...: sums = np.full(flat_array.shape, np.NaN) ...: for i in range(n - 1, flat_array.shape[0]): ...: sums[i] = np.sum(flat_array[i - n + 1:i + 1]) ...: return sums ...: ...: def vectorized_app1(flat_array,n): ...: vals = np.convolve(flat_array,np.ones(n),mode='valid') ...: return np.hstack(([np.nan]*(n-1),vals)) ...: ...: def vectorized_app2(flat_array,n): ...: vals = signal.convolve(flat_array,np.ones(3),mode='valid') ...: return np.hstack(([np.nan]*(n-1),vals)) ...: In [239]: flat_array = np.random.randint(0,9,(100000)) In [240]: %timeit original_app(flat_array,10) 1 loops, best of 3: 833 ms per loop In [241]: %timeit vectorized_app1(flat_array,10) 1000 loops, best of 3: 1.96 ms per loop In [242]: %timeit vectorized_app2(flat_array,10) 100 loops, best of 3: 13.1 ms per loop In [243]: %timeit original_app(flat_array,100) 1 loops, best of 3: 836 ms per loop In [244]: %timeit vectorized_app1(flat_array,100) 100 loops, best of 3: 16.5 ms per loop In [245]: %timeit vectorized_app2(flat_array,100) 100 loops, best of 3: 13.1 ms per loop </code></pre>

The other answers here are probably closer to what you're looking for in terms of speed and memory, but for completeness you can also use a list comprehension to build your array: <pre class="prettyprint"><code>a = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5]) N, n = a.shape[0], 3 np.array([np.NaN]*(n-1) + [np.sum(a[j:j+n]) for j in range(N-n+1)]) </code></pre> returns: <pre class="prettyprint"><code>array([ nan, nan, 9., 14., 16., 14., 16., 14., 19., 15.]) </code></pre>

Python/numpy: Most efficient way to sum n elements of an array, so that each output element is the sum of the previous n input elements?

Tags:

performance

python

arrays

algorithm

numpy

I want to write a function that takes a flattened array as input and returns an array of equal length containing the sums of the previous n elements from the input array, with the initial n - 1 elements of the output array set to NaN.

For example if the array has ten elements = [2, 4, 3, 7, 6, 1, 9, 4, 6, 5] and n = 3 then the resulting array should be [NaN, NaN, 9, 14, 16, 14, 16, 14, 19, 15].

One way I've come up with to do this:

def sum_n_values(flat_array, n): 

    sums = np.full(flat_array.shape, np.NaN)
    for i in range(n - 1, flat_array.shape[0]):
        sums[i] = np.sum(flat_array[i - n + 1:i + 1])
    return sums

Is there a better/more efficient/more "Pythonic" way to do this?

Thanks in advance for your help.

558

asked Dec 30 '15 20:12

James Adams

3 Answers

You can make use of np.cumsum, and take the difference of the cumsumed array and a shifted version of it:

n = 3
arr = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
sum_arr = arr.cumsum()
shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0],  sum_arr[:-n]])
sum_arr
=> array([ 2,  6,  9, 16, 22, 23, 32, 36, 42, 47])
shifted_sum_arr
=> array([ nan,  nan,   0.,   2.,   6.,   9.,  16.,  22.,  23.,  32.])
sum_arr - shifted_sum_arr
=> array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

IMO, this is a more numpyish way to do this, mainly because it avoids the loop.

Timings

def cumsum_app(flat_array, n):
    sum_arr = flat_array.cumsum()
    shifted_sum_arr = np.concatenate([[np.NaN]*(n-1), [0],  sum_arr[:-n]])
    return sum_arr - shifted_sum_arr

flat_array = np.random.randint(0,9,(100000))
%timeit cumsum_app(flat_array,10)
1000 loops, best of 3: 985 us per loop
%timeit cumsum_app(flat_array,100)
1000 loops, best of 3: 963 us per loop

107

answered Sep 28 '22 14:09

shx2

You are basically performing 1D convolution there, so you can use np.convolve, like so -

# Get the valid sliding summations with 1D convolution
vals = np.convolve(flat_array,np.ones(n),mode='valid')

# Pad with NaNs at the start if needed  
out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan))

Sample run -

In [110]: flat_array
Out[110]: array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])

In [111]: n = 3

In [112]: vals = np.convolve(flat_array,np.ones(n),mode='valid')
     ...: out = np.pad(vals,(n-1,0),'constant',constant_values=(np.nan))
     ...: 

In [113]: vals
Out[113]: array([  9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

In [114]: out
Out[114]: array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])

For 1D convolution, one can also use Scipy's implementation. The runtimes with Scipy version seemed better for a large window size, as also the runtime tests listed next would try to investigate. The Scipy version for getting vals would be -

from scipy import signal
vals = signal.convolve(flat_array,np.ones(n),mode='valid')

The NaNs padding operation could be replaced by np.hstack : np.hstack(([np.nan]*(n-1),vals)) for better performance.

Runtime tests -

In [238]: def original_app(flat_array,n):
     ...:     sums = np.full(flat_array.shape, np.NaN)
     ...:     for i in range(n - 1, flat_array.shape[0]):
     ...:         sums[i] = np.sum(flat_array[i - n + 1:i + 1])
     ...:     return sums
     ...: 
     ...: def vectorized_app1(flat_array,n):
     ...:     vals = np.convolve(flat_array,np.ones(n),mode='valid')
     ...:     return np.hstack(([np.nan]*(n-1),vals))
     ...: 
     ...: def vectorized_app2(flat_array,n):
     ...:     vals = signal.convolve(flat_array,np.ones(3),mode='valid')
     ...:     return np.hstack(([np.nan]*(n-1),vals))
     ...: 

In [239]: flat_array = np.random.randint(0,9,(100000))

In [240]: %timeit original_app(flat_array,10)
1 loops, best of 3: 833 ms per loop

In [241]: %timeit vectorized_app1(flat_array,10)
1000 loops, best of 3: 1.96 ms per loop

In [242]: %timeit vectorized_app2(flat_array,10)
100 loops, best of 3: 13.1 ms per loop

In [243]: %timeit original_app(flat_array,100)
1 loops, best of 3: 836 ms per loop

In [244]: %timeit vectorized_app1(flat_array,100)
100 loops, best of 3: 16.5 ms per loop

In [245]: %timeit vectorized_app2(flat_array,100)
100 loops, best of 3: 13.1 ms per loop

answered Sep 28 '22 16:09

Divakar

The other answers here are probably closer to what you're looking for in terms of speed and memory, but for completeness you can also use a list comprehension to build your array:

a = np.array([2, 4, 3, 7, 6, 1, 9, 4, 6, 5])
N, n = a.shape[0], 3
np.array([np.NaN]*(n-1) + [np.sum(a[j:j+n]) for j in range(N-n+1)])

returns:

array([ nan,  nan,   9.,  14.,  16.,  14.,  16.,  14.,  19.,  15.])