
Prepend Zero to Long Numpy Array

I have a very long 1D array that I'd like to calculate the cumulative sum for and then prepend a zero at the beginning of the resultant array.

import numpy as np

def padded_cumsum(x):
    cum_sum_x = np.cumsum(x)  # Creating a new array here is fine

    return np.pad(cum_sum_x, (1,0), mode='constant')  # Prepend a single zero w/ pad

x = np.arange(2**25)  # Build the test array directly rather than via a Python range
print(padded_cumsum(x))

The function padded_cumsum will be called billions of times with arrays of varying length, so I am trying to avoid any array copying, which is expensive. Also, I cannot alter the original array x by instantiating it with extra values/NaNs at the beginning or end. Since cum_sum_x needs to be created anyway, I suspect that I can sneak the zero in there by doing something hacky like:

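def padded_cumsum(x):
    cum_sum_x = np.empty(x.shape[0]+1, dtype=x.dtype)  # One allocation, with room for the leading zero
    cum_sum_x[0] = 0
    cum_sum_x[1:] = np.cumsum(x)  # Note: np.cumsum still creates a temporary that is copied here

    return cum_sum_x  # Already zero-padded, so no np.pad call is needed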
slaw asked Sep 20 '25 04:09



2 Answers

Use the out keyword of np.cumsum on a hand-allocated array, so the result is written directly into it.

out = np.empty(len(x)+pad, dtype=yourdtype)
np.cumsum(x, out=out[pad:])
out[:pad] = 0
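
A self-contained sketch of this approach, assuming a single leading zero (pad = 1) by default and that the output should share x's dtype. The name padded_cumsum_out is made up for illustration and is not part of NumPy:

```python
import numpy as np

def padded_cumsum_out(x, pad=1):
    """Cumulative sum of 1D array x with `pad` leading zeros (hypothetical helper)."""
    # Assumption: the result uses x's dtype; choose a wider dtype if overflow is a concern.
    out = np.empty(x.shape[0] + pad, dtype=x.dtype)
    out[:pad] = 0
    # cumsum writes straight into the tail of `out`, so no second result array is allocated.
    np.cumsum(x, out=out[pad:])
    return out

x = np.array([1, 2, 3, 4])
print(padded_cumsum_out(x))  # [ 0  1  3  6 10]
```

The slice out[pad:] is a view into the same buffer, which is why the zero and the sums end up in one array without a copy.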
Paul Panzer answered Sep 22 '25 00:09



You can compute the cumulative sum in place:

def padcumsum(x):
    csum = np.hstack((0, x))  # 3x faster than pad.
    csum.cumsum(out=csum)  # accumulate in place, reusing csum as the output buffer
    return csum

For better performance, you can install numba:

import numba

@numba.njit
def perf(x):
    csum = np.empty(x.size + 1, x.dtype)
    csum[0] = 0
    for i in range(x.size):  # explicit loop, compiled to machine code by numba
        csum[i + 1] = csum[i] + x[i]
    return csum

This is about twice as fast as padcumsum.
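
Relative timings like these depend on array length, dtype, and machine, so it is worth measuring them yourself. Below is a rough sketch using the standard-library timeit module to compare the np.pad version from the question, the hstack version, and a preallocated out= version. The function names, array size, and repeat counts are arbitrary choices for illustration, and numba is left out so that the snippet runs with NumPy alone:

```python
import timeit
import numpy as np

def cumsum_pad(x):
    # Version from the question: cumsum, then np.pad builds a second array.
    return np.pad(np.cumsum(x), (1, 0), mode='constant')

def cumsum_hstack(x):
    # hstack allocates the padded array, then cumsum runs in place on it.
    csum = np.hstack((0, x))
    csum.cumsum(out=csum)
    return csum

def cumsum_out(x):
    # Preallocate once and let cumsum write into the tail of the buffer.
    out = np.empty(x.shape[0] + 1, dtype=x.dtype)
    out[0] = 0
    np.cumsum(x, out=out[1:])
    return out

x = np.arange(2**14)  # arbitrary test size
assert np.array_equal(cumsum_pad(x), cumsum_hstack(x))
assert np.array_equal(cumsum_pad(x), cumsum_out(x))

for f in (cumsum_pad, cumsum_hstack, cumsum_out):
    best = min(timeit.repeat(lambda: f(x), number=100, repeat=3))
    print(f"{f.__name__}: {best / 100 * 1e6:.1f} us per call")
```

Taking the minimum over several repeats reduces noise from other processes; no particular ordering of the results is assumed here.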

B. M. answered Sep 21 '25 23:09
