I have a very long 1D array. I'd like to compute its cumulative sum and then prepend a zero to the resulting array.
import numpy as np

def padded_cumsum(x):
    cum_sum_x = np.cumsum(x)  # Creating a new array here is fine
    return np.pad(cum_sum_x, (1, 0), mode='constant')  # Prepend a single zero w/ pad

x = np.arange(2**25)
print(padded_cumsum(x))
The function padded_cumsum will be called billions of times with arrays of varying length, so I am trying to avoid any array copying, which is expensive. Also, I cannot alter the original array x by instantiating it with extra values/NaNs at the beginning/end. Since cum_sum_x needs to be created anyway, I suspect that I can sneak in the zero there by doing something hacky like:
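def padded_cumsum(x):
    cum_sum_x = np.empty(x.shape[0] + 1)  # One extra slot for the zero; np.empty defaults to float64
    cum_sum_x[0] = 0
    cum_sum_x[1:] = np.cumsum(x)  # Still builds a temporary array before copying it in
    return cum_sum_x  # The zero is already in place, so no np.pad is needed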
Use the out keyword on a hand-allocated array:
out = np.empty(len(x)+pad, dtype=yourdtype)
np.cumsum(x, out=out[pad:])
out[:pad] = 0
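As a minimal runnable sketch of the snippet above, the placeholders can be filled in as follows. This assumes a 1D input, a default of one leading zero, and that the output should have the same dtype as x. Note that for small integer dtypes np.cumsum would normally promote to the platform integer, so accumulating in x.dtype may overflow sooner. The function name and the pad default are illustrative choices, not part of the original answer.

```python
import numpy as np

def padded_cumsum(x, pad=1):
    """Cumulative sum of the 1D array x, preceded by `pad` zeros."""
    # Allocate the final array once (assumption: result dtype == x.dtype).
    out = np.empty(x.shape[0] + pad, dtype=x.dtype)
    out[:pad] = 0
    # out[pad:] is a view, so cumsum writes straight into the final buffer.
    np.cumsum(x, out=out[pad:])
    return out

x = np.array([1, 2, 3, 4])
result = padded_cumsum(x)
print(result)  # [ 0  1  3  6 10]
```

The only allocation is the output array itself, and x is never modified.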
You can cumsum in place:
def padcumsum(x):
    csum = np.hstack((0, x))  # 3x faster than pad. Allocates a new array, so x is untouched
    csum.cumsum(out=csum)     # Accumulate in place over the new array
    return csum
For better performance, you can install numba:
import numba
import numpy as np

@numba.njit
def perf(x):
    csum = np.empty(x.size + 1, x.dtype)
    csum[0] = 0
    for i in range(x.size):
        csum[i + 1] = csum[i] + x[i]
    return csum
which is two times faster than padcumsum.
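Speed-ups like these depend on array size, dtype, and machine, so it may be worth measuring them yourself. The sketch below is one possible harness using only NumPy and the standard timeit module. It checks that the pad, hstack, and out= variants agree and prints a per-call time for each. The function names, the array size, and the repeat counts are arbitrary choices for illustration, and the numba version is left out so the script runs without extra dependencies.

```python
import timeit
import numpy as np

def cumsum_pad(x):
    # Baseline from the question: cumsum, then np.pad copies into a new array.
    return np.pad(np.cumsum(x), (1, 0), mode="constant")

def cumsum_hstack(x):
    # Prepend the zero first (one copy of x), then accumulate in place.
    csum = np.hstack((0, x))
    csum.cumsum(out=csum)
    return csum

def cumsum_out(x):
    # Allocate the result once and let cumsum write into a view of it.
    out = np.empty(x.size + 1, dtype=x.dtype)
    out[0] = 0
    np.cumsum(x, out=out[1:])
    return out

x = np.arange(1_000_000, dtype=np.int64)
candidates = (cumsum_pad, cumsum_hstack, cumsum_out)
reference = cumsum_pad(x)

for fn in candidates:
    # All variants should produce identical results before timing them.
    assert np.array_equal(fn(x), reference)
    # Best of 3 runs of 20 calls each, reported per call.
    seconds = min(timeit.repeat(lambda: fn(x), number=20, repeat=3)) / 20
    print(f"{fn.__name__:>14}: {seconds * 1e3:.3f} ms per call")
```

Since the function will be called with varying array lengths, repeating the measurement for a few representative sizes gives a more reliable picture than a single run.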