I am looking for a succinct way to go from:
a = numpy.array([1,4,1,numpy.nan,2,numpy.nan])
to:
b = numpy.array([1,5,6,numpy.nan,8,numpy.nan])
The best I can do currently is:
b = numpy.insert(numpy.cumsum(a[numpy.isfinite(a)]), (numpy.argwhere(numpy.isnan(a)) - numpy.arange(len(numpy.argwhere(numpy.isnan(a))))), numpy.nan)
Is there a shorter way to accomplish the same? What about doing a cumsum along an axis of a 2D array?
Pandas
is a library build on top of numpy
. It's
Series
class has a cumsum
method, which preserves the nan
's and is considerably faster than the solution proposed by DSM:
In [15]: a = arange(10000.0)
In [16]: a[1] = np.nan
In [17]: %timeit a*0 + np.nan_to_num(a).cumsum()
1000 loops, best of 3: 465 us per loop
In [18] s = pd.Series(a)
In [19]: s.cumsum()
Out[19]:
0 0
1 NaN
2 2
3 5
...
9996 49965005
9997 49975002
9998 49985000
9999 49994999
Length: 10000
In [20]: %timeit s.cumsum()
10000 loops, best of 3: 175 us per loop
How about (for not-too-big arrays):
In [34]: import numpy as np
In [35]: a = np.array([1,4,1,np.nan,2,np.nan])
In [36]: a*0 + np.nan_to_num(a).cumsum()
Out[36]: array([ 1., 5., 6., nan, 8., nan])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With