If I have a pandas.core.series.Series named ts of either 1's or NaN's like this:
3382 NaN
3381 NaN
...
3369 NaN
3368 NaN
...
15 1
10 NaN
11 1
12 1
13 1
9 NaN
8 NaN
7 NaN
6 NaN
3 NaN
4 1
5 1
2 NaN
1 NaN
0 NaN
I would like to calculate the cumsum of this series, but it should be reset (set to zero) at the locations of the NaNs, like below:
3382 0
3381 0
...
3369 0
3368 0
...
15 1
10 0
11 1
12 2
13 3
9 0
8 0
7 0
6 0
3 0
4 1
5 2
2 0
1 0
0 0
Ideally I would like to have a vectorized solution!
I have seen a similar question for Matlab: Matlab cumsum reset at NaN?
but I don't know how to translate this line: d = diff([0 c(n)]);
A simple NumPy translation of your Matlab code is this:
import numpy as np
v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)
a = ~n
c = np.cumsum(a)
d = np.diff(np.concatenate(([0.], c[n])))
v[n] = -d
np.cumsum(v)
Executing this code returns array([ 1., 2., 3., 0., 1., 2., 3., 4., 0., 1.]). This solution is only as valid as the original one (in particular, it assumes every non-NaN value is 1), but maybe it will help you come up with something better if it isn't sufficient for your purposes.
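To make the translation easier to reuse, here is a minimal sketch that wraps the same steps in a function. The name cumsum_reset_at_nan is hypothetical, and the function assumes, like the Matlab original, that every non-NaN entry is 1. It works on a copy so the input is not modified.

```python
import numpy as np

def cumsum_reset_at_nan(values):
    """Running count of consecutive 1's, reset to 0 at each NaN.

    Hypothetical helper; assumes every non-NaN entry equals 1.
    """
    v = np.array(values, dtype=float)          # copy, so the input is untouched
    n = np.isnan(v)                            # True where the sum must reset
    c = np.cumsum(~n)                          # count of non-NaN entries so far
    d = np.diff(np.concatenate(([0], c[n])))   # non-NaN entries since the previous NaN
    v[n] = -d                                  # each NaN cancels the run before it
    return np.cumsum(v)

out = cumsum_reset_at_nan([1, 1, 1, np.nan, 1, 1, 1, 1, np.nan, 1])
print(out)
```

For the Series in the question, something like pd.Series(cumsum_reset_at_nan(ts.to_numpy()), index=ts.index) should keep the original index.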
An even more pandas-idiomatic way to do it:
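import numpy as np
import pandas as pd
v = pd.Series([1., 3., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
cumsum = v.cumsum().ffill()  # running total carried across the NaNs (fillna(method='pad') is deprecated in recent pandas)
reset = -cumsum[v.isnull()].diff().fillna(cumsum)  # at each NaN: minus the sum accumulated since the previous NaN
result = v.where(v.notnull(), reset).cumsum()  # the negative entries cancel the running total at each NaN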
Unlike the Matlab code, this also works for values other than 1.
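A different technique, not taken from the answers above, is to label each run with a group id and take a per-group cumsum. The sketch below assumes the Series index is unique (as in the question); the sample data is the visible tail of the question's ts, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Visible tail of the question's series, keeping its original index labels.
ts = pd.Series(
    [1, np.nan, 1, 1, 1, np.nan, np.nan, np.nan, np.nan, np.nan, 1, 1, np.nan, np.nan, np.nan],
    index=[15, 10, 11, 12, 13, 9, 8, 7, 6, 3, 4, 5, 2, 1, 0],
)

# Every NaN starts a new group: the group id is the number of NaNs seen so far.
group_id = ts.isna().cumsum()

# Replace NaN by 0 so each group starts from zero, then cumsum within each group.
result = ts.fillna(0).groupby(group_id).cumsum()
print(result)
```

Because each NaN opens its own group, this also handles a series that begins with NaN and values other than 1.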