Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to reset cumulative sum every time there is a NaN in a pandas dataframe?

If I have a Pandas data frame like this:

     1   2   3   4   5   6   7
 1  NaN  1   1   1  NaN  1   1
 2  NaN NaN  1   1   1   1   1 
 3  NaN NaN NaN  1  NaN  1   1
 4   1   1  NaN NaN  1   1  NaN

How do I do a cumulative sum such that the count resets every time there is a NaN value in the row? Such that I get something like this:

     1   2   3   4   5   6   7
 1  NaN  1   2   3  NaN  1   2
 2  NaN NaN  1   2   3   4   5 
 3  NaN NaN NaN  1  NaN  1   2
 4   1   2  NaN NaN  1   2  NaN
like image 965
Zmann3000 Avatar asked Dec 30 '25 09:12

Zmann3000


1 Answers

You could do:

# compute mask where np.nan = True
mask = pd.isna(df).astype(bool)

# compute cumsum across rows fillna with ffill
cumulative = df.cumsum(1).fillna(method='ffill', axis=1).fillna(0)

# get the values of cumulative where nan is True use the same method
restart = cumulative[mask].fillna(method='ffill', axis=1).fillna(0)

# set the result
result = (cumulative - restart)
result[mask] = np.nan

# display the result
print(result)

Output

     1    2    3    4    5    6    7
0  NaN  1.0  2.0  3.0  NaN  1.0  2.0
1  NaN  NaN  1.0  2.0  3.0  4.0  5.0
2  NaN  NaN  NaN  1.0  NaN  1.0  2.0
3  1.0  2.0  NaN  NaN  1.0  2.0  NaN
like image 59
Dani Mesejo Avatar answered Dec 31 '25 23:12

Dani Mesejo



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!