How to reset cumulative sum every time there is a NaN in a pandas dataframe?

Question

If I have a Pandas data frame like this:

     1   2   3   4   5   6   7
 1  NaN  1   1   1  NaN  1   1
 2  NaN NaN  1   1   1   1   1 
 3  NaN NaN NaN  1  NaN  1   1
 4   1   1  NaN NaN  1   1  NaN

How do I compute a cumulative sum along each row so that the count resets every time there is a NaN value in that row? I would like to get something like this:

     1   2   3   4   5   6   7
 1  NaN  1   2   3  NaN  1   2
 2  NaN NaN  1   2   3   4   5 
 3  NaN NaN NaN  1  NaN  1   2
 4   1   2  NaN NaN  1   2  NaN

Dani Mesejo · Accepted Answer

You could do:

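import numpy as np
import pandas as pd

# example frame from the question (NaN marks where the sum restarts)
n = np.nan
df = pd.DataFrame([[n, 1, 1, 1, n, 1, 1],
                   [n, n, 1, 1, 1, 1, 1],
                   [n, n, n, 1, n, 1, 1],
                   [1, 1, n, n, 1, 1, n]], columns=range(1, 8))

# boolean mask that is True wherever df holds NaN
mask = df.isna()

# running total along each row; NaN cells carry the last total forward (0 at row start)
# (.ffill replaces fillna(method='ffill'), which is deprecated in recent pandas)
cumulative = df.cumsum(axis=1).ffill(axis=1).fillna(0)

# running total as of the most recent NaN, carried forward along each row
restart = cumulative[mask].ffill(axis=1).fillna(0)

# subtract the total accumulated before the last NaN, then restore the NaNs
result = cumulative - restart
result[mask] = np.nan

# display the result
print(result)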

Output

     1    2    3    4    5    6    7
0  NaN  1.0  2.0  3.0  NaN  1.0  2.0
1  NaN  NaN  1.0  2.0  3.0  4.0  5.0
2  NaN  NaN  NaN  1.0  NaN  1.0  2.0
3  1.0  2.0  NaN  NaN  1.0  2.0  NaN
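
For comparison, here is a minimal sketch of another way to get the same output: label each run between NaNs with a group id, then take the cumulative sum within each group. The helper name cumsum_reset_on_nan and its inner function per_row are illustrative and not part of the answer above. The sketch assumes the frame is small enough that a row-wise apply is acceptable.

```python
import numpy as np
import pandas as pd


def cumsum_reset_on_nan(frame):
    """Row-wise cumulative sum that restarts after every NaN (hypothetical helper)."""
    def per_row(row):
        # Every NaN increments the counter, so each NaN and the values
        # that follow it (up to the next NaN) share one group id.
        group_ids = row.isna().cumsum()
        # Cumulative sum inside each group; NaN positions stay NaN.
        return row.groupby(group_ids).cumsum()

    return frame.apply(per_row, axis=1)


# Same data as in the question.
n = np.nan
df = pd.DataFrame(
    [[n, 1, 1, 1, n, 1, 1],
     [n, n, 1, 1, 1, 1, 1],
     [n, n, n, 1, n, 1, 1],
     [1, 1, n, n, 1, 1, n]],
    columns=range(1, 8),
)
result = cumsum_reset_on_nan(df)
print(result)
```

Because this calls a Python function once per row, it will likely be slower on large frames than the vectorized subtraction shown above, but the grouping step may be easier to read.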


Tags:

python

pandas

python-2.7

Zmann3000
