I am trying to calculate some statistics from a pandas dataframe. It looks something like this:
id value conditional
1 10 0
2 20 0
3 30 1
1 15 1
3 5 0
1 10 1
So, I need to calculate the cumulative sum of the column value
for each id
from top to botom, but only when conditional
is 1.
So, this should give me something like:
id value conditional cumulative sum
1 10 0 0
2 20 0 0
3 30 1 30
1 15 1 15
3 5 0 30
1 10 1 25
So, the sum of id=1
is taken only when conditional=1
in the 4th and 6th row and the 1st row value is not counted. How do I do this in pandas?
You can create a Series that is the multiplication of value
and conditional
, and take the cumulative sum of it for each id group:
df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out:
id value conditional cumsum
0 1 10 0 0
1 2 20 0 0
2 3 30 1 30
3 1 15 1 15
4 3 5 0 30
5 1 10 1 25
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With