Consider the dataframe
import pandas as pd

df = pd.DataFrame(
    [
        ['A', 1],
        ['A', 1],
        ['B', 1],
        ['B', 0],
        ['A', 0],
        ['A', 1],
        ['B', 1]
    ], columns=['key', 'cond'])
I want to find a cumulative (running) count (starting at 1) for each key, where we only increment if the previous value in the group had cond == 1. When appended to the above dataframe this would give
df_result = pd.DataFrame(
    [
        ['A', 1, 1],
        ['A', 1, 2],
        ['B', 1, 1],
        ['B', 0, 2],
        ['A', 0, 3],
        ['A', 1, 3],
        ['B', 1, 2]
    ], columns=['key', 'cond', 'new'])  # three values per row need three column names
Note that the cond value of the last row in each key group has no effect.
Just doing a simple groupby and cumcount
df.groupby('key').cumcount()
of course doesn't account for the cond value of the previous element. How can I take this into account?
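To make the gap concrete, here is a small sketch comparing the plain cumcount (shifted to start at 1) with the desired counts on the example frame. The names naive and desired are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 1], ['A', 1], ['B', 1], ['B', 0], ['A', 0], ['A', 1], ['B', 1]],
    columns=['key', 'cond'],
)

# cumcount numbers the rows within each key from 0, so add 1 to start at 1.
naive = df.groupby('key').cumcount() + 1

# Desired counts, copied from the expected result in the question.
desired = pd.Series([1, 2, 1, 2, 3, 3, 2])

# Rows where the naive count is wrong: those whose previous row
# in the same key group had cond == 0.
mismatch = df[naive != desired]
print(mismatch)
```

The mismatching rows are exactly those that follow a cond == 0 row within their key group, which is the case a plain cumcount cannot see.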
EDIT
As some of the solutions below don't work on some edge cases, I will give a more comprehensive dataframe for testing.
df = pd.DataFrame(
    [
        ['A', 0],
        ['A', 1],
        ['A', 1],
        ['B', 1],
        ['B', 0],
        ['A', 0],
        ['A', 1],
        ['B', 1],
        ['B', 0]
    ], columns=['key', 'cond'])
which, with the true result appended, should give
df_result = pd.DataFrame(
    [
        ['A', 0, 1],
        ['A', 1, 1],
        ['A', 1, 2],
        ['B', 1, 1],
        ['B', 0, 2],
        ['A', 0, 3],
        ['A', 1, 3],
        ['B', 1, 2],
        ['B', 0, 3]
    ], columns=['key', 'cond', 'new'])  # three values per row need three column names
Use groupby with a combination of shift and cumsum.
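# Shift cond down one row within each key, treat the first row as 1, then sum.
# transform (rather than apply) keeps the original index, so the result
# aligns with df on recent pandas versions as well.
df['new'] = df.groupby('key')['cond'].transform(
    lambda x: x.shift().fillna(1).cumsum()
).astype(int)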
df
  key  cond  new
0   A     1    1
1   A     1    2
2   B     1    1
3   B     0    2
4   A     0    3
5   A     1    3
6   B     1    2
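As a quick check, the same shift-then-cumsum idea can be run against the edge-case dataframe from the question's EDIT. This is a sketch rather than the only way to write it: the helper name running_count is made up for illustration, and shift(fill_value=1) is used in place of shift().fillna(1) so the column stays integer-typed.

```python
import pandas as pd

df = pd.DataFrame(
    [['A', 0], ['A', 1], ['A', 1], ['B', 1], ['B', 0],
     ['A', 0], ['A', 1], ['B', 1], ['B', 0]],
    columns=['key', 'cond'],
)

def running_count(cond):
    # Previous row's cond within the group; the first row gets 1
    # so that every group starts counting at 1.
    prev = cond.shift(fill_value=1)
    # Each previous cond == 1 increments the count by one.
    return prev.cumsum()

# transform returns a result aligned with the original index.
df['new'] = df.groupby('key')['cond'].transform(running_count)

# Expected values copied from the question's edge-case df_result.
expected = [1, 1, 2, 1, 2, 3, 3, 2, 3]
print(df['new'].tolist() == expected)
```

Filling the first shifted value with 1 is what makes a group that begins with cond == 0 still start at 1, which is the edge case the EDIT was added to cover.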