Let's say we have the following pandas DataFrame:
In [1]:
import pandas as pd
import numpy as np
df = pd.DataFrame([0, 1, 0, 0, 1, 1, 0, 1, 1, 1], columns=['in'])
df
Out[1]:
in
0 0
1 1
2 0
3 0
4 1
5 1
6 0
7 1
8 1
9 1
How to count the number of consecutive ones in a vectorized way in pandas? I would like to have a result like this:
in out
0 0 0
1 1 1
2 0 0
3 0 0
4 1 1
5 1 2
6 0 0
7 1 1
8 1 2
9 1 3
Something like a vectorized cumsum operation that resets on a specific condition.
You can do something like this(credit goes to: how to emulate itertools.groupby with a series/dataframe?):
>>> df['in'].groupby((df['in'] != df['in'].shift()).cumsum()).cumsum()
0 0
1 1
2 0
3 0
4 1
5 2
6 0
7 1
8 2
9 3
dtype: int64
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With