Pandas: conditional rolling count

Question

I have a Series that looks like the following:

   col
0  B
1  B
2  A
3  A
4  A
5  B

It's a time series, therefore the index is ordered by time.

For each row, I'd like to count how many times the value has appeared consecutively, i.e.:

Output:

   col count
0  B   1
1  B   2
2  A   1 # Value does not match previous row => reset counter to 1
3  A   2
4  A   3
5  B   1 # Value does not match previous row => reset counter to 1

I found 2 related questions, but I can't figure out how to "write" that information as a new column in the DataFrame, for each row (as above). Using rolling_apply does not work well.

Finding consecutive segments in a pandas data frame

P.Tillmann · Accepted Answer

I think there is a nice way to combine the solutions of @chrisb and @CodeShaman. (As was pointed out, CodeShaman's solution counts total occurrences, not consecutive ones.)

  df['count'] = df.groupby((df['col'] != df['col'].shift(1)).cumsum()).cumcount()+1

  col  count
0   B      1
1   B      2
2   A      1
3   A      2
4   A      3
5   B      1
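To make the one-liner easier to follow, here is a rough step-by-step sketch of the same idea. The intermediate names `changed` and `run_id` are only for illustration, and the sample frame is rebuilt from the question's data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"col": ["B", "B", "A", "A", "A", "B"]})

# True wherever the value differs from the previous row.
# The first row compares against NaN, so it always counts as a change.
changed = df["col"] != df["col"].shift(1)

# A running total of the changes gives each consecutive run its own id.
run_id = changed.cumsum()

# cumcount() numbers the rows within each run starting at 0, so add 1.
df["count"] = df.groupby(run_id).cumcount() + 1

print(df)
```

Grouping by the run id instead of by the value itself is what makes the counter reset whenever the value changes.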

CodeShaman · Answer

One-liner:

df['count'] = df.groupby('col').cumcount()

or

df['count'] = df.groupby('col').cumcount() + 1

if you want the counts to begin at 1.
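A quick check against the question's sample data (a sketch, with the frame rebuilt by hand and `per_value` a name chosen here) suggests this groups by value across the whole column, so the counter does not reset when a value reappears after a gap:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"col": ["B", "B", "A", "A", "A", "B"]})

# Running count per distinct value, regardless of where the rows sit
per_value = df.groupby("col").cumcount() + 1

print(per_value.tolist())  # [1, 2, 1, 2, 3, 3]
```

The last row comes out as 3 (third "B" overall) rather than the 1 the question asks for.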

chrisb · Answer

Based on the second answer you linked, assuming s is your series.

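import pandas as pd
df = pd.DataFrame(s)  # assumes s is named 'col'
# Start a new block id each time the value differs from the previous row
df['block'] = (df['col'] != df['col'].shift(1)).astype(int).cumsum()
# Number the rows 1..n within each block
df['count'] = df.groupby('block').transform(lambda x: range(1, len(x) + 1))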


In [88]: df
Out[88]: 
  col  block  count
0   B      1      1
1   B      1      2
2   A      2      1
3   A      2      2
4   A      2      3
5   B      3      1
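If the helper column is not wanted, the same block idea can probably be applied to the Series directly. This is only a sketch: it assumes `s` holds the question's data and is named `col`, and it uses `cumcount()` in place of the `transform` call.

```python
import pandas as pd

# Assumed input: the question's data as a Series named 'col'
s = pd.Series(["B", "B", "A", "A", "A", "B"], name="col")

# Block id that increases each time the value changes
block = (s != s.shift()).cumsum()

# Position within each block, starting at 1
count = s.groupby(block).cumcount() + 1

result = pd.DataFrame({"col": s, "count": count})
print(result)
```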

Tags: python, pandas

justinlevol
