Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get longest streak of consecutive weeks by group in pandas

Currently I'm working with weekly data for different subjects, but it might have some long streaks without data, so, what I want to do, is to just keep the longest streak of consecutive weeks for every id. My data looks like this:

id    week
1      8
1      15
1      60
1      61
1      62
2      10
2      11
2      12
2      13
2      25
2      26

My expected output would be:

id    week
1      60
1      61
1      62
2      10
2      11
2      12
2      13

I got a bit close, trying to mark with a 1 when week==week.shift()+1. The problem is this approach doesn't mark the first occurrence in a streak, and also I can't filter the longest one:

df.loc[ (df['id'] == df['id'].shift())&(df['week'] == df['week'].shift()+1),'streak']=1

This, according to my example, would bring this:

id    week  streak
1      8     nan
1      15    nan
1      60    nan
1      61    1
1      62    1
2      10    nan
2      11    1
2      12    1
2      13    1
2      25    nan
2      26    1

Any ideas on how to achieve what I want?

like image 684
Juan C Avatar asked Jan 27 '23 17:01

Juan C


1 Answers

Try this:

df['consec'] = df.groupby(['id',df['week'].diff(-1).ne(-1).shift().bfill().cumsum()]).transform('count')

df[df.groupby('id')['consec'].transform('max') == df.consec]

Output:

   id  week  consec
2   1    60       3
3   1    61       3
4   1    62       3
5   2    10       4
6   2    11       4
7   2    12       4
8   2    13       4
like image 173
Scott Boston Avatar answered Feb 02 '23 01:02

Scott Boston