Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get the first week in a row of consecutive weeks in DataFrame

I have been working trying to get the first week in a row of consecutive weeks for an specific id in a DataFrame like this:

week     id
 1        1
 2        1
 3        1
 6        1
 7        1
 8        1
 3        2
 4        2

So for id 1, there's three consecutive weeks (1-3), then again three consecutive weeks (6-8), and for id 2, there's 2 (3-4). So, my desired output would be:

week     id   first_week
 1        1       1
 2        1       1
 3        1       1
 6        1       6
 7        1       6
 8        1       6
 3        2       3
 4        2       3

I was able to count the number of consecutive weeks with this:

df.groupby(['id',df['week'].diff(-1).ne(-1).shift().bfill().cumsum()]).transform('count')

But I'm not being able to find the first week (I want to construct a new id that adds in which week the streak of consecutive weeks started)

like image 385
Juan C Avatar asked Jan 26 '23 12:01

Juan C


1 Answers

groupby get the diff , and condition assign , then just ffill the nan

df.loc[df.groupby('id').week.diff().ne(1),'fw']=df.week
df.fw=df.fw.ffill()
df
Out[158]: 
   week  id   fw
0     1   1  1.0
1     2   1  1.0
2     3   1  1.0
3     6   1  6.0
4     7   1  6.0
5     8   1  6.0
6     3   2  3.0
7     4   2  3.0
like image 58
BENY Avatar answered Jan 29 '23 01:01

BENY