I have been working trying to get the first week in a row of consecutive weeks for an specific id in a DataFrame like this:
week id
1 1
2 1
3 1
6 1
7 1
8 1
3 2
4 2
So for id 1, there's three consecutive weeks (1-3), then again three consecutive weeks (6-8), and for id 2, there's 2 (3-4). So, my desired output would be:
week id first_week
1 1 1
2 1 1
3 1 1
6 1 6
7 1 6
8 1 6
3 2 3
4 2 3
I was able to count the number of consecutive weeks with this:
df.groupby(['id',df['week'].diff(-1).ne(-1).shift().bfill().cumsum()]).transform('count')
But I'm not being able to find the first week (I want to construct a new id that adds in which week the streak of consecutive weeks started)
groupby
get the diff
, and condition assign , then just ffill
the nan
df.loc[df.groupby('id').week.diff().ne(1),'fw']=df.week
df.fw=df.fw.ffill()
df
Out[158]:
week id fw
0 1 1 1.0
1 2 1 1.0
2 3 1 1.0
3 6 1 6.0
4 7 1 6.0
5 8 1 6.0
6 3 2 3.0
7 4 2 3.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With