
pandas dataframe find last value lower than certain value in the column within group

Tags:

pandas

I have a pandas DataFrame and need to find the last value below a given threshold in a column, within each group.

The DataFrame looks like this:
     region    year   month      signal
     A         2010   2          20
     A         2010   3          32
     A         2010   4          24
     A         2010   6          50
     .......
     A         2011   1          20
     .......
     B         2010   2          100
     B         2010   3           20
     .......

Now I need to add a flag column that marks the last value lower than 25 in the first half of the year for each ["region", "year"] combination. For example, for region A in the first half of 2010, the April signal value should be flagged as 1. Any suggestions? Thanks.

newleaf Avatar asked Dec 05 '25 05:12

newleaf


1 Answer

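First filter the rows with boolean indexing, using Series.lt for the less-than conditions on month and signal. Then call DataFrame.duplicated with keep='last' on the region and year columns and invert the result, so that only the last matching row of each combination is True. Finally, use Series.reindex with fill_value=False to restore the rows that were filtered out, and convert the mask to integers so that True becomes 1 and False becomes 0.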

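# Keep only first-half-year rows with signal below 25, then mark the last
# such row of each (region, year) pair as True.
m = ~df.loc[df['month'].lt(7) & df['signal'].lt(25)].duplicated(['region','year'], keep='last')

# Rows that were filtered out become False; cast the mask to 0/1.
df['new'] = m.reindex(df.index, fill_value=False).astype(int)
print(df)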
  region  year  month  signal  new
0      A  2010      2      20    0
1      A  2010      3      32    0
2      A  2010      4      24    1
3      A  2010      6      50    0
4      A  2011      1      20    1
5      B  2010      2     100    0
6      B  2010      3      20    1
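
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch of the same approach. The DataFrame is rebuilt from the seven rows printed above, and the helper name `flag_last_below` with its `threshold` and `last_month` parameters is made up for illustration; it is not part of pandas. It assumes the rows are already ordered by month within each group, since `duplicated(keep='last')` goes by row order, not by the month value.

```python
import pandas as pd

# Sample data rebuilt from the rows shown in the output above.
df = pd.DataFrame({
    "region": ["A", "A", "A", "A", "A", "B", "B"],
    "year":   [2010, 2010, 2010, 2010, 2011, 2010, 2010],
    "month":  [2, 3, 4, 6, 1, 2, 3],
    "signal": [20, 32, 24, 50, 20, 100, 20],
})

def flag_last_below(frame, threshold=25, last_month=6):
    """Hypothetical helper: 1 for the last row per (region, year) whose
    signal is below `threshold` within months 1..last_month, else 0."""
    # Rows that qualify: first half of the year and signal under the threshold.
    candidates = frame.loc[
        frame["month"].le(last_month) & frame["signal"].lt(threshold)
    ]
    # duplicated(keep='last') is False only for the last row of each group,
    # so inverting it marks exactly that row.
    is_last = ~candidates.duplicated(["region", "year"], keep="last")
    # Bring back the filtered-out rows as False, then convert to 0/1.
    return is_last.reindex(frame.index, fill_value=False).astype(int)

df["new"] = flag_last_below(df)
print(df["new"].tolist())  # [0, 0, 1, 0, 1, 0, 1]
```

If the rows might not be in chronological order, sorting with `df.sort_values(["region", "year", "month"])` before calling the helper should make the "last" row the latest month.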
jezrael Avatar answered Dec 09 '25 01:12

jezrael


