
How to get the max value of a multiple-column group-by in pandas?

I am trying to get the row with the maximum value of one column within each group of a groupby. I tried to follow the solutions given in "Python : Getting the Row which has the max value in groups using groupby", but they don't give what I want. When I apply

annotations.groupby(['bookid','conceptid'], sort=False)['weight'].max()

I get

bookid    conceptid
12345678  3942     0.137271
          10673    0.172345
          1002     0.125136
34567819  44407    1.370921
          5111     0.104729
          6160     0.114766
          200      0.151629
          3504     0.152793

But I'd like to get only the row with the highest weight for each bookid, e.g.:

bookid    conceptid
12345678  10673    0.172345
34567819  44407    1.370921

I'd appreciate any help

ssierral asked Dec 26 '22 02:12



1 Answer

If you need the bookid and conceptid for the maximum weight within each bookid, try this:

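annotations.loc[annotations.groupby(['bookid'], sort=False)['weight'].idxmax()][['bookid', 'conceptid', 'weight']]

Note: This answer originally used .ix, which was deprecated in pandas 0.20 and removed in pandas 1.0. Use .loc, as above. It is the right indexer here because idxmax() returns index labels, not positions.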
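To make the idxmax approach easier to try, here is a minimal, self-contained sketch. It assumes annotations is a flat DataFrame with the three columns named in the question. The sample values are copied from the question's output, and the names best_rows and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the output shown in the question
annotations = pd.DataFrame({
    "bookid": [12345678, 12345678, 12345678,
               34567819, 34567819, 34567819, 34567819, 34567819],
    "conceptid": [3942, 10673, 1002, 44407, 5111, 6160, 200, 3504],
    "weight": [0.137271, 0.172345, 0.125136,
               1.370921, 0.104729, 0.114766, 0.151629, 0.152793],
})

# For each bookid, idxmax gives the index label of the row with the largest weight
best_rows = annotations.groupby("bookid", sort=False)["weight"].idxmax()

# .loc selects rows by label, so it pairs naturally with idxmax
result = annotations.loc[best_rows, ["bookid", "conceptid", "weight"]]
print(result)
```

With this sample data the selection keeps conceptid 10673 for bookid 12345678 and conceptid 44407 for bookid 34567819, which matches the output the question asks for. Grouping by bookid alone (rather than by both bookid and conceptid) is what reduces the result to one row per book.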

user1827356 answered Dec 28 '22 06:12
