
Pandas Dataframe get maximum with respect to other entries [duplicate]

I have a DataFrame like this:

name phase value
BOB 1 .9
BOB 2 .05
BOB 3 .05
JOHN 2 .45
JOHN 3 .45
JOHN 4 .05
FRANK 1 .4
FRANK 3 .6

For each name, I want to find the entry in column 'phase' that has the maximum value in column 'value'.
If several phases share the same maximum value, keep the first one or a random one.


Desired result table:

name phase value
BOB 1 .9
JOHN 2 .45
FRANK 3 .6

My approach was:

df.groupby(['name'])[['phase','value']].max() 

but it returned incorrect values.
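
A minimal reproduction of the problem, assuming the table above is loaded into a DataFrame named df (the construction below is illustrative, not the asker's actual loading code). It suggests why the result looks wrong: max() is evaluated for each column independently within a group, so the reported phase and value need not come from the same row.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "name": ["BOB", "BOB", "BOB", "JOHN", "JOHN", "JOHN", "FRANK", "FRANK"],
    "phase": [1, 2, 3, 2, 3, 4, 1, 3],
    "value": [0.9, 0.05, 0.05, 0.45, 0.45, 0.05, 0.4, 0.6],
})

# max() is applied to each column separately within each group,
# so BOB gets phase 3 (his largest phase) next to value 0.9
# (his largest value), even though those belong to different rows.
per_column_max = df.groupby("name")[["phase", "value"]].max()
print(per_column_max)
```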

Tomáš Ulrich asked Sep 15 '25 22:09


1 Answer

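You don't need to use groupby. Sort by value (descending) and then by phase (ascending, as the tie-breaker; adjust the order if necessary), and drop duplicates by name. Since drop_duplicates keeps the first occurrence by default, only the row with the highest value survives for each name, and the final sort_index restores the original row order: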

out = (df.sort_values(['value', 'phase'], ascending=[False, True])
         .drop_duplicates('name')
         .sort_index(ignore_index=True))
print(out)

# Output
    name  phase  value
0    BOB      1   0.90
1   JOHN      2   0.45
2  FRANK      3   0.60
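
If a groupby-based version is preferred, one possible alternative is to look up the row label of each group's maximum with idxmax() and select those rows with loc. This is only a sketch and not part of the approach above; it assumes the question's sample data, a unique index, and that keeping the first phase on ties is acceptable (idxmax returns the first row label that holds the maximum).

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "name": ["BOB", "BOB", "BOB", "JOHN", "JOHN", "JOHN", "FRANK", "FRANK"],
    "phase": [1, 2, 3, 2, 3, 4, 1, 3],
    "value": [0.9, 0.05, 0.05, 0.45, 0.45, 0.05, 0.4, 0.6],
})

# For each name, idxmax() gives the index label of the first row
# holding that group's maximum value; sort=False keeps the names in
# order of first appearance (BOB, JOHN, FRANK).
best_labels = df.groupby("name", sort=False)["value"].idxmax()

# Select those complete rows, so phase and value stay paired.
out = df.loc[best_labels].reset_index(drop=True)
print(out)
```

Unlike the sort-based version, this breaks ties by original row position rather than by the smallest phase; for the sample data both give the same rows.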
Corralien answered Sep 19 '25 07:09


