Python Dataframe select rows based on max values in one of the columns

Question

I have a pandas DataFrame in Python (many rows, 2 columns). I want to reduce it to one row per unique value in column 1, keeping the row with the largest value in column 2 (column 2 is sorted in ascending order, if that helps). I could probably write a loop but would prefer a one- or two-line solution. Thanks.

Example:

ID         Value
100       11
100       14
100       16
200       10
200       20
200       30
300       45
400        0
400       25

Desired result:

100       16
200       30
300       45
400       25

EdChum · Accepted Answer

You want to group by the 'a' column, use idxmax to get the index label of the row with the largest 'b' value in each group, and then use those labels to index the original df (here 'a' and 'b' stand in for ID and Value):

In [12]:
df.loc[df.groupby('a')['b'].idxmax()]

Out[12]:
     a   b
2  100  16
5  200  30
6  300  45
8  400  25
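
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question. It assumes the columns are named ID and Value (as in the question) rather than a and b; the variable names idx and result are only for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [100, 100, 100, 200, 200, 200, 300, 400, 400],
    "Value": [11, 14, 16, 10, 20, 30, 45, 0, 25],
})

# For each ID, idxmax returns the index label of the row with the largest Value
idx = df.groupby("ID")["Value"].idxmax()

# .loc with those labels selects the full rows from the original frame
result = df.loc[idx]
print(result)
```

If several rows in a group share the maximum, idxmax returns the label of the first one, so this still yields exactly one row per ID.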

pansen · Answer

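In case you don't need the original index but just the highest value per ID, you can use groupby and max. Note that max is computed for each column separately, so with more than one value column the result can mix values from different rows: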

print(df.groupby("ID").max())

     Value
ID  
100     16
200     30
300     45
400     25
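
Two variations on this, sketched under the assumption that the columns are named ID and Value as in the question. The first keeps ID as an ordinary column via as_index=False. The second is a different technique, not groupby: since the question says Value is sorted in ascending order, the last row for each ID is also the one with the largest Value, so drop_duplicates with keep="last" returns the same rows. That shortcut is only valid while the sort order holds.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [100, 100, 100, 200, 200, 200, 300, 400, 400],
    "Value": [11, 14, 16, 10, 20, 30, 45, 0, 25],
})

# as_index=False keeps ID as a regular column instead of moving it to the index
max_per_id = df.groupby("ID", as_index=False)["Value"].max()
print(max_per_id)

# Assumption: Value is ascending within each ID (stated in the question),
# so the last row of each ID holds its maximum
last_per_id = df.drop_duplicates(subset="ID", keep="last")
print(last_per_id)
```

Unlike the groupby version, drop_duplicates keeps the original index labels and any additional columns of the selected rows.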

Tags: python, pandas

jim g
