I have a dataframe that looks like this:
Reference |   ID  | Length
ref101    |123456 | 10
ref101    |123789 | 5
ref202    |654321 | 20
ref202    |653212 | 40
For each distinct value in the Reference column, I'm trying to find the row with the greatest length (based on the value in the Length column). For example, for ref101 the row with ID 123456 (Length 10) is longer than the row with ID 123789 (Length 5).
I've been playing around with .groupby(), but am getting nowhere. Is there a way of performing this sort of operation in Pandas?
If it's the whole row you want, then use groupby + idxmax:
df.loc[df.groupby('Reference').Length.idxmax()]
  Reference      ID  Length
0    ref101  123456      10
3    ref202  653212      40
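For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same groupby + idxmax step. The variable names `idx` and `longest` are only illustrative, and the frame is assumed to have the default integer index.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Reference": ["ref101", "ref101", "ref202", "ref202"],
    "ID": [123456, 123789, 654321, 653212],
    "Length": [10, 5, 20, 40],
})

# idxmax gives, per Reference group, the index label of the row with the largest Length
idx = df.groupby("Reference")["Length"].idxmax()

# Use those labels to pull the full rows
longest = df.loc[idx]
print(longest)
```

Since idxmax returns index labels and `.loc` looks rows up by label, this relies on the index being unique; calling `df.reset_index(drop=True)` first is one way to guarantee that.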
If you want just the length, then groupby + max will suffice:
df.groupby('Reference').Length.max()
Reference
ref101    10
ref202    40
Name: Length, dtype: int64
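One caveat: idxmax returns only the first row per group when several rows share the maximum. If you need to keep every tied row, one possible approach is to compare each Length against its group maximum using transform. The sketch below uses made-up data containing a tie (it is not the data from the question), and the names `tied`, `group_max`, `all_longest` and `first_only` are only illustrative.

```python
import pandas as pd

# Hypothetical data: ref101 has two rows tied at Length 10
tied = pd.DataFrame({
    "Reference": ["ref101", "ref101", "ref202"],
    "ID": [1, 2, 3],
    "Length": [10, 10, 20],
})

# Broadcast each group's maximum Length back onto every row of that group
group_max = tied.groupby("Reference")["Length"].transform("max")

# Keep every row whose Length equals its group's maximum (ties included)
all_longest = tied[tied["Length"] == group_max]

# For comparison: idxmax keeps only the first of the tied rows
first_only = tied.loc[tied.groupby("Reference")["Length"].idxmax()]
```

Here `all_longest` keeps both ref101 rows, while `first_only` keeps just one of them.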