Grab rows with max date from pandas dataframe

Question

I have a Pandas dataframe that looks like this:

[Image: sample of the input dataframe]

For each distinct ID, I want to grab the row with the max date, so that my final result looks something like this:

[Image: the desired output, one row per ID]

My date column is of data type 'object'. I have tried grouping and then grabbing the max, like this:

idx = df.groupby(['ID', 'Item'])['date'].transform('max') == df['date']
df_new = df[idx]

However, I am unable to get the desired result.

piRSquared · Accepted Answer

This works as long as the index is unique, or at least the index label of each group's maximal row isn't repeated.

df.loc[df.groupby('ID').date.idxmax()]
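
Since the question says the date column has dtype 'object', it is probably safer to convert it with pd.to_datetime before taking the maximum. Here is a minimal sketch of the idxmax approach under that assumption. The sample data is hypothetical; only the column names ID, Item and date come from the question.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question's code.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 2],
    "Item": ["a", "b", "c", "d", "e"],
    "date": ["2017-01-05", "2017-03-01", "2016-12-31", "2017-02-10", "2017-01-20"],
})

# The strings are stored as object dtype; convert them to real datetimes
# so that "max" means the latest date rather than a string comparison.
df["date"] = pd.to_datetime(df["date"])

# idxmax gives, for each ID, the index label of the row with the latest date;
# .loc then selects exactly those rows.
latest = df.loc[df.groupby("ID")["date"].idxmax()]
print(latest)
```

With the default RangeIndex used here every label is unique, so .loc returns exactly one row per ID.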

This returns one row per ID as long as each group's maximal date is unique. Otherwise, you'll get every row whose date equals the group maximum.

df[df.groupby('ID')['date'].transform('max') == df['date']]
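
To illustrate the tie behaviour, here is a small sketch with made-up data in which ID 1 has two rows sharing the latest date. The frame and the names is_latest and tied are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: ID 1 has two rows tied on the latest date.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Item": ["a", "b", "c", "d"],
    "date": pd.to_datetime(
        ["2017-03-01", "2017-03-01", "2017-01-05", "2017-02-10"]
    ),
})

# transform('max') broadcasts each group's maximum back to every row,
# so the comparison yields a boolean mask aligned with df.
is_latest = df["date"] == df.groupby("ID")["date"].transform("max")

# Both tied rows for ID 1 pass the mask, plus the single row for ID 2.
tied = df[is_latest]
print(tied)
```

If only one row per ID is wanted even when dates tie, a follow-up such as drop_duplicates('ID') on the result is one possible way to break the tie.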

Another good solution is to sort by date and keep the last row for each ID.

df.sort_values(['ID', 'date']).drop_duplicates('ID', keep='last')
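
A minimal sketch of the sort-then-deduplicate idea follows, deduplicating on ID so that exactly one row per ID survives. The sample frame is hypothetical and deliberately unsorted.

```python
import pandas as pd

# Hypothetical, unsorted sample data.
df = pd.DataFrame({
    "ID": [2, 1, 2, 1],
    "Item": ["c", "a", "d", "b"],
    "date": pd.to_datetime(
        ["2016-12-31", "2017-01-05", "2017-02-10", "2017-03-01"]
    ),
})

# After sorting by ID and then date, the last row within each ID is the one
# with the latest date; keep='last' retains that row and drops the others.
latest = df.sort_values(["ID", "date"]).drop_duplicates("ID", keep="last")
print(latest)
```

Unlike the transform-based mask, this always yields a single row per ID, even when several rows share the latest date.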
