I have a Pandas dataframe that looks like this:
and I want to grab, for each distinct ID, the row with the max date, so that my final result looks something like this:
My date column is of data type 'object'. I tried grouping and then grabbing the max, like the following:
idx = df.groupby(['ID','Item'])['date'].transform(max) == df['date']
df_new = df[idx]
However, I am unable to get the desired result.
idxmax
Should work as long as the index is unique, or at least the index labels of the maximal rows aren't repeated.
df.loc[df.groupby('ID').date.idxmax()]
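A minimal sketch of the idxmax approach, under stated assumptions: the sample frame below is hypothetical (the question's actual data is not shown), and the object-typed date column is converted with pd.to_datetime first so that the maximum is chronological rather than a string comparison.

```python
import pandas as pd

# Hypothetical sample data; the real frame from the question is not available.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3],
    "Item": ["a", "b", "c", "d", "e"],
    "date": ["2017-01-05", "2017-03-10", "2017-02-01", "2017-01-15", "2017-04-20"],
})

# Assumption: the dates are parseable strings. Converting them makes
# comparisons chronological instead of lexicographic.
df["date"] = pd.to_datetime(df["date"])

# For each ID, idxmax gives the index label of the row with the latest date.
latest_idx = df.groupby("ID")["date"].idxmax()

# Select those rows by label; this yields exactly one row per ID.
latest = df.loc[latest_idx]
print(latest)
```

Because idxmax returns a single label per group, ties on the maximum date resolve to one row, which is why a unique index matters here.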
Comparing each date against its group's maximum with transform should work as long as the maximal value is unique within each ID. Otherwise, you'll get all rows equal to the maximum.
df[df.groupby('ID')['date'].transform('max') == df['date']]
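To illustrate the tie behaviour of the transform comparison, here is a small sketch. The frame df_ties is hypothetical and deliberately contains two rows sharing the latest date for ID 1.

```python
import pandas as pd

# Hypothetical data: ID 1 has two rows tied on the latest date.
df_ties = pd.DataFrame({
    "ID": [1, 1, 1, 2],
    "Item": ["a", "b", "c", "d"],
    "date": pd.to_datetime(
        ["2017-03-10", "2017-03-10", "2017-01-05", "2017-02-01"]
    ),
})

# transform('max') broadcasts each group's maximum back onto every row,
# so the comparison produces a boolean mask aligned with df_ties.
is_latest = df_ties.groupby("ID")["date"].transform("max") == df_ties["date"]

# Every row matching its group's maximum is kept, including ties.
all_latest = df_ties[is_latest]
print(all_latest)
```

Both tied rows for ID 1 survive the filter, so this variant suits cases where all rows at the maximum are wanted.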
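Sorting and then dropping duplicates is also a very good solution. Drop duplicates on ID, not on date, so that the last (latest) row of each ID is the one kept.
df.sort_values(['ID', 'date']).drop_duplicates('ID', keep='last')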
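A short sketch of the sort-then-deduplicate idea, assuming the goal is one row per ID, so duplicates are dropped on the ID column. The frame df_unsorted is hypothetical and intentionally out of order.

```python
import pandas as pd

# Hypothetical, intentionally unsorted data.
df_unsorted = pd.DataFrame({
    "ID": [2, 1, 1, 2],
    "Item": ["c", "a", "b", "d"],
    "date": pd.to_datetime(
        ["2017-02-01", "2017-01-05", "2017-03-10", "2017-01-15"]
    ),
})

# Sort so that, within each ID, the latest date comes last,
# then keep only the last row seen for each ID.
one_per_id = (
    df_unsorted.sort_values(["ID", "date"])
    .drop_duplicates("ID", keep="last")
)
print(one_per_id)
```

This always returns a single row per ID, even when dates are tied, and does not depend on the index being unique.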