
Taking maximum value of only one column by "groupby" in pandas

Tags: python, pandas

I have a dataframe with 10 columns (three shown here):

id        date         value
1233     2014-10-3     1.123123
3412     2015-05-31    2.123123
3123     2015-05-31    5.6234234
3123     2013-03-21    5.6234222
3412     2014-11-21    4.776666
5121     2015-08-22    5.234234

I want to group by the id column and take the latest date for each id. However, I don't want the maximum of the value column; I want the value from the row that has the maximum date.

df.groupby('id').max() doesn't work. How can I solve this?

Most importantly, I want to keep all the columns in my dataset.
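To illustrate the problem, here is a minimal sketch that rebuilds the three sample columns. It assumes the date column has been parsed as datetimes, and it zero-pads the first date so parsing is unambiguous. For id 3412, max() returns the latest date together with the largest value, even though the two come from different rows.

```python
import pandas as pd

# Sample data from the question; dates are zero-padded here (an assumption
# made for this sketch) and parsed as datetimes.
df = pd.DataFrame({
    "id": [1233, 3412, 3123, 3123, 3412, 5121],
    "date": pd.to_datetime([
        "2014-10-03", "2015-05-31", "2015-05-31",
        "2013-03-21", "2014-11-21", "2015-08-22",
    ]),
    "value": [1.123123, 2.123123, 5.6234234, 5.6234222, 4.776666, 5.234234],
})

# max() is computed for each column independently within each group,
# so the resulting date and value need not come from the same row.
per_column_max = df.groupby("id").max()

# For id 3412 the latest date is 2015-05-31 (value 2.123123 on that row),
# but the reported value is the group's largest one, from 2014-11-21.
print(per_column_max.loc[3412])
```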

asked Dec 08 '22 17:12 by yef_dan_92


1 Answer

You can simply use sort_values followed by first:

df.sort_values(['date', 'value'], ascending=[False, True]).groupby('id').first()

Out[480]: 
           date     value
id                       
1233 2014-10-03  1.123123
3123 2015-05-31  5.623423
3412 2015-05-31  2.123123
5121 2015-08-22  5.234234
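For reference, the approach above can be reproduced end to end as the sketch below. It also shows an alternative based on idxmax, which is not part of the original answer and is included only as one possible way to keep id as a regular column. The sample frame is rebuilt from the question, with dates zero-padded and parsed as datetimes (an assumption of this sketch).

```python
import pandas as pd

# Sample data from the question; dates are zero-padded and parsed as
# datetimes (an assumption made for this sketch).
df = pd.DataFrame({
    "id": [1233, 3412, 3123, 3123, 3412, 5121],
    "date": pd.to_datetime([
        "2014-10-03", "2015-05-31", "2015-05-31",
        "2013-03-21", "2014-11-21", "2015-08-22",
    ]),
    "value": [1.123123, 2.123123, 5.6234234, 5.6234222, 4.776666, 5.234234],
})

# Approach from the answer: sort so the newest date comes first
# (ties broken by the smallest value), then take the first row per id.
# The result is indexed by id.
latest_sorted = (
    df.sort_values(["date", "value"], ascending=[False, True])
      .groupby("id")
      .first()
)

# Alternative (not from the answer): find the row label of the latest
# date per id and select those whole rows, so every column is kept
# and id stays a regular column.
latest_idxmax = df.loc[df.groupby("id")["date"].idxmax()].reset_index(drop=True)

print(latest_sorted)
print(latest_idxmax)
```

One caveat on the design choice: GroupBy.first() returns the first non-null entry of each column, so if the data contains missing values, the result for a group can combine entries from different rows. Selecting whole rows, as in the idxmax variant, avoids that.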

answered Dec 10 '22 05:12 by BENY