
pandas groupby date select earliest per day

Tags: python, pandas

I have the following dataset:

            value            timestamp
0            Fire  2017-10-03 14:33:52
1           Water  2017-10-04 14:33:48
2            Fire  2017-10-04 14:33:45
3            Fire  2017-10-05 14:33:30
4           Water  2017-10-03 14:33:40
5           Water  2017-10-05 14:32:13
6           Water  2017-10-04 14:32:01
7            Fire  2017-10-03 14:31:55

I want to group the rows by the calendar day of the timestamp and keep only the earliest row for each day. For the example above, the result should be:

            value            timestamp
1           Water  2017-10-05 14:32:13
2           Water  2017-10-04 14:32:01
3            Fire  2017-10-03 14:31:55

For example, for the day 2017-10-03 there are 3 entries but I only want the earliest on that day.

wasp256 asked Sep 20 '25 09:09


1 Answer

If the index is unique, you can use idxmin on timestamp to find the index label of the earliest timestamp within each day, and then select those rows with loc:

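import pandas as pd
df["timestamp"] = pd.to_datetime(df["timestamp"])  # parse strings so the .dt accessor works
# Group by calendar day; idxmin gives the index label of the earliest row in each group.
df.loc[df.groupby(df["timestamp"].dt.date)["timestamp"].idxmin()]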

#   value             timestamp
#7   Fire   2017-10-03 14:31:55
#6  Water   2017-10-04 14:32:01
#5  Water   2017-10-05 14:32:13
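
If the index is not guaranteed to be unique, one possible alternative (a sketch, not part of the approach above) is to sort by timestamp and keep the first row seen for each day. The names `ordered` and `earliest` are illustrative, and the data is copied from the question so the snippet runs on its own:

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "value": ["Fire", "Water", "Fire", "Fire", "Water", "Water", "Water", "Fire"],
    "timestamp": [
        "2017-10-03 14:33:52", "2017-10-04 14:33:48", "2017-10-04 14:33:45",
        "2017-10-05 14:33:30", "2017-10-03 14:33:40", "2017-10-05 14:32:13",
        "2017-10-04 14:32:01", "2017-10-03 14:31:55",
    ],
})
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Sort chronologically so the earliest row of each day comes first.
ordered = df.sort_values("timestamp")

# duplicated() marks every repeat of a date after its first occurrence;
# negating the mask keeps exactly one row (the earliest) per day.
# The mask is applied positionally, so index uniqueness is not required.
earliest = ordered[~ordered["timestamp"].dt.date.duplicated()]

print(earliest)
```

For the sample data this selects the same rows as the idxmin version (index labels 7, 6 and 5). If two rows share the earliest timestamp on a given day, only one of them is kept.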
Psidom answered Sep 22 '25 23:09