Group data by time of the day

I have a dataframe with a datetime index. Here is the output of df.head(6):

                          NUMBERES              PRICE    
DEAL_TIME
2015-03-02 12:40:03              5                 25   
2015-03-04 14:52:57              7                 23   
2015-03-03 08:10:09             10                 43   
2015-03-02 20:18:24              5                 37   
2015-03-05 07:50:55              4                 61   
2015-03-02 09:08:17              1                 17   

The dataframe contains one week of data. I need to count the rows that fall into each time period of the day. If the period is 1 hour, I know the following works:

df_grouped = df.groupby(df.index.hour).count()

But I don't know how to do this when the period is half an hour. How can I achieve that?

UPDATE:

I was told that this question is similar to "How to group DataFrame by a period of time?"

But I had already tried the methods mentioned there. Perhaps I didn't explain the problem clearly. 'DEAL_TIME' ranges from '2015-03-02 00:00:00' to '2015-03-08 23:59:59'. If I use pd.TimeGrouper(freq='30Min') or resample(), the time periods range from '2015-03-02 00:30' to '2015-03-08 23:30', so every date gets its own set of bins. What I want instead is a series like the one below:

              COUNT      
DEAL_TIME
00:00:00         53 
00:30:00         49 
01:00:00         31
01:30:00         22
02:00:00          1
02:30:00         24
03:00:00         27
03:30:00         41
04:00:00         41
04:30:00         76
05:00:00         33
05:30:00         16
06:00:00         15
06:30:00          4
07:00:00         60
07:30:00         85
08:00:00          3
08:30:00         37
09:00:00         18
09:30:00         29
10:00:00         31
10:30:00         67
11:00:00         35
11:30:00         60
12:00:00         95
12:30:00         37
13:00:00         30
13:30:00         62
14:00:00         58
14:30:00         44
15:00:00         45
15:30:00         35
16:00:00         94
16:30:00         56
17:00:00         64
17:30:00         43
18:00:00         60
18:30:00         52
19:00:00         14
19:30:00          9
20:00:00         31
20:30:00         71
21:00:00         21
21:30:00         32
22:00:00         61
22:30:00         35
23:00:00         14
23:30:00         21

In other words, the time periods should be independent of the date.
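
To make the problem concrete, here is a minimal sketch of what resample() does with timestamps from different days. The three rows and the name demo are made up for illustration and are not the real data:

```python
import pandas as pd

# Hypothetical rows: two deals in the 12:30 slot, but on different days.
idx = pd.DatetimeIndex(
    ["2015-03-02 12:40:03", "2015-03-03 08:10:09", "2015-03-03 12:45:10"],
    name="DEAL_TIME",
)
demo = pd.DataFrame({"PRICE": [25, 17, 43]}, index=idx)

# resample() bins on the full timestamp, so the date stays part of each bin.
per_date = demo.resample("30min").size()

# Keep only the bins that actually contain rows.
nonempty = per_date[per_date > 0]
print(nonempty)
```

The two 12:30 deals end up in two separate bins, one per date, rather than in a single 12:30 row.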

J Huang asked Dec 12 '25 13:12


1 Answer

You need a 30-minute time grouper for this:

import pandas as pd
grouper = pd.Grouper(freq="30min")  # pd.TimeGrouper was removed in pandas 1.0

You also need to remove the 'date' part from the index:

df.index = df.index - df.index.normalize()  # time since midnight, as a TimedeltaIndex

Now, you can group by time alone:

df.groupby(grouper).count()
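
As an alternative that leaves the index untouched, the following is a minimal sketch (not the method above) that floors each timestamp to its half-hour boundary and groups by the resulting time of day. The six rows are copied from the question's df.head(6); the names slots, counts and all_slots are only illustrative, and the reindex step assumes you want all 48 slots listed even when some are empty:

```python
import pandas as pd

# Sample rows copied from the question's df.head(6).
df = pd.DataFrame(
    {"NUMBERES": [5, 7, 10, 5, 4, 1], "PRICE": [25, 23, 43, 37, 61, 17]},
    index=pd.DatetimeIndex(
        [
            "2015-03-02 12:40:03",
            "2015-03-04 14:52:57",
            "2015-03-03 08:10:09",
            "2015-03-02 20:18:24",
            "2015-03-05 07:50:55",
            "2015-03-02 09:08:17",
        ],
        name="DEAL_TIME",
    ),
)

# Round each timestamp down to its half-hour boundary, then drop the date.
slots = df.index.floor("30min").time

# Count rows per time-of-day slot, regardless of the date.
counts = df.groupby(slots).size().rename("COUNT")

# Optional: list all 48 half-hour slots, with 0 for slots that have no rows.
all_slots = pd.date_range("2015-03-02", periods=48, freq="30min").time
counts = counts.reindex(all_slots, fill_value=0)

print(counts[counts > 0])
```

The result is indexed by datetime.time objects (00:00:00, 00:30:00, ...), which matches the shape of the series requested in the question.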

The documentation for time-based grouping is somewhat sparse. The closest reference is the pandas resample documentation, since both features use the same frequency rules.

DYZ answered Dec 15 '25 14:12