
Pandas: Rounding to nearest Hour

I have a column with timestamps

 start_time: 
 0    2016-06-04 05:18:49
 1    2016-06-04 06:50:12
 2    2016-06-04 08:16:02
 3    2016-06-04 15:05:13
 4    2016-06-04 15:24:25

I want to apply a function to the start_time column that rounds up to the next hour when the minutes are >= 30.

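 def extract_time(col):
      # col is a single Timestamp; format it as 'HH:MM'
      time = col.strftime('%H:%M')
      # split (not strip) separates the hour and minute parts
      hour, minute = (int(part) for part in time.split(':'))
      if minute >= 30:
           return hour + 1
      return hour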

Then I want to create a new column 'hour' with the rounded hours:

 df['hour'] = df['start_time'].apply(extract_time)

Instead of getting an 'hour' column with the rounded hours, I am getting the following:

 0    <function extract_hour at 0x128722b90>
 1    <function extract_hour at 0x128722b90>
 2    <function extract_hour at 0x128722b90>
 3    <function extract_hour at 0x128722b90>
 4    <function extract_hour at 0x128722b90>
EJSuh asked Mar 29 '18 17:03



1 Answer

You can use the following vectorized solution, which rounds each timestamp to the nearest hour and then extracts the hour:

In [30]: df['hour'] = df['start_time'].dt.round('H').dt.hour

In [31]: df
Out[31]:
           start_time  hour
0 2016-06-04 05:18:49     5
1 2016-06-04 06:50:12     7
2 2016-06-04 08:16:02     8
3 2016-06-04 15:05:13    15
4 2016-06-04 15:24:25    15
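
If exact half-hour values matter, here is a rough sketch of an alternative that follows the "minutes >= 30 go to the next hour" rule literally: add 30 minutes, then floor to the hour. As far as I can tell, dt.round resolves an exact tie such as 06:30:00 toward the even hour, so the two approaches can differ on those rows. The helper name round_half_up_hour is made up for this example, and the frame is rebuilt from the sample data in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "start_time": pd.to_datetime([
        "2016-06-04 05:18:49",
        "2016-06-04 06:50:12",
        "2016-06-04 08:16:02",
        "2016-06-04 15:05:13",
        "2016-06-04 15:24:25",
    ])
})


def round_half_up_hour(times):
    """Hypothetical helper: round a datetime Series half-up to the hour.

    Shifting by 30 minutes and flooring sends minutes >= 30 to the
    next hour and everything else to the current hour.
    """
    shifted = times + pd.Timedelta(minutes=30)
    # "60min" is used instead of "H"/"h" because the hourly alias
    # changed spelling between pandas versions.
    return shifted.dt.floor("60min").dt.hour


df["hour"] = round_half_up_hour(df["start_time"])
print(df)
```

Note that a time like 23:45 rolls over to hour 0 of the next day, whereas returning hour + 1 from a plain function would give 24. Also, pandas 2.2 and later warn that the 'H' alias is deprecated in favour of 'h'.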
MaxU - stop WAR against UA answered Nov 06 '22 07:11
