Pandas: Return Hour from Datetime Column Directly

Assume I have a DataFrame sales of timestamp values:

timestamp               sales_office
2014-01-01 09:01:00     Cincinnati
2014-01-01 09:11:00     San Francisco
2014-01-01 15:22:00     Chicago
2014-01-01 19:01:00     Chicago

I would like to create a new column time_hour. I can create it by writing a short function like this and applying it to each element with apply():

def hr_func(ts):
    return ts.hour

sales['time_hour'] = sales['timestamp'].apply(hr_func)

I would then see this result:

timestamp               sales_office         time_hour
2014-01-01 09:01:00     Cincinnati           9
2014-01-01 09:11:00     San Francisco        9
2014-01-01 15:22:00     Chicago              15
2014-01-01 19:01:00     Chicago              19

What I'd like to achieve is some shorter transformation like this (which I know is erroneous but gets at the spirit):

sales['time_hour'] = sales['timestamp'].hour 

Obviously the column is of type Series and as such doesn't have that attribute, but it seems there should be a simpler way that makes use of vectorized operations.

Is there a more-direct approach?

Daniel Black asked Aug 04 '14 23:08


1 Answer

Assuming timestamp is the index of the data frame, you can just do the following:

hours = sales.index.hour 
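As a minimal sketch of the index-based approach, the snippet below rebuilds the question's table with timestamp as a DatetimeIndex. The sample frame is a hypothetical reconstruction of the data shown in the question, not the asker's actual code.

```python
import pandas as pd

# Hypothetical sample data mirroring the table in the question,
# with the timestamps used as the index (a DatetimeIndex).
sales = pd.DataFrame(
    {"sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"]},
    index=pd.to_datetime(
        [
            "2014-01-01 09:01:00",
            "2014-01-01 09:11:00",
            "2014-01-01 15:22:00",
            "2014-01-01 19:01:00",
        ]
    ),
)
sales.index.name = "timestamp"

# A DatetimeIndex exposes datetime components directly as attributes.
hours = sales.index.hour
print(hours.tolist())  # [9, 9, 15, 19]
```

This only works when the index really is a DatetimeIndex; an index of plain strings would need converting with pd.to_datetime first.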

If you want to add that to your sales data frame, just do:

import pandas as pd

# concat returns a new data frame, so assign the result back
sales = pd.concat([sales, pd.DataFrame(hours, index=sales.index)], axis=1)
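If the goal is only to add the hours as a named column, plain column assignment may be simpler than building a second frame and concatenating. The sketch below assumes the same hypothetical sample data as the question's table and uses time_hour as the column name, as in the question.

```python
import pandas as pd

# Hypothetical sample data: timestamps as the index, as this answer assumes.
sales = pd.DataFrame(
    {"sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"]},
    index=pd.to_datetime(
        [
            "2014-01-01 09:01:00",
            "2014-01-01 09:11:00",
            "2014-01-01 15:22:00",
            "2014-01-01 19:01:00",
        ]
    ),
)

# Assigning the hour values creates the new column in place, row by row.
sales["time_hour"] = sales.index.hour
print(sales["time_hour"].tolist())  # [9, 9, 15, 19]
```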

Edit: If the datetime values are in one or more ordinary columns rather than the index, the process is similar, but a Series does not expose hour directly. If you have a column ['date'] in your data frame, and assuming that 'date' has datetime values, you can access the hour through the dt accessor:

hours = sales['date'].dt.hour

Edit2: To store the result as a new column in your data frame:

sales['datehour'] = sales['date'].dt.hour
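For the layout in the question, where timestamp is a regular column, a self-contained sketch might look like the following. It assumes the column may have been read in as strings, so it converts with pd.to_datetime first; that step is an assumption for illustration and is harmless if the column is already datetime64.

```python
import pandas as pd

# Hypothetical sample data mirroring the question, with timestamps as strings.
sales = pd.DataFrame(
    {
        "timestamp": [
            "2014-01-01 09:01:00",
            "2014-01-01 09:11:00",
            "2014-01-01 15:22:00",
            "2014-01-01 19:01:00",
        ],
        "sales_office": ["Cincinnati", "San Francisco", "Chicago", "Chicago"],
    }
)

# The .dt accessor is only available on datetime-like Series,
# so make sure the column has a datetime64 dtype first.
sales["timestamp"] = pd.to_datetime(sales["timestamp"])

# Vectorized extraction of the hour, with no apply() needed.
sales["time_hour"] = sales["timestamp"].dt.hour
print(sales["time_hour"].tolist())  # [9, 9, 15, 19]
```

Calling .dt on a column of plain strings raises an AttributeError, which is why the conversion comes first.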
Sudipta Basak answered Sep 20 '22 17:09