Python Pandas: Group datetime column into hour and minute aggregations

This seems like it should be fairly straightforward, but after nearly an entire day I have not found a solution. I've loaded my DataFrame with read_csv and parsed, combined, and indexed a date column and a time column into a single column. Now I want to reshape the data and perform calculations based on hour and minute groupings, similar to what you can do in an Excel pivot table.

I know how to resample to hour or minute, but that keeps the date portion associated with each hour/minute. I want to aggregate the data set ONLY by hour and minute, similar to grouping in an Excel pivot table and selecting "hour" and "minute" but nothing else.

Any help would be greatly appreciated.

horatio1701d asked Apr 28 '13 18:04


3 Answers

Can't you do this, where df is your DataFrame:

times = pd.to_datetime(df.timestamp_col)
df.groupby([times.dt.hour, times.dt.minute]).value_col.sum()
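To try this end to end, here is a minimal sketch on made-up data. The column names `timestamp_col` and `value_col` follow the snippet above; the sample rows and the `hour`/`minute` level names are assumptions added for illustration. It relies on the `.dt` accessor, so it needs a pandas version that provides it.

```python
import pandas as pd

# Hypothetical sample data: readings at the same times of day on different dates.
df = pd.DataFrame({
    "timestamp_col": [
        "2013-04-26 09:30:00", "2013-04-26 09:31:00",
        "2013-04-27 09:30:00", "2013-04-27 09:31:00",
        "2013-04-28 09:30:00",
    ],
    "value_col": [1, 2, 3, 4, 5],
})

# Parse the strings into datetimes, then pull out the time-of-day parts.
times = pd.to_datetime(df["timestamp_col"])
hour = times.dt.hour.rename("hour")        # rename so the result levels are labelled
minute = times.dt.minute.rename("minute")

# Group on hour and minute only; the date part plays no role in the keys.
result = df.groupby([hour, minute])["value_col"].sum()
print(result)
```

Rows from different dates that share an hour and minute end up in the same group, which is the behaviour resampling does not give you.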
Wes McKinney answered Nov 11 '22 19:11


Wes' code didn't work for me, but building a DatetimeIndex did:

times = pd.DatetimeIndex(df.datetime_col)
grouped = df.groupby([times.hour, times.minute])

A DatetimeIndex is pandas' index type for datetime values. The first line builds one from the datetime column. The second line reads the hour and minute of every row from it, so the rows can be grouped by those values.
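If the combined datetime is already the index of the frame, as in the question, the same idea can be applied to the index directly. This is a rough sketch under that assumption; the sample values, the `value_col` name, and the `hour`/`minute` labels are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose index is already a DatetimeIndex.
idx = pd.DatetimeIndex([
    "2013-04-26 14:05:00", "2013-04-27 14:05:00", "2013-04-27 14:06:00",
])
data = pd.DataFrame({"value_col": [10.0, 20.0, 30.0]}, index=idx)

# .hour and .minute on a DatetimeIndex give one integer per row.
grouped = data.groupby([data.index.hour, data.index.minute])

# Any aggregation can follow; mean is used here as an example.
means = grouped["value_col"].mean()
means.index.names = ["hour", "minute"]  # label the two levels for readability
print(means)
```

The two 14:05 rows come from different dates but land in one group, so their values are averaged together.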

Nix G-D answered Nov 11 '22 20:11


I came across this while searching for this type of groupby. Wes' code above didn't work for me; I'm not sure whether that's because of changes in pandas over time.

In pandas 0.16.2, what I did in the end was:

grp = data.groupby(by=[data.datetime_col.map(lambda x: (x.hour, x.minute))])
grp.count()

This gives (hour, minute) tuples as the group index. If you want a MultiIndex instead:

grp = data.groupby(by=[data.datetime_col.map(lambda x: x.hour),
                       data.datetime_col.map(lambda x: x.minute)])
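The difference between the two key styles is easier to see side by side. The following is a sketch on invented data, not a statement about what any particular pandas version prints; `datetime_col` follows the snippets above, while `value_col`, the sample rows, and the `hour`/`minute` names are assumptions. With two separate keys, the result can also be unstacked into an hour-by-minute table, which is close to the Excel pivot layout mentioned in the question.

```python
import pandas as pd

# Hypothetical sample data spanning two dates.
data = pd.DataFrame({
    "datetime_col": pd.to_datetime([
        "2015-07-01 08:15:00", "2015-07-02 08:15:00", "2015-07-02 08:45:00",
    ]),
    "value_col": [1, 1, 1],
})

# Style 1: a single key made of (hour, minute) tuples.
tuple_keys = data["datetime_col"].map(lambda x: (x.hour, x.minute))
by_tuple = data.groupby(tuple_keys)["value_col"].count()

# Style 2: two separate keys, one per level, renamed so the levels are distinct.
hours = data["datetime_col"].map(lambda x: x.hour).rename("hour")
minutes = data["datetime_col"].map(lambda x: x.minute).rename("minute")
by_levels = data.groupby([hours, minutes])["value_col"].count()

# With named levels, the minutes can be spread across columns, pivot-style.
table = by_levels.unstack("minute")
print(table)
```

Both styles produce the same groups; the two-key form just makes it easier to select or reshape by a single level afterwards.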
WillZ answered Nov 11 '22 19:11