Grouping Pandas DataFrame by date

I have a Pandas DataFrame that includes a date column. Elements of that column are of type pandas.tslib.Timestamp.

I'd like to group the dataframe by date, but exclude timestamp information that is more granular than the date (i.e. grouping by date, so that all rows from Feb 23, 2011 fall into one group). I know how to express this in SQL, but am quite new to Pandas.

This question does something very similar, but I don't understand the code and it uses datetime objects.

From the documentation, I don't even understand how to retrieve the date from a Pandas Timestamp object. I could convert to datetime object, but that seems very roundabout.

As requested, the output of df.head():

                  date              show  network  timed                            session_id
0  2011-12-03 02:48:52              Monk     TV38    670  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
1  2011-12-03 03:00:09          WBZ News     TV38    205  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
2  2011-12-03 03:04:04      Dateline NBC      NBC     30  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
3  2011-12-03 03:04:35             20/20      ABC     25  00003DA9-01D2-E7A9-4177-203BE6A9E2BA
4  2011-12-03 03:04:56  College Football      FOX     55  00003DA9-01D2-E7A9-4177-203BE6A9E2BA

asked Nov 21 '13 16:11

David Chouinard

2 Answers

You can use the DatetimeIndex method normalize, which sets each timestamp to midnight of the same day:

In [11]: df['date']
Out[11]: 
0   2011-12-03 02:48:52
1   2011-12-03 03:00:09
2   2011-12-03 03:04:04
3   2011-12-03 03:04:35
4   2011-12-03 03:04:56
Name: date, dtype: datetime64[ns]

In [12]: pd.DatetimeIndex(df['date']).normalize()
Out[12]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-12-03 00:00:00, ..., 2011-12-03 00:00:00]
Length: 5, Freq: None, Timezone: None

You can then group by the result:

g = df.groupby(pd.DatetimeIndex(df['date']).normalize())

From pandas 0.15 onward, the dt accessor is available on datetime Series, so you can write this as:

g = df.groupby(df['date'].dt.normalize())
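As a rough, self-contained sketch of the dt.normalize() approach (assuming pandas 0.15 or later): the frame below is a hypothetical stand-in for the question's data, using three rows from df.head() plus one invented row on a second day so that more than one group exists. The name timed_per_day is only illustrative.

```python
import pandas as pd

# Hypothetical sample: three rows taken from the question's df.head(),
# plus one invented row on 2011-12-04 so that two groups exist.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-12-03 02:48:52",
        "2011-12-03 03:00:09",
        "2011-12-03 03:04:04",
        "2011-12-04 10:00:00",  # invented for illustration
    ]),
    "show": ["Monk", "WBZ News", "Dateline NBC", "Monk"],
    "timed": [670, 205, 30, 100],
})

# normalize() keeps the datetime64 dtype but sets every time to midnight,
# so all rows from the same calendar day share one group key.
day = df["date"].dt.normalize()

# Sum the numeric column per day; the result is indexed by midnight Timestamps.
timed_per_day = df.groupby(day)["timed"].sum()
print(timed_per_day)
```

Because the group keys remain Timestamps, the result can still be sliced or resampled like any other time series.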

answered Oct 16 '22 12:10

Andy Hayden

It's not clear if you are trying to groupby and aggregate (like in SQL) or create an index with a date instead of a timestamp.

If you're trying to group by and aggregate, you could do it this way:

df.groupby(df.set_index('date').index.date).mean()

Time series indexes have datetime properties such as day and date. The call above aggregates the timed column, since it is the only numeric column.
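One caveat worth hedging: in recent pandas versions (2.0 and later, as far as I know), mean() no longer silently skips non-numeric columns, so selecting the numeric column explicitly is the safer route. The sketch below assumes a hypothetical sample frame (three rows from the question plus one invented row) and uses the dt.date accessor rather than set_index; the name mean_timed is illustrative only.

```python
import datetime
import pandas as pd

# Hypothetical sample: three rows from the question's df.head(),
# plus one invented row on 2011-12-04.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-12-03 02:48:52",
        "2011-12-03 03:00:09",
        "2011-12-03 03:04:35",
        "2011-12-04 10:00:00",  # invented for illustration
    ]),
    "show": ["Monk", "WBZ News", "20/20", "Monk"],
    "timed": [670, 205, 25, 100],
})

# .dt.date yields plain datetime.date objects, one per row.
by_date = df.groupby(df["date"].dt.date)

# Select the numeric column explicitly instead of relying on mean()
# to skip the string column.
mean_timed = by_date["timed"].mean()
print(mean_timed)

# Group keys are datetime.date objects, so lookups use them directly.
print(mean_timed.loc[datetime.date(2011, 12, 3)])
```

Note that the keys here are Python date objects rather than Timestamps, which reads nicely but loses the datetime-specific index features that normalize() would keep.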

If you're trying to create an index with a date level, you could do something like:

import datetime
df.set_index(['date', df.date.apply(lambda x: datetime.datetime.date(x))], inplace=True)
df.index.names = ['timestamp', 'daydate']

This gives you a MultiIndex with the timestamp and the date. If you don't want the change to be permanent, drop the inplace=True argument.
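A minimal sketch of the non-permanent variant, assuming a hypothetical sample frame (three rows from the question plus one invented row); the names daydate and indexed are illustrative. Without inplace=True, set_index returns a new frame and leaves the original untouched.

```python
import pandas as pd

# Hypothetical sample: three rows from the question's df.head(),
# plus one invented row on 2011-12-04.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-12-03 02:48:52",
        "2011-12-03 03:00:09",
        "2011-12-03 03:04:35",
        "2011-12-04 10:00:00",  # invented for illustration
    ]),
    "show": ["Monk", "WBZ News", "20/20", "Monk"],
    "timed": [670, 205, 25, 100],
})

# Build a day-level key alongside the original timestamps.
daydate = df["date"].dt.date.rename("daydate")

# set_index returns a new frame by default; df keeps its 'date' column.
indexed = df.set_index(["date", daydate])
indexed.index.names = ["timestamp", "daydate"]
print(indexed)

# The date level can then be used for grouping.
timed_by_day = indexed.groupby(level="daydate")["timed"].sum()
print(timed_by_day)
```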

answered Oct 16 '22 10:10

Patrick Russell

Tags: datetime, pandas, group-by, python-2.7
