
In pandas, how can I group by weekday() for a datetime column?

Tags:

python

pandas

I'd like to filter out weekend data and only look at data for weekdays (Mon (0) to Fri (4)). I'm new to pandas; what's the best way to accomplish this?

import pandas as pd

# Load the CSV; the my_dt column holds the timestamps shown below
data = pd.read_csv("data.csv")
data.my_dt

Out[52]:
0     2012-10-01 02:00:39
1     2012-10-01 02:00:38
2     2012-10-01 02:01:05
3     2012-10-01 02:01:07
4     2012-10-01 02:02:03
5     2012-10-01 02:02:09
6     2012-10-01 02:02:03
7     2012-10-01 02:02:35
8     2012-10-01 02:02:33
9     2012-10-01 02:03:01
10    2012-10-01 02:08:53
11    2012-10-01 02:09:04
12    2012-10-01 02:09:09
13    2012-10-01 02:10:20
14    2012-10-01 02:10:45
...

I'd like to do something like:

weekdays_only = data[data.my_dt.weekday() < 5]

but this doesn't work and raises:

AttributeError: 'numpy.int64' object has no attribute 'weekday'

I haven't quite grasped how the datetime objects in a column are accessed.

The eventual goal is to arrange the data hierarchically by weekday and hour range, something like:

monday, 0-6, 7-12, 13-18, 19-23
tuesday, 0-6, 7-12, 13-18, 19-23
asked Dec 06 '12 09:12 by monkut



1 Answer

Your call to weekday() does not work because it operates on the index of data.my_dt, which is an int64 array; that is where the error message comes from.

You could create a new column in data containing the weekday of each row, using something like:

data['weekday'] = data['my_dt'].apply(lambda x: x.weekday())
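As a possible alternative to apply, and assuming a pandas version that provides the .dt accessor, the same column can be built in a vectorized way. This is only a sketch: the sample rows below are made up to stand in for data.csv, and only the my_dt column name comes from the question. It also assumes the column may have been read as strings, so it converts it first.

```python
import pandas as pd

# Hypothetical rows standing in for data.csv; only the column name my_dt
# comes from the question.
data = pd.DataFrame({
    "my_dt": [
        "2012-10-01 02:00:39",  # a Monday
        "2012-10-06 10:15:00",  # a Saturday
        "2012-10-07 23:59:59",  # a Sunday
    ]
})

# read_csv leaves timestamps as strings unless parse_dates is given,
# so convert explicitly before using datetime accessors.
data["my_dt"] = pd.to_datetime(data["my_dt"])

# Vectorized counterpart of apply(lambda x: x.weekday()):
# Monday is 0 and Sunday is 6.
data["weekday"] = data["my_dt"].dt.weekday
```

Either way, data ends up with an integer weekday column.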

Then you can filter for weekdays with:

weekdays_only = data[data['weekday'] < 5]
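For the eventual goal in the question (weekday, then hour range), one way it might be done is to bin the hour with pd.cut and group on both keys. This is a rough sketch under assumptions: the timestamps are invented, the hour_range column name and the choice to count rows per group are mine, and the bin edges simply mirror the 0-6, 7-12, 13-18, 19-23 ranges from the question.

```python
import pandas as pd

# Hypothetical timestamps; only the column name my_dt comes from the question.
data = pd.DataFrame({
    "my_dt": pd.to_datetime([
        "2012-10-01 02:00:39",  # Monday
        "2012-10-01 08:30:00",  # Monday
        "2012-10-02 14:00:00",  # Tuesday
        "2012-10-06 20:00:00",  # Saturday, filtered out below
    ])
})

data["weekday"] = data["my_dt"].dt.weekday
weekdays_only = data[data["weekday"] < 5].copy()

# Bin the hour of day into the ranges listed in the question.
# With the default right-closed intervals, (-1, 6] covers hours 0 through 6.
weekdays_only["hour_range"] = pd.cut(
    weekdays_only["my_dt"].dt.hour,
    bins=[-1, 6, 12, 18, 23],
    labels=["0-6", "7-12", "13-18", "19-23"],
)

# Count rows per (weekday, hour range); observed=True skips empty bins.
counts = weekdays_only.groupby(["weekday", "hour_range"], observed=True).size()
```

The result is a Series with a two-level index (weekday, hour_range), so counts.unstack() would give one row per weekday and one column per hour range if a table layout is preferred.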

I hope this helps.

answered Sep 22 '22 03:09 by Maximilian