
Resample a pandas timeseries by "1st Monday of month" etc

Tags: python, pandas

I have a daily pandas time series that I need to filter down to monthly, using the nth occurrence of a particular weekday as the rule for selecting dates.

My thinking so far is that the best way to do this would be to first make a list or Series of all the dates I'm interested in, and then ask the time series for those dates.

But that still leaves the question: how do I make a list of, e.g., every "2nd Tuesday of the month" that has happened between two dates?

Corvus asked Sep 30 '14 14:09


2 Answers

Let's take September and October 2014 as an example:

from datetime import datetime
import pandas as pd

start = datetime(2014, 9, 1)
end = datetime(2014, 10, 30)
d = pd.date_range(start, end) # use bdate_range for business days

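Now you can build a mask containing only the dates you are interested in. Here weekday 1 is Tuesday (Monday is 0), and the second occurrence of any weekday always falls on days 8 to 14 of the month: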

>>> mask = (d.weekday == 1) & (7 < d.day) & (d.day < 15)
>>> d[mask]

<class 'pandas.tseries.index.DatetimeIndex'>
[2014-09-09, 2014-10-14]
Length: 2, Freq: None, Timezone: None
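
The same mask can be generalized and applied directly to a time series, which is what the question asks for. The following is a minimal sketch, not part of the original answer: the helper name nth_weekday_mask and the dummy series ts are made up for illustration, and the helper assumes the index is a DatetimeIndex.

```python
import pandas as pd

def nth_weekday_mask(index, weekday, n):
    """Boolean mask for the n-th (1-based) occurrence of `weekday` in each month.

    `weekday` follows the pandas convention: Monday=0 ... Sunday=6.
    Hypothetical helper, written only for this example.
    """
    # Days 1-7 hold the 1st occurrence of any weekday, days 8-14 the 2nd, and so on.
    return (index.weekday == weekday) & ((index.day - 1) // 7 == n - 1)

# Dummy daily series covering the same two months as above.
idx = pd.date_range("2014-09-01", "2014-10-30")
ts = pd.Series(range(len(idx)), index=idx)

# Keep only the 2nd Tuesday of each month.
second_tuesdays = ts[nth_weekday_mask(ts.index, weekday=1, n=2)]
print(second_tuesdays)
```

Because the mask is computed from the series' own index, it also works when some days are missing; a month simply yields no row if its 2nd Tuesday is absent from the data.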
elyase answered Sep 23 '22 01:09


Some dummy data:

In [44]: df = pd.DataFrame(data=range(1000), index=pd.date_range('1999-01-01', periods=1000), columns=['value'])

Let's assume you want the 2nd Tuesday of each month. You can resample to a particular day of the week using a built-in pandas offset:

In [45]: df = df.resample('W-TUE').last()  # older pandas: df.resample('W-TUE', how='last')

Then you can calculate a week-of-month column and use that to filter:

In [50]: df['wom'] = df.groupby(df.index.to_period('M')).cumcount() + 1  # older pandas: groupby(pd.TimeGrouper('M')), since removed

In [53]: df[df['wom'] == 2].head()
Out[53]: 
            value  wom
1999-01-12     11    2
1999-02-09     39    2
1999-03-09     67    2
1999-04-13    102    2
1999-05-11    130    2
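
As a cross-check on the output above, pandas also has "week of month" offset aliases of the form WOM-2TUE (2nd Tuesday of each month), which can generate the wanted dates directly. This is a sketch of an alternative, not the method used in this answer; it rebuilds the same dummy data, and the names wanted and monthly are made up for illustration.

```python
import pandas as pd

# Same dummy data as above: 1000 daily values starting 1999-01-01.
df = pd.DataFrame({"value": range(1000)},
                  index=pd.date_range("1999-01-01", periods=1000))

# "WOM-2TUE" is the pandas offset alias for the 2nd Tuesday of each month.
wanted = pd.date_range(df.index[0], df.index[-1], freq="WOM-2TUE")

# isin() rather than .loc[], so dates missing from the data are skipped
# instead of raising a KeyError.
monthly = df[df.index.isin(wanted)]
print(monthly.head())
```

Unlike the resample approach, this selects existing rows rather than aggregating weekly bins, so it only returns dates that are actually present in the data.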
chrisb answered Sep 23 '22 01:09