For a ton of dates, I need to compute the next business day, where I account for holidays.
Currently, I'm using something like the code below, which I've pasted from an IPython notebook:
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
cal = USFederalHolidayCalendar()
bday_offset = lambda n: pd.datetools.offsets.CustomBusinessDay(n, calendar=cal)
mydate = pd.to_datetime("12/24/2014")
%timeit with_holiday = mydate + bday_offset(1)
%timeit without_holiday = mydate + pd.datetools.offsets.BDay(1)
On my computer, the with_holiday line runs in ~12 milliseconds, and the without_holiday line runs in ~15 microseconds.
Is there any way to make the bday_offset function faster?
With pandas.date_range and a business-day frequency, the days returned seem to be simply the weekdays, i.e. Monday through Friday, holidays included. If you want to exclude holidays, you can combine pandas with one of its existing holiday calendar classes or with a custom calendar you define yourself.
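As a small sketch of that difference, the snippet below builds the same range twice: once with the plain business-day frequency "B", and once with a CustomBusinessDay tied to the US federal holiday calendar. The date window and variable names are chosen only for illustration.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Plain business days: Monday through Friday, holidays are NOT skipped.
weekdays_only = pd.date_range("2014-12-24", "2014-12-29", freq="B")

# Custom business days: weekdays minus the calendar's holidays.
us_bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())
without_holidays = pd.date_range("2014-12-24", "2014-12-29", freq=us_bday)

print(list(weekdays_only.strftime("%Y-%m-%d")))
print(list(without_holidays.strftime("%Y-%m-%d")))
```

The first range includes Christmas Day (2014-12-25); the second one drops it.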
Pandas has a built-in function, to_datetime(), that converts date and time strings to datetime objects. Applied to a column of strings, it returns a Series with the appropriate datetime64 dtype.
DateOffset is the standard kind of date increment used for date ranges in pandas. It accepts the same keyword arguments as dateutil's relativedelta.
I think the way you are implementing it via a lambda is slowing it down: every call to bday_offset constructs a new CustomBusinessDay. Consider this method instead (taken more or less straight from the documentation):
from pandas.tseries.offsets import CustomBusinessDay
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
mydate + bday_us
Out[13]: Timestamp('2014-12-26 00:00:00')
The first part, constructing the offset, is slow, but you only need to do it once. The second part, adding it to a date, is very fast.
%timeit bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
10 loops, best of 3: 66.5 ms per loop
%timeit mydate + bday_us
10000 loops, best of 3: 44 µs per loop
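For the "ton of dates" case, the same idea can be sketched as follows: construct the offset once, outside the loop, and reuse it for every date. The sample dates are made up for illustration.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Expensive step: do it exactly once.
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Illustrative input; in practice this would be your own dates.
dates = pd.to_datetime(["2014-12-24", "2014-12-26", "2014-12-31"])

# Cheap step: reuse the same offset object for every date.
next_bdays = [d + bday_us for d in dates]
print(next_bdays)
```

Here 2014-12-24 maps to 2014-12-26 (Christmas is skipped), 2014-12-26 maps to 2014-12-29 (the weekend is skipped), and 2014-12-31 maps to 2015-01-02 (New Year's Day is skipped).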
To get apples to apples, here are the other timings on my machine:
%timeit with_holiday = mydate + bday_offset(1)
10 loops, best of 3: 23.1 ms per loop
%timeit without_holiday = mydate + pd.datetools.offsets.BDay(1)
10000 loops, best of 3: 36.6 µs per loop
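If adding the offset date by date is still too slow for very large inputs, one alternative (not part of the answer above) is NumPy's np.busday_offset, which works on whole arrays at once. This is only a sketch: the holiday list is a short hand-written sample rather than the full federal calendar, and roll="backward" is chosen so that weekend or holiday inputs move to the next business day instead of the one after it.

```python
import numpy as np

# Hand-written sample holidays (an assumption for this sketch);
# a real list would come from your holiday calendar.
holidays = np.array(["2014-12-25", "2015-01-01"], dtype="datetime64[D]")

# Illustrative input dates: a Wednesday, a Saturday, a Wednesday.
dates = np.array(["2014-12-24", "2014-12-27", "2014-12-31"], dtype="datetime64[D]")

# Roll non-business days back to the previous business day, then step
# forward by one business day, skipping weekends and the listed holidays.
next_bdays = np.busday_offset(dates, 1, roll="backward", holidays=holidays)
print(next_bdays)
```

With these inputs the results are 2014-12-26, 2014-12-29 and 2015-01-02, which agrees with what CustomBusinessDay gives for the same dates.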