Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas: Checking if a date is a holiday and assigning boolean value

Tags:

python

pandas

I have a pandas data frame with date column, and I am trying to add a new column of boolean values indicating whether a given date is a holiday or not.

Following is the code, but it does not work (all the values are False) because the types seem to be different, and I can't figure out how to get the 'date' in the pandas data frame to be of the same type as the holidays:

cal = USFederalHolidayCalendar() holidays = cal.holidays(start=train_df['date'].min(),                         end=train_df['date'].max()).to_pydatetime() train_df['holiday'] = train_df['date'].isin(holidays) print type(train_df['date'][1]) print type(holidays[0]) 
like image 652
Gopala Avatar asked Apr 17 '15 01:04

Gopala


2 Answers

You don't need to convert anything. Just compare straight up. pandas is smart enough to compare a lot of different types with regards to dates and times. You have to have a slightly more esoteric format if you're having issues with date/time compatibility.

import pandas as pd from pandas.tseries.holiday import USFederalHolidayCalendar as calendar  dr = pd.date_range(start='2015-07-01', end='2015-07-31') df = pd.DataFrame() df['Date'] = dr  cal = calendar() holidays = cal.holidays(start=dr.min(), end=dr.max())  df['Holiday'] = df['Date'].isin(holidays) print df 

Result:

         Date Holiday 0  2015-07-01   False 1  2015-07-02   False 2  2015-07-03    True 3  2015-07-04   False 4  2015-07-05   False 5  2015-07-06   False 6  2015-07-07   False 7  2015-07-08   False 8  2015-07-09   False 9  2015-07-10   False 10 2015-07-11   False 11 2015-07-12   False 12 2015-07-13   False 13 2015-07-14   False 14 2015-07-15   False 15 2015-07-16   False 16 2015-07-17   False 17 2015-07-18   False 18 2015-07-19   False 19 2015-07-20   False 20 2015-07-21   False 21 2015-07-22   False 22 2015-07-23   False 23 2015-07-24   False 24 2015-07-25   False 25 2015-07-26   False 26 2015-07-27   False 27 2015-07-28   False 28 2015-07-29   False 29 2015-07-30   False 30 2015-07-31   False 

Note that July 4, 2015 falls on a Saturday.

like image 54
NullDev Avatar answered Sep 18 '22 23:09

NullDev


I had the same problem as the author, and the other fix provided didn't work for me. This is what did work:

train_df['holiday'] = train_df['date'].dt.date.astype('datetime64').isin(holidays) 
like image 35
Michael Hartman Avatar answered Sep 20 '22 23:09

Michael Hartman