Create indicator variables of holidays from a date column

I am still a bonehead novice, so forgive me if this is a simple question, but I can't find the answer on Stack Overflow. I would like to create a set of indicator variables for each of the major US holidays, just by applying a function to my date field that can detect which days are holidays; I could then use model.matrix() or similar to convert the result to a set of indicator variables.
For example, I have daily data from January 1st, 2012 through September 15th, 2013, and I would like to create an indicator variable for Easter.

I am currently using the timeDate package to pass a year to their function Easter() to find the date. I then type the dates into the following code to create an indicator variable.

Easter(2012)
EasterInd2012<-as.numeric(DATASET$Date=="2012-04-08")
Reginald Roberts asked Oct 02 '13 13:10



1 Answer

The easiest way to get a general holiday indicator variable is to create a vector of all the holidays you're interested in and then match those dates in your data frame. Something like this should work:

library(timeDate)

# Sample data
Date <- seq(as.Date("2012-01-01"), as.Date("2013-09-15"), by="1 day")
DATASET <- data.frame(value = rnorm(length(Date)), Date)  # one random value per day

# Vector of holidays
holidays <- c(as.Date("2012-01-01"), 
              as.Date(Easter(2013)),
              as.Date("2012-12-25"),
              as.Date("2012-12-31"))

# 1 if holiday, 0 if not. Could also be a factor, like c("Yes", "No")
DATASET$holiday <- ifelse(DATASET$Date %in% holidays, 1, 0)

You can either enter the dates manually or use timeDate's built-in holiday functions (listHolidays() returns the names of all of them). So you could also construct holidays like so:

holidays <- c(as.Date("2012-01-01"), 
              as.Date(Easter(2013)),
              as.Date(USLaborDay(2012)),
              as.Date(USThanksgivingDay(2012)),
              as.Date(USMemorialDay(2012)),
              as.Date("2012-12-25"),
              as.Date("2012-12-31"))
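
Because the sample data spans two years, it may be convenient to build the vector for every year at once. The following is only a sketch: it assumes the timeDate package is installed and relies on its holiday functions accepting a vector of years. The years variable and the particular holidays chosen are illustrative, not part of the original answer.

```r
library(timeDate)

# Years covered by the sample data (illustrative choice)
years <- 2012:2013

# Each holiday function is called once with both years,
# so every call contributes one date per year
holidays <- c(as.Date(USNewYearsDay(years)),
              as.Date(Easter(years)),
              as.Date(USMemorialDay(years)),
              as.Date(USLaborDay(years)),
              as.Date(USThanksgivingDay(years)),
              as.Date(USChristmasDay(years)))
```

The resulting holidays vector can be used with %in% exactly as in the first example.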

To get specific indicators for each holiday, you'll need to do them one at a time:

EasterInd2012 <- ifelse(DATASET$Date==as.Date(Easter(2012)), 1, 0)
LaborDay2012 <- ifelse(DATASET$Date==as.Date(USLaborDay(2012)), 1, 0)
# etc.
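
If typing one line per holiday and year gets tedious, the same idea can be wrapped in a loop. This is a rough sketch rather than a definitive approach: it assumes timeDate is installed, and the holiday_funs list, its names, and the resulting column names are made up for illustration.

```r
library(timeDate)

# Sample data, same date range as above
Date <- seq(as.Date("2012-01-01"), as.Date("2013-09-15"), by = "1 day")
DATASET <- data.frame(Date)

# Years that actually occur in the data
years <- unique(as.integer(format(DATASET$Date, "%Y")))

# Named list of timeDate holiday functions (hypothetical selection; extend as needed)
holiday_funs <- list(Easter       = Easter,
                     MemorialDay  = USMemorialDay,
                     LaborDay     = USLaborDay,
                     Thanksgiving = USThanksgivingDay,
                     Christmas    = USChristmasDay)

# Add one 0/1 column per holiday, covering every year in the data
for (h in names(holiday_funs)) {
  dates <- as.Date(holiday_funs[[h]](years))
  DATASET[[h]] <- as.integer(DATASET$Date %in% dates)
}

# Number of flagged days per holiday
colSums(DATASET[names(holiday_funs)])
```

Each column is already a 0/1 indicator, so it can go straight into a model formula without a model.matrix() step.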
Andrew answered Sep 30 '22 13:09
