I have a data frame with weekly data by Section. Each Section has approx 104 weeks worth of data and there is 83 sections in total.
I have a second data frame with the Start and End week by Section that I want to filter the main data frame on.
In both tables the Week is a combination of Year and Week e.g. 201501 and is always from weeks 1 to 52.
So in the example below I want to filter Section A by weeks 201401 to 201404, Section B by weeks 201551 to 201603.
I initially thought I could add an additional column to my Weeks_Filter data frame that is a sequential number from the start and end of the the weeks for each section (duplicating each row for each week), then merge the 2 tables and keep all the data from the Weeks_Filter table (all.y = TRUE) because this worked on a small sample I did but I don't know how to add the sequential weeks since they can span different years.
Week <- c("201401","201402","201403","201404","201405", "201451", "201552", "201601", "201602", "201603")
Section <- c(rep("A",5),rep("B",5))
df <- data.frame(cbind(Week, Section))
Section <- c("A", "B")
Start <- c("201401","201551")
End <- c("201404","201603")
Weeks_Filter <- data.frame(cbind(Section, Start, End))
The latest development version of data.table
adds non-equi joins (and in the older ones you can use foverlaps
):
setDT(df) # convert to data.table in place
setDT(Weeks_Filter)
# fix the column types - you have factors currently, converting to integer
df[, Week := as.integer(as.character(Week))]
Weeks_Filter[, `:=`(Start = as.integer(as.character(Start)),
End = as.integer(as.character(End)))]
# the actual magic
df[df[Weeks_Filter, on = .(Section, Week >= Start, Week <= End), which = T]]
# Week Section
#1: 201401 A
#2: 201402 A
#3: 201403 A
#4: 201404 A
#5: 201552 B
#6: 201601 B
#7: 201602 B
#8: 201603 B
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With