Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge/Join Data Frame / Table based on criteria - > or <

I have a data frame with weekly data by Section. Each Section has approx 104 weeks worth of data and there is 83 sections in total.

I have a second data frame with the Start and End week by Section that I want to filter the main data frame on.

In both tables the Week is a combination of Year and Week e.g. 201501 and is always from weeks 1 to 52.

So in the example below I want to filter Section A by weeks 201401 to 201404, Section B by weeks 201551 to 201603.

I initially thought I could add an additional column to my Weeks_Filter data frame that is a sequential number from the start and end of the the weeks for each section (duplicating each row for each week), then merge the 2 tables and keep all the data from the Weeks_Filter table (all.y = TRUE) because this worked on a small sample I did but I don't know how to add the sequential weeks since they can span different years.

Week <- c("201401","201402","201403","201404","201405", "201451", "201552", "201601", "201602", "201603")
Section <- c(rep("A",5),rep("B",5))
df <- data.frame(cbind(Week, Section))

Section <- c("A", "B")
Start <- c("201401","201551")
End <- c("201404","201603")
Weeks_Filter <- data.frame(cbind(Section, Start, End))
like image 536
MidnightDataGeek Avatar asked Oct 19 '22 12:10

MidnightDataGeek


1 Answers

The latest development version of data.table adds non-equi joins (and in the older ones you can use foverlaps):

setDT(df) # convert to data.table in place
setDT(Weeks_Filter)

# fix the column types - you have factors currently, converting to integer
df[, Week := as.integer(as.character(Week))]
Weeks_Filter[, `:=`(Start = as.integer(as.character(Start)),
                    End   = as.integer(as.character(End)))]

# the actual magic
df[df[Weeks_Filter, on = .(Section, Week >= Start, Week <= End), which = T]]
#     Week Section
#1: 201401       A
#2: 201402       A
#3: 201403       A
#4: 201404       A
#5: 201552       B
#6: 201601       B
#7: 201602       B
#8: 201603       B
like image 139
eddi Avatar answered Oct 21 '22 02:10

eddi