Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Join tables based on multiple ranges in R

I have a situation where I have two data frames I would like to join. The table params describes the param for a unit in terms time and angle ranges. The table data is longer and contains id, time and angle parameters.

I would like to join the param value from params when id match and time is in the range between valid_from and valid_to and ang is between angle_begin angle_end in the data table.

Below is an example of the tables.

params <- data.frame(id = 1:4
                    ,valid_from  = 1
                    ,valid_to    = c(10, 20, 30, 40)
                    ,angle_begin = c(120, 90, 0, 50)
                    ,angle_end   = c(180, 170, 160, 150)
                    ,param       = c("A", "B", "C", "D"))

data <- data.frame(id = rep(1:4, each=100)
                  ,time = rep(seq(from = 0.5, to = 50, by = 0.5), 4)
                  ,ang  = rep(runif(100, 0, 360), 4))
like image 380
Kresten Avatar asked Dec 03 '22 21:12

Kresten


1 Answers

With data.table this is a non-equi join:

library(data.table)
# coerce to data.table
setDT(params)
setDT(data)

# keep only rows of data with matches in params
data[params, 
     on = .(id, time >= valid_from, time <= valid_to, ang >= angle_begin, ang <= angle_end),
     .(id, time = x.time, ang = x.ang, param)]
    id time        ang param
 1:  1  2.0 140.383052     A
 2:  1  3.5 152.772925     A
 3:  1  8.0 141.039548     A
 4:  2  1.0 104.434264     B
 5:  2  2.0 140.383052     B
 6:  2  3.5 152.772925     B
 7:  2  8.0 141.039548     B
 8:  2 16.0 150.424306     B
 9:  2 16.5  92.201187     B
10:  ...
41:  4 22.0  89.813795     D
42:  4 22.5 131.004229     D
43:  4 26.0  79.839443     D
44:  4 27.5 128.291356     D
45:  4 29.0 127.942287     D
46:  4 30.0 136.388594     D
47:  4 32.0 140.092817     D
48:  4 32.5 108.346831     D
49:  4 37.0 140.732844     D
    id time        ang param

If all rows of data should be kept

params[data, 
       on = .(id, valid_from <= time, valid_to >= time, angle_begin <= ang, angle_end >= ang), 
       .(id, time = i.time, ang = i.ang, param)]
     id time       ang param
  1:  1  0.5 106.62639    NA
  2:  1  1.0 104.43426    NA
  3:  1  1.5  15.77429    NA
  4:  1  2.0 140.38305     A
  5:  1  2.5 322.31929    NA
 ---                        
396:  4 48.0 131.17405    NA
397:  4 48.5 335.47857    NA
398:  4 49.0 181.64450    NA
399:  4 49.5  90.96224    NA
400:  4 50.0  60.04268    NA
like image 120
Uwe Avatar answered Dec 06 '22 10:12

Uwe