
data.table join with roll = "nearest" returns "search value" instead of original value

Tags:

r

data.table

I have a problem with the binary search function J() and roll = "nearest".

Say I have this example data.table "dt":

Key  Value1  Value2
20   4       5
12   2       1
55   10      7

I do a search with roll = "nearest":

dt[J(15), roll = "nearest"]

...which returns:

Key  Value1  Value2
15   2       1

Thus, the correct row is returned. However, the original "Key" value (12) is replaced by the value used in the search (15).

My question: is this normal behaviour, and can this automatic override be changed?

EDIT:

Reproducible Example (Note I use version 1.9.7):

library("data.table")
dt <- data.table(c(20,12,55), c(4,2,10), c(5,1,7))
dt
#   V1 V2 V3
#1: 20  4  5
#2: 12  2  1
#3: 55 10  7
setkey(dt, V1)
dt[J(15), roll = "nearest"]
#   V1 V2 V3
#1: 15  2  1
Lennie asked Aug 26 '16 07:08


1 Answer

You probably need data.table 1.9.7 or later for x.V1 to work. With that version you can refer to a column of the x dataset (here dt) explicitly by prefixing its name with x. This is required because, in the result of x[i], the columns used in the join take their values from the second dataset i, as in base R.

library("data.table")
dt <- data.table(c(20,12,55), c(4,2,10), c(5,1,7))
setkey(dt, V1)
dt[J(15), .(V1=x.V1, V2, V3), roll = "nearest"]
#   V1 V2 V3
#1: 12  2  1

As you mention, you already have 1.9.7; others who don't can see the Installation wiki.
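
If the x. prefix is not available, one possible alternative is to ask the join for the matching row number with which = TRUE and then subset dt by that index, so the original key value is kept. This is only a minimal sketch on the same example data; the variable name idx is just for illustration.

```r
library("data.table")

# Same example data as above; setkey() sorts dt by V1 (12, 20, 55)
dt <- data.table(c(20, 12, 55), c(4, 2, 10), c(5, 1, 7))
setkey(dt, V1)

# which = TRUE returns the row number(s) of dt matched by the rolling join
# instead of the joined rows themselves
idx <- dt[J(15), roll = "nearest", which = TRUE]

# Subsetting by row number keeps the original V1 value (12) of the matched row
dt[idx]
```

Note that idx refers to row positions in the keyed (sorted) table, so it should be used on dt after setkey(), not on a differently ordered copy.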

jangorecki answered Dec 09 '22 17:12