
R data.table roll="nearest" not actually nearest

Tags: r, data.table

Given the following data.tables, I'm surprised to see the 5.9 index matching with 5 rather than 6.

I don't quite understand what's going on.

library(data.table)

dat <- data.table(index = c(4.3, 5.9, 1.2), datval = runif(3) + 10,
                  datstuff = "test")
reference <- data.table(index = 1:10, refjunk = "junk", refval = runif(10))

# keep a copy of the original index so it is still visible after the join
dat[, dat_index := index]
reference[dat, roll = "nearest", on = "index"]

I would expect to see 3 rows, with the index==6 row in reference matched to the index==5.9 row in dat, at least by my understanding of "nearest".

Is this the expected behaviour?

Using R 3.3.2, data.table 1.10.4

BetaScoo8 asked May 24 '17 22:05


1 Answer

Because 1:10 is a vector of integers, the join is done on integers, and as.integer(5.9) is 5. So 5.9 is truncated to 5 before the rolling match ever looks for the nearest value.
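The type difference is easy to check in base R. This is only a small illustration of the truncation described above; the variable name `idx` is made up for the example.

```r
idx <- 1:10                 # hypothetical name, same values as reference$index

class(idx)                  # "integer": the colon operator returns integers here
as.integer(5.9)             # 5: conversion truncates toward zero, it does not round
round(5.9)                  # 6: what "nearest" would suggest intuitively
class(idx + 0)              # "numeric": adding a double promotes the vector
```

Note that `1:10+0` parses as `(1:10) + 0`, because `:` binds tighter than `+`.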

You can use 1:10+0 to build a numeric vector instead:

reference <- data.table(index = 1:10+0, ref_index=1:10, refjunk = "junk", refval = runif(10))
reference[dat, roll="nearest", on="index"]

   index ref_index refjunk     refval   datval datstuff dat_index
1:   4.3         4    junk 0.09868848 10.37403     test       4.3
2:   5.9         6    junk 0.60545607 10.86906     test       5.9
3:   1.2         1    junk 0.50005336 10.07994     test       1.2
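As a variation on the same fix, `as.numeric()` makes the conversion more explicit than `+0`. The sketch below is a stripped-down version of the tables above (only the columns needed for the join; `res` is a name chosen for the example), and it checks the column types before joining so that a silent integer coercion cannot happen.

```r
library(data.table)

dat <- data.table(index = c(4.3, 5.9, 1.2))
reference <- data.table(index = as.numeric(1:10), ref_index = 1:10)

# Both join columns are double, so 5.9 is compared as 5.9 rather than 5.
stopifnot(is.double(reference$index), is.double(dat$index))

res <- reference[dat, roll = "nearest", on = "index"]
res$ref_index   # 4 6 1, the same matches as in the output above
```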
HubertL answered Nov 01 '22 05:11