
R data.table J behavior

Tags:

r

data.table

I am still puzzled by the behavior of data.table J.

> DT = data.table(A=7:3,B=letters[5:1])
> DT
   A B
1: 7 e
2: 6 d
3: 5 c
4: 4 b
5: 3 a
> setkey(DT, A, B)

> DT[J(7,"e")]
   A B
1: 7 e

> DT[J(7,"f")]
   A B
1: 7 f  # <- there is no such line in DT

But DT contains no row with A = 7 and B = "f". Why do we get this result?

Timothée HENRY asked May 20 '14 08:05


1 Answer

J(7, 'f') is literally a single-row data.table that you are joining to your own data.table. When you call x[i] with a data.table as i, data.table takes each row of i and finds all matching rows in x. By default (nomatch=NA), a row of i that matches nothing in x is still returned, with NA in every column that comes only from x. This is easier to see after adding another column to DT:

DT <- data.table(A=7:3,B=letters[5:1],C=letters[1:5])
setkey(DT, A, B)
DT[J(7,"f")]
#    A B  C
# 1: 7 f NA
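
To make the "each row of i" point concrete, here is a minimal sketch that joins with an explicit two-row data.table instead of J(). The name lookup is hypothetical and only used for illustration; one of its rows exists in DT and the other does not.

```r
library(data.table)

DT <- data.table(A = 7:3, B = letters[5:1], C = letters[1:5])
setkey(DT, A, B)

# Hypothetical lookup table: (7, "e") is in DT, (7, "f") is not.
lookup <- data.table(A = c(7L, 7L), B = c("e", "f"))

# One result row per row of `lookup`; C is NA where nothing in DT matched.
res <- DT[lookup]
res
#    A B    C
# 1: 7 e    a
# 2: 7 f <NA>
```

The key columns A and B in the result are filled from lookup, which is why the unmatched row still shows 7 and "f".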

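What you are seeing is the single row of J(7, "f"), which matches nothing in DT: the key columns A and B are filled in from i, and C is NA. In your original two-column DT every column is a key column, so no NA appears and the row looks like real data. To stop data.table from returning non-matching rows, use nomatch=0: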

DT[J(7,"f"), nomatch=0]
# Empty data.table (0 rows) of 3 cols: A,B,C
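
If the goal is only to test whether a key combination exists, the same idea can be wrapped in a row count. This is a sketch under two assumptions: that your data.table version accepts nomatch = NULL (it was added in later releases as an equivalent of nomatch = 0), and that .() is used as the shorthand for a list in i. The helper has_key_row is a hypothetical name, not part of data.table.

```r
library(data.table)

DT <- data.table(A = 7:3, B = letters[5:1], C = letters[1:5])
setkey(DT, A, B)

# Hypothetical helper: TRUE if at least one row of dt matches the key values.
# Assumes nomatch = NULL is supported; on older versions use nomatch = 0.
has_key_row <- function(dt, a, b) {
  nrow(dt[.(a, b), nomatch = NULL]) > 0L
}

found_e <- has_key_row(DT, 7, "e")  # row (7, "e") exists in DT
found_f <- has_key_row(DT, 7, "f")  # row (7, "f") does not
```

With non-matching rows dropped, an empty result means the combination is absent, so there is no need to inspect NA values.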
MattLBeck answered Oct 22 '22 12:10