
Combination of Join with Not Join in data.table?

Tags: r, data.table

My question relates to R's data.table with a multi-column key. Take this example:

library(data.table)
example(data.table)
key(DT)
[1] "x" "y"

Suppose I start from a not-join, which drops the rows where x equals "b" and y equals 3, as here:

DT[!J("b",3)]
   x y  v v2  m
1: a 1 42 NA 42
2: a 3 42 NA 42
3: a 6 42 NA 42
4: b 1  4 84  5
5: b 6  6 84  5
6: c 1  7 NA  8
7: c 3  8 NA  8
8: c 6  9 NA  8

The variation I want is "x EQUAL b and y NOT equal 3", as in here:

DT[J("b",!3)]
Error in `[.data.table`(DT, J("b", !3)) : 
  typeof x.y (double) != typeof i.V2 (logical)

Any chance of telling J() to negate some keys? Thanks!

Florian Oswald asked Feb 24 '13 18:02



1 Answer

For composite keys you can chain a join with a not-join:

 DT[.("b")][!.(x, 3)]   # x is the name of first column of key

In general, you can chain several [ ] calls together, each one filtering the result of the previous one, until you are down to the rows you need.
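As a minimal sketch of that chaining idea, the example below builds a small table of its own (a hypothetical stand-in for the DT created by example(data.table), whose contents may differ between versions) and passes on= explicitly, assuming a data.table version that supports it, so the second step does not depend on the key surviving the first subset.

```r
library(data.table)

# Hypothetical stand-in for the DT created by example(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), times = 3),
                 v = 1:9)

# Step 1: the join keeps only the rows with x == "b".
# Step 2: the not-join drops the (x = "b", y = 3) row from that result.
res <- DT[.("b"), on = "x"][!.("b", 3), on = c("x", "y")]
print(res)
```

Each bracket works on the output of the previous one, so the not-join only ever sees the "b" rows.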



Note that you can also use logical expressions directly in the i argument of a data.table.
The J() syntax, now also written .(), is simply a convenient shorthand.

You can use almost anything that would go inside an if clause, with the advantage of accessing the column names as variables.

In your specific example, you would use x == "b" & y != 3. Note the single &, not &&: the condition has to be evaluated element by element, once per row.

 DT[  x=="b" & y != 3]
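To see why the single & matters, the sketch below first evaluates the condition in j, which returns the per-row logical vector, and then uses that vector in i. The table is a hypothetical stand-in for the one in the question, and keep is an illustrative name.

```r
library(data.table)

# Hypothetical stand-in for the DT created by example(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), times = 3),
                 v = 1:9)

# & is vectorised: one TRUE/FALSE per row of DT
keep <- DT[, x == "b" & y != 3]
print(keep)

# Using the logical vector in i selects the rows flagged TRUE;
# this selects the same rows as DT[x == "b" & y != 3]
res <- DT[keep]
print(res)
```

&& is meant for single TRUE/FALSE values, as in an if() test, so it is not suitable for row-wise filtering.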

You can also combine a vector scan with data.table's binary search, using the keyed join first and the logical filter second:

 DT[.("b")][y != 3]
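A small check of that combination, again on a hypothetical stand-in table: the keyed join narrows the table to x == "b" by binary search, the chained logical filter scans only those rows, and the result should match the one-step logical subset.

```r
library(data.table)

# Hypothetical stand-in for the DT created by example(data.table)
DT <- data.table(x = rep(c("a", "b", "c"), each = 3),
                 y = rep(c(1, 3, 6), times = 3),
                 v = 1:9)
setkey(DT, x, y)

# Binary search on the first key column, then a vector scan on y
two_step <- DT[.("b")][y != 3]

# Pure vector scan over the whole table, for comparison
one_step <- DT[x == "b" & y != 3]

print(two_step)
```

On a table this small the two forms are interchangeable; the two-step form is mainly of interest when the join removes most rows before the scan.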
Ricardo Saporta answered Sep 21 '22 01:09
