
R %in% operator behavior for data.table factor columns?

Tags:

r

data.table

I can't seem to get the %in% operator to behave for data.table factor columns. I'm probably missing some data.table-specific syntax, but I can't find it, and I've searched all over.

Here's a tiny example illustrating my pain. Of course the simple answer would be to use data frames, but I have a large data set that benefits from some features of data tables.

> a <- data.table(c1=factor(c(1,2,3)))
> a
   c1
1:  1
2:  2
3:  3

> '2' %in% a[,1,with=F]
[1] FALSE

> 2 %in% a[,1,with=F]
[1] FALSE

and it works like I expect for data frames...

> b <- data.frame(c1=factor(c(1,2,3)))
> '2' %in% b[,1]
[1] TRUE

Any help appreciated....

nsymms asked Sep 21 '25 05:09


1 Answer

a[,1,with=F] is a data.table and not a vector like b[,1]. This is documented.

A data.table is a list and help("%in%") says that "lists are converted to character vectors". So, I'd guess this happens (but it's hidden in the C source code of match):

as.character(a[,1,with=F])
#[1] "1:3"
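
The same effect can be reproduced in base R with a plain list, which suggests it comes from how match treats lists rather than from anything specific to data.table. This is only a minimal sketch; `lst` is a made-up name standing in for the one-column data.table:

```r
# A plain list holding one factor, mimicking a one-column data.table
lst <- list(c1 = factor(c(1, 2, 3)))

# Each list element is converted to a single string,
# so the whole factor collapses into one value
collapsed <- as.character(lst)
print(collapsed)

# The list is compared as that one string, so "2" is not found
in_list <- "2" %in% lst
print(in_list)    # FALSE

# Extracting the factor itself gives the expected result
in_factor <- "2" %in% lst[[1]]
print(in_factor)  # TRUE
```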

You can select data.table columns efficiently with [[:

'2' %in% a[[1]]
#[1] TRUE
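
A few other ways to get the column as a vector should behave the same way. This sketch assumes the data.table package is installed and reuses the `a` from the question; the result names (`by_position` and so on) are only illustrative:

```r
library(data.table)

a <- data.table(c1 = factor(c(1, 2, 3)))

# Each of these extracts the factor column itself,
# not a one-column data.table
by_position <- "2" %in% a[[1]]
by_name     <- "2" %in% a$c1
by_j        <- "2" %in% a[, c1]

# A numeric 2 also matches, because match compares factors as character
numeric_two <- 2 %in% a[[1]]

print(c(by_position, by_name, by_j, numeric_two))
```

All four should come out TRUE. `a[, c1]` works because data.table evaluates an unquoted column name in `j` and returns the column as a vector.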
Roland answered Sep 22 '25 19:09