
Subset data.table by logical column

I have a data.table with a logical column. Why can't the name of the logical column be used directly as the i argument? See the example:

    library(data.table)

    dt <- data.table(x = c(TRUE, TRUE, FALSE, TRUE), y = 1:4)

    # Works
    dt[dt$x]
    dt[!dt$x]

    # Works
    dt[x == TRUE]
    dt[x == FALSE]

    # Does not work
    dt[x]
    dt[!x]
djhurio asked Apr 24 '13 11:04



2 Answers

From ?data.table

Advanced: When i is a single variable name, it is not considered an expression of column names and is instead evaluated in calling scope.

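So dt[x] will try to evaluate x in the calling scope (in this case the global environment). No object named x exists there, so the lookup fails.

You can get around this by wrapping the name in (, { or force, each of which turns i into an expression rather than a single variable name:

    dt[(x)]
    dt[{x}]
    dt[force(x)]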
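To make the scoping rule quoted above concrete, here is a minimal sketch, assuming data.table is installed. The variable `keep` is a made-up name used only for illustration; it is not part of the question.

```r
library(data.table)

dt <- data.table(x = c(TRUE, TRUE, FALSE, TRUE), y = 1:4)

# `keep` is a hypothetical variable in the calling scope, not a column of dt.
keep <- c(1L, 3L)

# A bare name in i is looked up in the calling scope,
# so this selects rows 1 and 3 by position.
by_scope <- dt[keep]

# Wrapping the column name in parentheses makes i an expression,
# which is evaluated against the columns of dt.
by_column <- dt[(x)]

# The negation can go inside the parentheses as well.
not_x <- dt[(!x)]
```

If the calling scope happened to contain its own object named x, dt[x] would silently use that object instead of the column, which is one reason to prefer the explicit forms.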
mnel answered Sep 20 '22 17:09


x is not defined in the global environment. If you make the columns visible to the calling scope, for example with with(), it works:

    > with(dt, dt[x])
          x y
    1: TRUE 1
    2: TRUE 2
    3: TRUE 4

Or with attach():

    > attach(dt)
    > dt[!x]
           x y
    1: FALSE 3

EDIT:

According to the documentation, the j argument accepts a column name. In fact:

    > dt[x]
    Error in eval(expr, envir, enclos) : object 'x' not found
    > dt[j = x]
    [1]  TRUE  TRUE FALSE  TRUE

The i argument, in turn, takes a numeric or logical expression, which x itself should qualify as. However, a bare x in i is still looked up in the calling scope. Wrapping it in a call such as as.logical() makes it an expression, so it is evaluated against the columns:

    > dt[i = x]
    Error in eval(expr, envir, enclos) : object 'x' not found
    > dt[i = as.logical(x)]
          x y
    1: TRUE 1
    2: TRUE 2
    3: TRUE 4
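The difference between the failing and working forms comes down to whether i is a "single variable name" in the sense of the documentation quoted earlier. The sketch below uses only base R to show how each form is parsed. It says nothing about data.table's internals, and the result names (`bare`, `parens` and so on) are illustrative only.

```r
# quote() captures an expression without evaluating it,
# so no object named x needs to exist here.

# A bare x is a symbol (a single variable name).
bare <- is.name(quote(x))

# Each of these forms is a call, not a single name.
parens  <- is.call(quote((x)))
logical <- is.call(quote(as.logical(x)))
compare <- is.call(quote(x == TRUE))

print(c(bare = bare, parens = parens, logical = logical, compare = compare))
```

All four values are TRUE: only the bare x is a name, while (x), as.logical(x) and x == TRUE are calls, which matches the forms that work inside dt[...].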
Michele answered Sep 20 '22 17:09