Here is an example that shows what I mean:
require(data.table)
x = data.table(a=1:10, idx=sample(c(TRUE, FALSE), 10, replace=TRUE))
x[idx]
Error in eval(expr, envir, enclos) : object 'idx' not found
However, the following works:
x[idx[]]
#    a  idx
#1:  2 TRUE
#2:  5 TRUE
#3:  7 TRUE
#4:  9 TRUE
#5: 10 TRUE
Any idea what's happening here?
Quoting from the link that @GSee provided in the comments:
Hi, Yes expected. From ?data.table: "Advanced: When i is a single variable name, it is not considered an expression of column names and is instead evaluated in calling scope." Subsetting by a logical column is the only example I can think of where this is confusing. But we make use of this feature quite a lot e.g. TMP=list(...);DT[TMP] safe in the knowledge that DT[TMP] won't start to fail if DT in future has a column called TMP. When I have a logical column boolCol I wrap with (): DT[(boolCol)].
This avoids the memory allocation and scan of ==TRUE, and avoids the variable name repetition of DT[DT$boolCol].
Matthew
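To make the quoted advice concrete, here is a minimal sketch of the three ways to subset by a logical column, plus the calling-scope behaviour that a bare variable name in i triggers. The names DT, boolCol and keep are illustrative only, and the column values are fixed (rather than sampled as in the question) so the result is reproducible.

```r
library(data.table)

# Illustrative table; boolCol is a logical column with fixed values
DT <- data.table(a = 1:6, boolCol = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE))

# Wrapping the name in () makes i an expression, so it is evaluated
# against the columns of DT instead of the calling scope
r1 <- DT[(boolCol)]

# An explicit comparison is also an expression, so the column is found
r2 <- DT[boolCol == TRUE]

# Referring to the column through DT$ works too, but repeats the table name
r3 <- DT[DT$boolCol]

# A single variable name in i is looked up in the calling scope:
# keep is an ordinary integer vector here, so it selects rows 1 and 3
keep <- c(1L, 3L)
r4 <- DT[keep]
```

In this sketch r1, r2 and r3 all contain the rows where boolCol is TRUE, while r4 contains rows 1 and 3 because keep is resolved outside DT.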