Programmatic subsetting of a data.table in R

Tags: r, data.table

This feels like a really simple question, but the solution has eluded me through about 90 minutes of trying, searching, and reading manuals and online posts.

Say I've got a data.table:

library(data.table)
DT <- data.table(a = runif(n = 10), b = runif(n = 10), c = runif(n = 10))

Clearly something like this works:

DT[a > 0.5]

and gives me the subset of DT where the values in column "a" are greater than 0.5. But what if I want to be a bit more flexible (because the subset is embedded in a larger routine)?

What I'd like to do is make this proto-function work:

flexSubset<-function(sColumnToSubset,dMin){
subs<-DT[sColumnToSubset>dMin]
return(subs)
}

I've tried, without success, with=FALSE, among many other things.

Any suggestions? Many thanks for your time in advance!

russfx asked Apr 01 '15 16:04


1 Answer

If you want to pass a string, then do this:

flexSubset = function(sColumnToSubset, dMin)
                DT[get(sColumnToSubset) > dMin]

flexSubset("a", 0.5)
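
For what it's worth, here is a minimal sketch of the same string-based idea with the table passed in as an argument instead of read from the global DT. The name flexSubsetIn, the dt parameter, the set.seed call and the column check are assumptions added for illustration; they are not part of the question.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
DT <- data.table(a = runif(10), b = runif(10), c = runif(10))

# Hypothetical variant of flexSubset that takes the table as an argument.
flexSubsetIn <- function(dt, sColumnToSubset, dMin) {
  # Fail early if the column name is not present in the table.
  stopifnot(sColumnToSubset %in% names(dt))
  # get() looks the string up as a column name inside dt's scope.
  dt[get(sColumnToSubset) > dMin]
}

res <- flexSubsetIn(DT, "a", 0.5)
```

The result should match DT[a > 0.5]. One caveat: if dt happened to have a column literally named sColumnToSubset or dMin, that column would shadow the function argument inside the brackets.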

If you want to pass an unevaluated expression, then:

flexSubset = function(sColumnToSubset, dMin) {
                lhs = substitute(sColumnToSubset)
                DT[eval(lhs) > dMin]
             }

flexSubset(a, 0.5)
flexSubset(a / b, 0.5)
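
To make it a bit clearer what substitute() is doing here, a small sketch under the same assumptions as the question's DT. The name flexSubsetExpr, the dt parameter and the set.seed call are hypothetical additions for illustration only.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
DT <- data.table(a = runif(10), b = runif(10), c = runif(10))

# Hypothetical variant that takes the table as an argument.
flexSubsetExpr <- function(dt, sColumnToSubset, dMin) {
  # substitute() captures the argument as typed by the caller,
  # e.g. the symbol a or the call a / b, without evaluating it.
  lhs <- substitute(sColumnToSubset)
  # eval(lhs) is then evaluated inside dt, where a and b are columns.
  dt[eval(lhs) > dMin]
}

res_col  <- flexSubsetExpr(DT, a, 0.5)
res_expr <- flexSubsetExpr(DT, a / b, 0.5)

# What substitute() captured in each case:
captured <- (function(x) substitute(x))(a / b)
class(captured)  # a "call", not a number
```

Because the argument is never evaluated in the caller's scope, this form only works when the column or expression is typed directly in the call; if the column name arrives in a variable, the get() version is the one to use.
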

eddi answered Nov 03 '22 06:11