
How to evaluate a string to filter an R data.table?

Tags: r, data.table

I was hoping for some help with passing a string of filter criteria into a data.table. I've tried all manner of parse and eval combinations and can't figure it out.

I tried to recreate an example using the iris dataset:

iris <- data.table(iris)

vars <- 'setosa'                    
filter <- 'Species == vars & Petal.Length >= 4'

data <- iris[filter, 
             list(sep.len.tot = sum(Sepal.Length), sep.width.total = sum(Sepal.Width)), 
             by = 'Species']

So the filter string has a vars variable within it (that changes based on a loop). I'm trying to filter the data based on the filter string.

Is there a data.table specific method of evaluating the string?

Hope that makes sense!

AlexP asked Jun 15 '16 16:06


2 Answers

I think eval(parse(text = ...)) will work; you just need a few modifications. Try this:

library(data.table)
iris <- data.table(iris)

# Put literal quotes inside the string so the parsed code compares against "setosa"
vars <- '"setosa"'
# Build the filter with paste0() so vars can change on each iteration
filter <- paste0('Species == ', vars, ' & Petal.Length >= 4')

res <- iris[eval(parse(text=filter)), list(
  sep.len.tot = sum(Sepal.Length)
  , sep.width.total = sum(Sepal.Width)
), by = 'Species']

A few notes: I changed vars so that the quotes are part of the string (without them, the pasted text would read Species == setosa and R would look for an object named setosa), and I built filter with paste0 so that vars can change dynamically.
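As a side note on the quoting step, here is a minimal sketch of one way to avoid typing the escaped quotes by hand. It relies on base R's deparse(), which turns a character value into its quoted source text. The helper name make_filter is hypothetical and not part of the original answer.

```r
# Hypothetical helper (an illustration, not from the original answer).
make_filter <- function(value) {
  # deparse() converts the value setosa into the source text "setosa",
  # adding (and escaping, if needed) the surrounding quotes.
  paste0("Species == ", deparse(value), " & Petal.Length >= 4")
}

filter <- make_filter("setosa")
cat(filter, "\n")

# The string now parses into a single call that can be evaluated later.
parsed <- parse(text = filter)[[1]]
```

The same helper can be called inside a loop with a different value each time, which seems to match what the question describes.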

Finally, for explanatory purposes: the result is empty, because no setosa rows have Petal.Length >= 4. To see the approach work, we can just remove the last condition.

filter <- paste0('Species==',vars)
res2 <- iris[eval(parse(text=filter)), list(
  sep.len.tot = sum(Sepal.Length)
  , sep.width.total = sum(Sepal.Width)
), by = 'Species']

res2
   Species sep.len.tot sep.width.total
1:  setosa       250.3           171.4

EDIT: Per @Frank's comment below, a cleaner approach is to write the whole thing as an expression:

filter <- substitute(Species == vars, list(vars = "setosa"))

res <- iris[eval(filter), list(
  sep.len.tot = sum(Sepal.Length)
  , sep.width.total = sum(Sepal.Width)
), by = 'Species']
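
Since the question mentions that vars changes in a loop, the expression-based approach might be extended along these lines. This is only a sketch: the vector species_values and the use of lapply() with rbindlist() are assumptions for illustration, as the question does not show the actual loop.

```r
library(data.table)

dt <- as.data.table(iris)

# Hypothetical loop values; the question only says that vars changes in a loop.
species_values <- c("setosa", "versicolor", "virginica")

results <- lapply(species_values, function(v) {
  # Build the filter as an unevaluated call, with the current value
  # substituted in place of the symbol v.
  cond <- substitute(Species == v & Petal.Length >= 4, list(v = v))
  dt[eval(cond),
     list(sep.len.tot = sum(Sepal.Length), sep.width.total = sum(Sepal.Width)),
     by = "Species"]
})

# Stack the per-value results; values with no matching rows contribute nothing.
combined <- rbindlist(results)
```

No string parsing is involved here, so there is no quoting to get wrong.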

Mike H. answered Oct 28 '22 04:10


The simplest way I found:

library(dplyr)  # for %>% and filter(); rlang must also be installed

treat_string_as_expr = rlang::parse_expr

grid = list(
  params = list(
    SMA_20 = c(20, 30, 50, 100),
    SMA_40 = c(30, 40, 50, 100, 200),
    slope = c(10, 20, 30),
    cons_ubw = c(2, 3, 5),
    cons_blw = c(2, 3, 5)
  ),
  filter = "SMA_20 > SMA_40 & cons_ubw == cons_blw"
)

expand.grid(grid$params) %>%
  dplyr::filter(!!treat_string_as_expr(grid$filter))
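
The snippet above uses dplyr rather than data.table. A rough data.table counterpart of the same idea, applied to the iris example from the question, might look like the sketch below. It swaps in base R's str2lang() (available from R 3.6.0), which parses a string into a single call much as rlang::parse_expr does. The filter string is made up for illustration.

```r
library(data.table)

dt <- as.data.table(iris)

# Illustrative filter string (an assumption, not taken from the answers).
filter_string <- 'Species == "versicolor" & Petal.Length >= 4'

# Parse the string once into an unevaluated call.
cond <- str2lang(filter_string)

# data.table evaluates the call against the table's columns.
res <- dt[eval(cond),
          list(sep.len.tot = sum(Sepal.Length), sep.width.total = sum(Sepal.Width)),
          by = "Species"]
```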

Peter Trcka answered Oct 28 '22 03:10