I am having trouble filtering a dplyr tbl_df (the same applies to a regular data.frame). I have a large tbl_df (500 x 30K) and need to filter it. What I would like to do is:
filter(my.tbl_df, row1>0, row10<0)
which would be similar to
df[df$row1>0 & df$row10<0,]
This works well. However, I need to build the filter conditions dynamically at run time, so I need to refer to the data.frame/tbl_df columns through one or more variables. I tried something like:
var=c("row1","row10")
op=c(">","<")
val=c(0,0)
filter(my.tbl_df, eval(parse(text=paste(var,op,val,sep=""))))
This gives me an error: not compatible with LGLSXP. It seems to be deeply rooted in the C++ code.
I would be thankful for any hint. Pointing out how to do the "string to environment variable" conversion properly would also be helpful, since I am pretty sure that I am doing it wrong.
With best,
Mario
This is related to this issue. In the meantime, one way could be to construct the whole expression, i.e.:
> my.tbl_df <- data.frame( row1 = -5:5, row10 = 5:-5)
> call <- parse( text = sprintf( "filter(my.tbl_df, %s)", paste(var,op,val, collapse="&") ) )
> call
expression(filter(my.tbl_df, row1 > 0&row10 < 0))
> eval( call )
  row1 row10
1    1    -1
2    2    -2
3    3    -3
4    4    -4
5    5    -5
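If you would rather not build the whole call as a string, the same filter can probably be expressed in base R alone. The following is only a sketch under a few assumptions: it reuses var, op and val from the question and the example data from above, the names conds, keep, result and result2 are made up for illustration, and it uses plain data.frame indexing instead of dplyr's filter.

```r
# Example data and filter specification, as in the question and the answer above
my.tbl_df <- data.frame(row1 = -5:5, row10 = 5:-5)
var <- c("row1", "row10")
op  <- c(">", "<")
val <- c(0, 0)

# Build one logical vector per condition. match.fun() looks up the
# operator function (e.g. `>`) from its name, so no text is parsed.
conds <- Map(function(v, o, x) match.fun(o)(my.tbl_df[[v]], x), var, op, val)

# Combine all conditions with a logical AND and subset the rows
keep <- Reduce(`&`, conds)
result <- my.tbl_df[keep, ]

# Alternative that keeps the string approach: collapse the pieces into ONE
# condition string and evaluate it with the data frame as the environment,
# so that row1 and row10 resolve to its columns.
cond_string <- paste(var, op, val, collapse = " & ")
result2 <- my.tbl_df[eval(parse(text = cond_string), my.tbl_df), ]
```

Note that paste(var, op, val, sep="") without collapse returns one string per condition, so parse() produces an expression with two elements rather than a single combined condition. Collapsing with " & " first, as in the answer above, gives one logical expression.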