Filter data.table on same condition for multiple columns

Tags: r, data.table

I am using a vector of column names to select a subset of columns of a data.table. I was wondering whether it is possible to define a condition in i that is then applied to all of the selected columns. For example, using the mtcars dataset, I would like to select the columns cyl and gear and then keep only the cars that have four cylinders and four gears. Of course I would also need to specify whether the conditions are combined with AND or OR, but for now I am just interested in whether the idea can be applied in the data.table context.

# Working code
library(data.table)

sel.col <- c("cyl", "gear")
dt <- data.table(mtcars[1:4, ])

dt[, ..sel.col]
dt[cyl == 4 & gear == 4, ..sel.col]


# Non-working code
dt[ sel.col == 4 , ..sel.col]
hannes101 asked Feb 06 '18 11:02


1 Answer

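We could use get, with the name of the column to filter on stored in its own variable (sel.col keeps both columns to return)

filter.col <- "cyl"
dt[get(filter.col) == 4, ..sel.col]
#    cyl gear
#1:   4    4

or eval(as.name())

dt[eval(as.name(filter.col)) == 4, ..sel.col]
#    cyl gear
#1:   4    4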

The examples above evaluate only a single column. If there is more than one column to check, specify them in .SDcols, loop over the Subset of Data.table (.SD), and compare each column with the value of interest (4). Then Reduce the resulting list to a single logical vector with |, i.e. TRUE for a row if any of the columns matches (use & instead to require that all columns match), and subset the rows based on this vector

dt[dt[, Reduce(`|`, lapply(.SD, `==`, 4)),.SDcols = sel.col], ..sel.col]
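
Since the question also asks about choosing between AND and OR, here is a minimal sketch of both variants side by side. It assumes the same four-row mtcars subset as in the question; the names any.match and all.match are introduced here only for illustration.

```r
library(data.table)

sel.col <- c("cyl", "gear")
dt <- data.table(mtcars[1:4, ])

# OR: TRUE for rows where at least one selected column equals 4
any.match <- dt[, Reduce(`|`, lapply(.SD, `==`, 4)), .SDcols = sel.col]

# AND: TRUE for rows where every selected column equals 4
all.match <- dt[, Reduce(`&`, lapply(.SD, `==`, 4)), .SDcols = sel.col]

dt[any.match, ..sel.col]   # 3 rows: the first three cars have gear == 4
dt[all.match, ..sel.col]   # 1 row: cyl == 4 and gear == 4
```

Only the operator passed to Reduce changes between the two cases, so the same pattern should extend to any number of columns listed in sel.col.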
akrun answered Sep 19 '22 15:09