I am using a vector of column names to select a subset of columns of a data.table. Is it possible to define a condition in i that is then applied to all of the selected columns?
For example, using the mtcars dataset, I would like to select the columns cyl and gear and then filter for all cars that have four cylinders and four gears. Of course I would also need to specify whether the conditions are combined with "and" or "or", but for now I am just interested in whether the idea can be applied in the data.table context at all.
# working code
library(data.table)
sel.col <- c("cyl", "gear")
dt <- data.table(mtcars[1:4, ])
dt[, ..sel.col]
dt[cyl == 4 & gear == 4, ..sel.col]
# Non-working code
dt[ sel.col == 4 , ..sel.col]
We could use get when there is a single column to test:
sel.col <- "cyl"
dt[get(sel.col) == 4, ..sel.col]
#   cyl
#1:   4
or eval(as.name(...)):
dt[eval(as.name(sel.col)) == 4, ..sel.col]
#   cyl
#1:   4
The above assumes there is only a single column to evaluate. If there is more than one column, specify them in .SDcols, loop through the Subset of Data.table (.SD), compare each column with the value of interest (4), and Reduce the results to a single logical vector with |, i.e. TRUE for a row if any of the selected columns matches. Use & instead of | to require that all of them match. Then subset the rows based on this vector:
sel.col <- c("cyl", "gear")
dt[dt[, Reduce(`|`, lapply(.SD, `==`, 4)), .SDcols = sel.col], ..sel.col]
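Since the question asked for cars with four cylinders and four gears, here is a minimal sketch that puts the "or" and "and" variants side by side on the same four rows. The names any.match, all.match, res.any and res.all are only illustrative and not part of the original answer.

```r
library(data.table)

# First four rows of mtcars, as in the question
dt <- data.table(mtcars[1:4, ])
sel.col <- c("cyl", "gear")

# "or": TRUE for rows where at least one selected column equals 4
any.match <- dt[, Reduce(`|`, lapply(.SD, `==`, 4)), .SDcols = sel.col]

# "and": TRUE only for rows where every selected column equals 4
all.match <- dt[, Reduce(`&`, lapply(.SD, `==`, 4)), .SDcols = sel.col]

# Use the logical vectors in i and the column vector in j
res.any <- dt[any.match, ..sel.col]
res.all <- dt[all.match, ..sel.col]
```

With this data, res.all should hold the single row with cyl 4 and gear 4 (the same row the hard-coded filter cyl == 4 & gear == 4 returns), while res.any also keeps the two six-cylinder cars that have four gears.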