This is almost a duplicate of another question. I want to drop columns from a data.table, but I want to do it efficiently. I have a list of the names of the columns that I want to keep. All the answers to the linked question imply doing something akin to
data.table.new <- data.table.old[, my.list, with = FALSE]
which at some crucial point will give me a new object while the old object is still in memory. However, my data.table.old is huge, and hence I prefer to do this by reference, as suggested here:
set(data.table.old, j = 'a', value = NULL)
However, as I have a whitelist of columns rather than a blacklist, I would need to iterate through all the column names, check whether each one is in my.list, and then apply set(). Is there a cleaner way of doing this?
I am not sure whether you can do by-reference operations on a data.frame without converting it to a data.table.
The code below should work if you are willing to use data.table.
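library(data.table)
setDT(data.frame.old)  # convert to a data.table by reference, without copying
# columns that are not in the whitelist
dropcols <- setdiff(names(data.frame.old), my.list)
data.frame.old[, (dropcols) := NULL]  # drop those columns by reference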
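To tie this back to the set() call from the question, here is a minimal sketch on a toy table. The table dt and its columns a to d are made up for illustration, and the sketch assumes that set() takes a character vector of column names in j, so no explicit loop over the names should be needed. Comparing address() before and after is one way to check that the table was modified in place rather than copied.

```r
library(data.table)

# Toy data standing in for the large table (hypothetical columns a..d).
dt <- data.table(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
my.list <- c("a", "c")  # whitelist of columns to keep

addr.before <- address(dt)  # memory address of the table before dropping

# Turn the whitelist into a blacklist: every column not in my.list.
dropcols <- setdiff(names(dt), my.list)

# Drop all unwanted columns in one call, by reference.
# The length check skips the call when there is nothing to drop.
if (length(dropcols) > 0L) set(dt, j = dropcols, value = NULL)

print(names(dt))                            # remaining column names
print(identical(addr.before, address(dt)))  # TRUE means no copy was made
```

The setdiff() step is what turns the whitelist into the blacklist that set() or := needs, so the same line works whichever of the two you prefer for the actual deletion.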