
R: Drop columns from data.table, by reference, without having the name

Tags:

r

data.table

This is almost a duplicate of an existing question. I want to drop columns from a data.table, but I want to do it efficiently. I have a list of names of the columns that I want to keep. All the answers to the linked question suggest something akin to

data.table.new <- data.table.old[, my.list, with = FALSE]

which at some point creates a new object while the old object is still in memory. However, my data.table.old is huge, so I would prefer to do this by reference, as suggested here:

set(data.table.old, j = 'a', value = NULL)

However, as I have a whitelist of columns and not a blacklist, I would need to iterate through all the column names, check whether each one is in my.list, and then apply set(). Is there a cleaner way of doing this?
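
For concreteness, the loop described above might look roughly like the sketch below. The toy table and the contents of my.list are made up for illustration. The call to copy() around names() is a defensive assumption, so that the loop iterates over a snapshot of the names rather than over a vector that data.table may alter in place as columns are removed.

```r
library(data.table)

# Hypothetical stand-ins for the objects in the question
data.table.old <- data.table(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
my.list <- c("a", "c")  # whitelist: columns to keep

# Iterate over a snapshot of the column names
for (col in copy(names(data.table.old))) {
  if (!(col %in% my.list)) {
    # Remove the column by reference
    set(data.table.old, j = col, value = NULL)
  }
}

names(data.table.old)
# "a" "c"
```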

FooBar asked Sep 20 '25 09:09


1 Answer

I am not sure whether you can do by-reference operations on a data.frame without converting it to a data.table.
The code below should work if you are willing to use data.table.

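library(data.table)
# Convert to a data.table by reference (no copy is made)
setDT(data.frame.old)
# Columns that are not in the whitelist my.list
dropcols <- setdiff(names(data.frame.old), my.list)
# Delete them by reference
data.frame.old[, (dropcols) := NULL]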
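
To check that the columns are really dropped in place, one option is to compare the object's address before and after. This is only a sketch on made-up toy data. address() is exported by data.table, and the address is taken after setDT() because the conversion itself may reallocate the column list.

```r
library(data.table)

# Hypothetical toy data standing in for the real objects
data.frame.old <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
my.list <- c("a", "c")  # whitelist: columns to keep

setDT(data.frame.old)
addr.before <- address(data.frame.old)

# Everything that is not whitelisted gets dropped
dropcols <- setdiff(names(data.frame.old), my.list)
data.frame.old[, (dropcols) := NULL]

names(data.frame.old)
# "a" "c"

# Same address before and after means no new object was created
same.object <- identical(address(data.frame.old), addr.before)
```

If same.object is TRUE, the table was modified in place rather than copied.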
jangorecki answered Sep 21 '25 23:09