
Remove every column but some of them in data.table [duplicate]

Tags: r, data.table

I'm not sure whether this has already been asked. It seems like it should be a common question, but I haven't been able to find anything about it despite searching. Apologies if it is a duplicate.

Given dt <- data.table(col1 = c(1, 2, 3, 4), col2 = c("a", "b", "c", "d"), col3 = c(T, F, T, F)):

  • You can select multiple columns using dt[, c("col1", "col2")]
  • You can select every column except col1 and col2 using dt[, -c("col1", "col2")]
  • You can delete a column using dt[, "col1" := NULL]
  • You can delete multiple columns using dt[, c("col1", "col2") := NULL]
  • You can't delete every column except col1 using dt[, -"col1" := NULL]
  • Nor can you delete every column except col1 and col2 using dt[, -c("col1", "col2") := NULL]

I'm pretty sure there must be a way to achieve the last two, but I haven't managed to find it so far. Could you please give me some advice? I'm not new to programming, and I know a little R (it's not my strongest language), but I'm fairly new to data.table.

Thanks everyone.

EDIT: This question is answered at the following link, although that thread's title doesn't mention this problem, so it is hard to find if you are searching for this specific issue:

How do I subset column variables in DF1 based on the important variables I got in DF2?

sneaky_lobster asked Mar 05 '23 22:03


1 Answer

One option is to use setdiff to find the columns that are not wanted and assign them to NULL, which removes them from the original dataset by reference:

dt[, setdiff(names(dt), "col1") := NULL][] 
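The same pattern should extend to the second case in the question, keeping col1 and col2 and dropping everything else. Below is a minimal sketch, assuming the dt from the question; the names keep, unwanted, and dt_backup, as well as the copy() call and the length guard, are illustrative additions and not part of the original answer:

```r
library(data.table)

dt <- data.table(col1 = c(1, 2, 3, 4),
                 col2 = c("a", "b", "c", "d"),
                 col3 = c(TRUE, FALSE, TRUE, FALSE))

# Illustrative vector: the columns to keep
keep <- c("col1", "col2")

# := modifies dt by reference, so take a copy first if the
# original columns may still be needed later
dt_backup <- copy(dt)

# Every column name not listed in `keep`
unwanted <- setdiff(names(dt), keep)

# Assigning NULL deletes those columns in place; the guard skips
# the call when there is nothing to remove
if (length(unwanted) > 0L) dt[, (unwanted) := NULL]
```

Afterwards names(dt) contains only col1 and col2, while dt_backup still has all three columns. If creating a new object is acceptable, plain selection with dt[, ..keep] or dt[, keep, with = FALSE] avoids the in-place deletion altogether.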
akrun answered Mar 25 '23 07:03