Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I exclude columns from a data.table?

Tags:

r

data.table

I have a data.table, and want to exclude some set of columns. For example,

library(data.table)
dt <- data.table(a = 1:2, b = 2:3, c = 3:4, d = 4:5)
dt[ , .(b, c)]

Gives me the second and third column, b and c. How do I instead EXCLUDE columns b and c. Coming from the data.frame world, I would expect something like the following:

dt[ , -.(b, c)]

or, maybe

dt[ , !.(b, c)]

But neithr of these work. I know I can use

dt[ , -c(2:3), with = FALSE]

but this just (as I understand it) casts the data.table as a data.frame and then uses the standard operations. I would like to avoid this, since it is a) kind of cheating, an b) gives up the speed boosts available in data.table. I reviewed the data.table FAQ, and the vignette, and cannot seem to find anything.

(I know this is all very simplistic, and I could just select the other two columns. However, this is a microcosm of a much, MUCH bigger data.table I am working with.)

like image 967
lukehawk Avatar asked May 13 '16 12:05

lukehawk


People also ask

How do I remove a column from a data table?

Using the "Excel" actions, write the data table to a Microsoft Excel worksheet ("Write to Excel Worksheet" action). Use the "Delete Row/Column from Excel Worksheet" action to delete the desired column. Store the result into another data table variable with the "Read from Excel Worksheet" action.

How do I exclude a column from data in R?

The most easiest way to drop columns is by using subset() function. In the code below, we are telling R to drop variables x and z. The '-' sign indicates dropping variables. Make sure the variable names would NOT be specified in quotes when using subset() function.

How do I exclude data in Spotfire?

To add an Exclude Columns transformation to data that is already loaded into Spotfire: Select Data > Transform data.... In the Data table drop-down list, select the data table you want to add the transformation to. Select Exclude columns from the drop-down list and click Add....


5 Answers

Also, in case you would not wish to change the data.table, but merely return the columns except some columns, you can do:

dt[,.SD, .SDcols = !c('b', 'c')]

which returns the required result of:

   a d
1: 1 4
2: 2 5

while dt remains unchanged:

> dt
   a b c d
1: 1 2 3 4
2: 2 3 4 5
like image 76
ira Avatar answered Oct 12 '22 00:10

ira


We can use setdiff

dt[, setdiff(names(dt), c("b", "c")), with = FALSE]

or we can assign to NULL (as in the other answer) but in a single step

dt[, c("b", "c") := NULL][]
like image 37
akrun Avatar answered Oct 11 '22 23:10

akrun


You can do:

  dt[ , b := NULL][ , c := NULL]

or you can use a list of columns to be removed:

xx <- c("b","c") # vector of columns you DON'T want

# subset
  dt <- dt[, !xx, with = FALSE]
like image 35
rafa.pereira Avatar answered Oct 11 '22 23:10

rafa.pereira


you can always just do:

dt[ , -c("b", "c")]

although this uses the data.fame sintax and as the problems you describe, particularly it seems to be much slower on large data sets.

like image 40
cach dies Avatar answered Oct 12 '22 01:10

cach dies


Another way using set:

set(dt,, c("b", "c"), NULL)
like image 30
Deb Avatar answered Oct 12 '22 01:10

Deb