
How to select columns programmatically in a data.table?


I have the following data.table (DT):

DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9) 

I would like to select a subset of the columns programmatically (dynamically), using an object that stores the relevant column names. For example, I want to select the two columns "V1" and "V3", whose names are stored in a variable "keep":

keep <- c("V1", "V3") 

If DT were a data.frame, the following would select the "keep" columns:

DT[keep] 

Unfortunately, this does not work when DT is a data.table. I thought data.frame and data.table behaved identically here, but apparently they don't. Can anybody advise on the correct syntax?

Jochem asked Apr 25 '13 11:04




1 Answer

This is covered in the data.table FAQ, sections 1.1, 1.2 and 2.17.

Some possibilities:

DT[, keep, with = FALSE]
DT[, c('V1', 'V3'), with = FALSE]
DT[, c(1, 3), with = FALSE]
DT[, list(V1, V3)]
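As a quick check, the four calls above can be compared against each other. This is only a sketch: it assumes the data.table package is installed and reuses DT and keep from the question; the names by_variable, by_names, by_position, by_list and all_same are made up for illustration.

```r
library(data.table)

DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
keep <- c("V1", "V3")

# Column names held in a variable; with = FALSE makes j be treated as names
by_variable <- DT[, keep, with = FALSE]

# Literal character vector of column names
by_names <- DT[, c("V1", "V3"), with = FALSE]

# Column positions instead of names
by_position <- DT[, c(1, 3), with = FALSE]

# Unquoted column names evaluated inside the data.table
by_list <- DT[, list(V1, V3)]

# Compare every result against the first one
results <- list(by_variable, by_names, by_position, by_list)
all_same <- all(vapply(
  results,
  function(x) isTRUE(all.equal(x, by_variable)),
  logical(1)
))
print(all_same)
```

Only the first form is truly programmatic, since the other three hard-code the columns; they are shown to make clear what with = FALSE changes.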

The reason DF[c('V1','V3')] works as it does for a data.frame is covered in ?`[.data.frame`:

Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list. In this usage a drop argument is ignored, with a warning.
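To make the difference concrete, here is a small sketch contrasting the two classes. It assumes the data.table package is installed; DF is a data.frame copy of the question's data, and df_cols, dt_rows and dt_attempt are names invented for this example. The exact error message from the last call varies between data.table versions, so it is only tested for, not quoted.

```r
library(data.table)

DF <- data.frame(V1 = 1:3, V2 = 4:6, V3 = 7:9)
keep <- c("V1", "V3")

# data.frame: a single index is list-style, so it selects columns
df_cols <- DF[keep]
print(names(df_cols))

# data.table: a single index is the i argument, which refers to rows
DT <- as.data.table(DF)
dt_rows <- DT[2:3]
print(dim(dt_rows))

# A character i on an unkeyed data.table is treated as a join request,
# not as column names, so this call fails
dt_attempt <- try(DT[keep], silent = TRUE)
print(inherits(dt_attempt, "try-error"))
```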


From data.table 1.10.2, you may use the .. prefix when subsetting columns programmatically:

When j is a symbol prefixed with .. it will be looked up in calling scope and its value taken to be column names or numbers [...] It is experimental.

Thus:

DT[ , ..keep]
#    V1 V3
# 1:  1  7
# 2:  2  8
# 3:  3  9
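Because the .. prefix looks the symbol up in the calling scope, it should also work on a function argument. A minimal sketch, assuming data.table 1.10.2 or later; select_cols is a hypothetical helper written for this example, not part of data.table.

```r
library(data.table)

# Hypothetical helper: cols is a character vector of column names.
# ..cols is resolved in the scope that calls `[`, i.e. this function's frame.
select_cols <- function(dt, cols) {
  dt[, ..cols]
}

DT <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
res <- select_cols(DT, c("V1", "V3"))
print(res)
```

Since the feature is described as experimental, dt[, cols, with = FALSE] is the more conservative way to write the same helper.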
mnel answered Oct 25 '22 16:10
