Select subset of columns in data.table R [duplicate]

Tags:

r

data.table

I have a data table with a bunch of columns, e.g.:

library(data.table)
dt <- data.table(matrix(runif(10 * 10), 10, 10))

I want to perform some operation on the data table, such as producing a correlation matrix (cor(dt)). In order to do this, I want to remove a few columns that contain non-numeric values or values outside a certain range.

Let's say I want to find the correlation matrix excluding V1, V2, V3 and V5.

Here is my current approach:

cols <- !(colnames(dt) == "V1" | colnames(dt) == "V2" | colnames(dt) == "V3" | colnames(dt) == "V5")
new_dt <- subset(dt, , cols)
cor(new_dt)

I find this pretty cumbersome, considering data.table syntax is usually so elegant. Is there a better method of doing this?

Jeff, asked Jan 22 '15 17:01


2 Answers

Use with=FALSE:

cols = paste("V", c(1,2,3,5), sep="")
dt[, !cols, with=FALSE]

I suggest going through the "Introduction to data.table" vignette.


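Update: From v1.10.2 onwards, you can also use the .. prefix instead of with=FALSE. Note that this form selects the columns named in cols rather than dropping them:

dt[, ..cols]

See the first NEWS item under v1.10.2 for additional explanation.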
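
To make the difference between the two forms concrete, here is a minimal sketch rather than a definitive recipe. The names drop_cols and keep_cols and the seed are my own, not from the question, and the .. line assumes data.table v1.10.2 or later.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
dt <- data.table(matrix(runif(10 * 10), 10, 10))

drop_cols <- paste0("V", c(1, 2, 3, 5))    # hypothetical name: columns to exclude
keep_cols <- setdiff(names(dt), drop_cols) # hypothetical name: columns to retain

a  <- dt[, !drop_cols, with = FALSE]  # exclude by name, variable looked up via with=FALSE
b  <- dt[, keep_cols, with = FALSE]   # select by name, older syntax
c_ <- dt[, ..keep_cols]               # select by name, .. prefix

names(a)
all.equal(a, b)
all.equal(a, c_)
```

All three calls should give the same six columns; which one reads best is mostly a matter of taste.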

Arun, answered Sep 22 '22 16:09


You can do

dt[, !c("V1","V2","V3","V5")] 

to get

            V4         V6         V7        V8         V9        V10
 1: 0.88612076 0.94727825 0.50502208 0.6702523 0.24186706 0.96263313
 2: 0.11121752 0.13969145 0.19092645 0.9589867 0.27968190 0.07796870
 3: 0.50179822 0.10641301 0.08540322 0.3297847 0.03643195 0.18082180
 4: 0.09787517 0.07312777 0.88077548 0.3218041 0.75826099 0.55847774
 5: 0.73475574 0.96644484 0.58261312 0.9921499 0.78962675 0.04976212
 6: 0.88861117 0.85690337 0.27723130 0.3662264 0.50881663 0.67402625
 7: 0.33933983 0.83392047 0.30701697 0.6138122 0.85107176 0.58609504
 8: 0.89907094 0.61389815 0.19957386 0.3968331 0.78876682 0.90546328
 9: 0.54136123 0.08274569 0.25190790 0.1920462 0.15142604 0.12134807
10: 0.36511064 0.88117171 0.05730210 0.9441072 0.40125023 0.62828674
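
Since the goal in the question was a correlation matrix, here is a short sketch of the whole round trip under the same setup. The seed and the names reduced and cm are my own choices, and the random values will not match the output printed above.

```r
library(data.table)

set.seed(42)  # arbitrary seed; values differ from the output shown above
dt <- data.table(matrix(runif(10 * 10), 10, 10))

# Drop the four unwanted columns by name, then correlate what is left
reduced <- dt[, !c("V1", "V2", "V3", "V5")]
cm <- cor(reduced)  # all remaining columns are numeric here

dim(cm)
colnames(cm)
```

If some of the remaining columns were non-numeric, cor() would fail, so in that case the exclusion list would need to cover them as well.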
MrFlick, answered Sep 19 '22 16:09