
Selecting columns in R data frame based on those *not* in a vector

I'm familiar with being able to extract columns from an R data frame (or matrix) like so:

df.2 <- df[, c("name1", "name2", "name3")] 

But can one use a ! or other tool to select all but those listed columns?

For background, I have a data frame with quite a few column vectors and I'd like to avoid:

  • Typing out the majority of the names when I could just remove a minority
  • Using the much shorter df.2 <- df[, c(1,3,5)] because when my .csv file changes, my code goes to heck since the numbering isn't the same anymore. I'm new to R and think I've learned the hard way not to use number vectors for larger df's that might change.

I tried:

df.2 <- df[, !c("name1", "name2", "name3")]
df.2 <- df[, !=c("name1", "name2", "name3")]

And just as I was typing this, found out that this works:

df.2 <- df[, !names(df) %in% c("name1", "name2", "name3")] 

Is there a better way than this last one?
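For reference, the working line relies on `%in%` binding more tightly than `!`, so it is read as `!(names(df) %in% ...)`. Below is a minimal sketch of that approach next to a `setdiff()` variant. The toy data frame and the names `keep1`, `keep2`, `drop_cols`, and `keep_mask` are made up for illustration and are not from the original data.

```r
# Hypothetical toy data frame; name1..name3 mirror the placeholders in the question.
df <- data.frame(name1 = 1:3, name2 = 4:6, name3 = 7:9,
                 keep1 = 10:12, keep2 = 13:15)
drop_cols <- c("name1", "name2", "name3")

# %in% has higher precedence than !, so this is !(names(df) %in% drop_cols).
keep_mask <- !names(df) %in% drop_cols

# drop = FALSE keeps the result a data frame even if only one column survives.
df.2 <- df[, keep_mask, drop = FALSE]

# Same selection expressed by name: all column names except the unwanted ones.
df.3 <- df[, setdiff(names(df), drop_cols), drop = FALSE]

print(names(df.2))
print(names(df.3))
```

Both forms select by name, so they keep working when the column order in the .csv file changes.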

Hendy asked Aug 31 '12 02:08




2 Answers

An alternative to grep is which:

df.2 <- df[, -which(names(df) %in% c("name1", "name2", "name3"))] 
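One caveat worth checking before relying on `-which(...)`: if none of the listed names exist in the data frame, `which()` returns `integer(0)`, and negating that still gives `integer(0)`, which selects zero columns rather than all of them. A small sketch, assuming a made-up data frame and the hypothetical names `nameX` and `nameY`:

```r
# Hypothetical toy data frame.
df <- data.frame(name1 = 1:3, name2 = 4:6, keep1 = 7:9)

# Names that do not occur in df (hypothetical, to trigger the edge case).
missing_cols <- c("nameX", "nameY")

idx <- which(names(df) %in% missing_cols)  # integer(0): nothing matched
bad <- df[, -idx, drop = FALSE]            # -integer(0) is integer(0): no columns selected
print(ncol(bad))

# The logical mask has no such edge case: an all-TRUE mask keeps every column.
good <- df[, !names(df) %in% missing_cols, drop = FALSE]
print(ncol(good))
```

When the names to remove are guaranteed to be present, the `-which(...)` form and the logical mask give the same result.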
harkmug answered Sep 24 '22 16:09


You can make a shorter call that is also more generalizable with negative-grep:

df.2 <- df[, -grep("^name[1-3]$", names(df))]

Since grep returns the integer positions of the matching names, you can use negative indexing to remove those columns. You could extend the character class to more numbers or use a more complex pattern.
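A hedged sketch of the pattern-based approach, using `grep(..., invert = TRUE)` and `grepl()` instead of a negated index so that a pattern with no matches does not empty the data frame. The toy data frame, the extra columns `name10` and `other`, and the variable names are assumptions for illustration. Note that a regex range is written `[1-3]`; `[1:3]` would match only the characters `1`, `:` and `3`.

```r
# Hypothetical toy data frame; name10 and other are extra columns for illustration.
df <- data.frame(name1 = 1:3, name2 = 4:6, name3 = 7:9,
                 name10 = 10:12, other = 13:15)

# ^ and $ anchor the match, so "name10" is not caught by the name1..name3 pattern.
pattern <- "^name[1-3]$"

# invert = TRUE returns the positions of names that do NOT match the pattern.
keep_idx <- grep(pattern, names(df), invert = TRUE)
df.2 <- df[, keep_idx, drop = FALSE]

# Equivalent logical form: grepl() gives TRUE/FALSE per name, negated with !.
df.3 <- df[, !grepl(pattern, names(df)), drop = FALSE]

print(names(df.2))
print(names(df.3))
```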

IRTFM answered Sep 20 '22 16:09