
Select Subset of Columns Based on a Vector in R



I have a data frame with 300 columns of data. I created a vector with 126 elements that are the names of 126 of those 300 columns. I want to subset the data frame so that it keeps only the columns that are not among those 126. They are NOT in order, so I can't simply remove them by specifying -1:-126.

I tried various things with grep and matrix operations, but none of them worked. For example, the following failed. Here x has 300 columns, and f is the vector of 126 column names I want to exclude to produce x1.

x1<-x[,-which(names(x), %in% f)] 

If I explicitly type out one or several variable names, I can get it to work, but I don't want to type out all 126 elements of f.

akaDrHouse asked May 06 '16 12:05


1 Answer

Use %in%:

names.use <- names(df)[!(names(df) %in% f)] 

Then names.use will contain the names of all the columns which are not contained in your vector of names f.

To subset your data frame using the columns you want, you can use the following:

df.subset <- df[, names.use] 
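
As a rough illustration, here is the same approach on a toy data frame. The column names and values are made up for the example and are not from the question. It also shows two equivalent one-liners and a corrected form of the attempt from the question (the comma before %in% is a syntax error). Adding drop = FALSE is optional, but without it R returns a plain vector instead of a data frame when only one column remains.

```r
# Toy data standing in for the 300-column data frame (hypothetical names).
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
f  <- c("d", "b")   # column names to exclude, in arbitrary order

# Keep every column whose name is not in f.
names.use <- names(df)[!(names(df) %in% f)]
df.subset <- df[, names.use, drop = FALSE]

# Equivalent one-liners: a logical index, or a set difference on the names.
df.subset2 <- df[, !(names(df) %in% f), drop = FALSE]
df.subset3 <- df[, setdiff(names(df), f), drop = FALSE]

# The attempt from the question with the stray comma removed.
# Caution: if no names match, which() returns integer(0), and
# df[, -integer(0)] selects zero columns, so the logical form is safer.
df.subset4 <- df[, -which(names(df) %in% f), drop = FALSE]

names(df.subset)
```

Here names.use ends up holding "a" and "c", so each of the four results keeps only those two columns.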
Tim Biegeleisen answered Sep 20 '22 15:09
