I have a data frame with 300 columns of data. I created a vector with 126 elements that are the column names of 126 of the 300. I want to subset the data frame so that it keeps only the columns that are not among those 126. They are NOT in order, so I can't simply remove them by specifying -1:-126.
I tried various things with grep and matrix operations, but none of them worked. For example, the following fails. Here x has 300 columns, and f is the vector of 126 column names I want to exclude when creating x1.
x1<-x[,-which(names(x), %in% f)]
If I explicitly type out one or several column names, I can get it to work, but I don't want to type out the 126 elements in f.
The way you tell R that you want to select some particular elements (i.e., a 'subset') from a vector is by placing an 'index vector' in square brackets immediately following the name of the vector. For a simple example, try x[1:10] to view the first ten elements of x.
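As a small illustration of index vectors, here is a sketch using a made-up vector v (the name and values are assumptions, not from the question). It shows the three common kinds of index: positive, negative, and logical.

```r
# Hypothetical example vector; the name v and its values are assumptions.
v <- c(10, 20, 30, 40, 50)

first_three   <- v[1:3]     # positive indices select those positions
without_first <- v[-1]      # negative indices drop those positions
large         <- v[v > 25]  # a logical index keeps elements where it is TRUE
```

The logical form is the one that matters for the question, because %in% produces exactly such a logical vector.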
If a data frame has a column whose values overlap with the values in a vector, we can subset the data frame based on that vector. This can be done with single square brackets and the %in% operator.
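A minimal sketch of that idea, assuming a toy data frame called people and a vector called keep (both hypothetical names chosen for illustration):

```r
# Hypothetical data; the names people and keep are assumptions.
people <- data.frame(id = c("a", "b", "c", "d"),
                     score = c(1, 2, 3, 4),
                     stringsAsFactors = FALSE)
keep <- c("b", "d")

# people$id %in% keep is a logical vector, one value per row.
# Placing it before the comma selects rows; leaving the column slot
# empty keeps every column.
matched <- people[people$id %in% keep, ]
```

This selects rows by value. Selecting or excluding columns by name works the same way, except that the logical vector goes after the comma.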
Use %in%:
names.use <- names(df)[!(names(df) %in% f)]
Then names.use will contain the names of all the columns which are not contained in your vector of names f.
To subset your data frame using the columns you want, you can use the following:
df.subset <- df[, names.use]
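For completeness, here is a small self-contained sketch of the same approach using the question's names x and f. The four-column data frame is a hypothetical stand-in for the real 300-column one. The attempt in the question fails to parse because of the stray comma before %in%. Negating the logical vector also makes which() unnecessary.

```r
# Hypothetical stand-ins for the 300-column data frame and the 126 names.
x <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
f <- c("d", "b")  # columns to exclude, in no particular order

# Logical index: TRUE for every column whose name is NOT in f.
# drop = FALSE keeps the result a data frame even if only one column remains.
x1 <- x[, !(names(x) %in% f), drop = FALSE]

# Equivalent selection by name, using setdiff() on the column names.
x2 <- x[, setdiff(names(x), f), drop = FALSE]

names(x1)
```

The logical form is probably the safer habit. With -which(...), if no names match, which() returns an empty vector and the subset ends up with zero columns instead of all of them.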