I have a dataframe and a list of columns in that dataframe that I'd like to drop. Let's use the iris dataset as an example. I'd like to drop Sepal.Length and Sepal.Width and use only the remaining columns. How do I do this using select or select_ from the dplyr package?
Here's what I've tried so far:
drop.cols <- c('Sepal.Length', 'Sepal.Width')
iris %>% select(-drop.cols)
Error in -drop.cols : invalid argument to unary operator
iris %>% select_(.dots = -drop.cols)
Error in -drop.cols : invalid argument to unary operator
iris %>% select(!drop.cols)
Error in !drop.cols : invalid argument type
iris %>% select_(.dots = !drop.cols)
Error in !drop.cols : invalid argument type
I feel like I'm missing something obvious, because this seems like a pretty useful operation that should already exist. On GitHub, someone posted a similar issue, and Hadley said to use 'negative indexing'. That's what (I think) I've tried, but to no avail. Any suggestions?
To drop multiple columns with dplyr using a function, negate the selection with the ! operator, as usual. In the example, a simple custom function selects all columns with more than 10; the code drops these and returns the remaining columns.
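The example that snippet refers to is not reproduced here. As a stand-in, here is a minimal sketch of dropping columns by a function, assuming dplyr 1.0.0 or later (where where() is available). The predicate is.numeric is chosen purely for illustration and is not the custom function from the original example:

```r
library(dplyr)

# Drop every column for which the predicate returns TRUE.
# is.numeric is an illustrative stand-in for a custom function.
non_numeric <- iris %>% select(!where(is.numeric))

names(non_numeric)
```

Any function that takes a column and returns a single TRUE or FALSE could be used in place of is.numeric.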
In base R, multiple columns can be deleted from a data frame by assigning NULL to them through the list() function.
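A minimal base R sketch of the list() approach, reusing the drop.cols vector from the question (the copy iris2 is a name introduced here only for illustration):

```r
drop.cols <- c("Sepal.Length", "Sepal.Width")

iris2 <- iris                    # work on a copy so iris itself is untouched
iris2[drop.cols] <- list(NULL)   # assigning NULL removes the named columns

names(iris2)
```

This needs no packages, but unlike select() it modifies the object in place rather than returning a new one.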
How do I delete a column in dplyr? Deleting a column is easy using the select() function and the - sign. For example, to remove the columns "X" and "Y": select(Your_Dataframe, -c(X, Y)).
To drop columns whose names end with a certain label, use select() together with ends_with(), passing the label to ends_with(). For example, columns whose names end with "cyl" can be dropped this way.
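A short sketch of the ends_with() case, assuming the built-in mtcars data (which has a cyl column) is the dataset the snippet has in mind:

```r
library(dplyr)

# Drop every column whose name ends with "cyl"; in mtcars that is only cyl.
no_cyl <- mtcars %>% select(-ends_with("cyl"))

names(no_cyl)
```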
Check the help on select_vars. That gives you some extra ideas on how to work with this.
In your case:
iris %>% select(-one_of(drop.cols))
also try
## Notice the lack of quotes
iris %>% select(-c(Sepal.Length, Sepal.Width))
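In more recent releases (dplyr 1.0.0 and later, as far as I know), one_of() is superseded by all_of() and any_of(). A sketch of the same operation with those helpers, where "not_a_column" is a made-up name used only to show the difference between the two:

```r
library(dplyr)

drop.cols <- c("Sepal.Length", "Sepal.Width")

# all_of() is strict: it errors if any name in drop.cols is missing.
kept_strict <- iris %>% select(-all_of(drop.cols))

# any_of() is lenient: names that do not exist are silently ignored.
kept_lenient <- iris %>% select(-any_of(c(drop.cols, "not_a_column")))

names(kept_strict)
```

Which one to use depends on whether a missing column should be treated as an error.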
Beyond select(-one_of(drop.cols)), there are a couple of other options for dropping columns using select() that do not involve defining all the specific column names (using the dplyr starwars sample data for some more variety in column names):
starwars %>%
  select(-(name:mass)) %>%        # the range of columns from 'name' to 'mass'
  select(-contains('color')) %>%  # any column name that contains 'color'
  select(-starts_with('bi')) %>%  # any column name that starts with 'bi'
  select(-ends_with('er')) %>%    # any column name that ends with 'er'
  select(-matches('^f.+s$')) %>%  # any column name matching the regex pattern
  select_if(~!is.list(.)) %>%     # not by column name but by data type
  head(2)
# A tibble: 2 x 2
  homeworld species
  <chr>     <chr>
1 Tatooine  Human
2 Tatooine  Droid
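The select_if() step drops columns by type rather than by name. In later dplyr releases the scoped verbs such as select_if() are superseded, so a sketch of the same step with where() (assuming dplyr 1.0.0 or later) would be:

```r
library(dplyr)

# Drop all list-columns (films, vehicles, starships in starwars).
no_list_cols <- starwars %>% select(!where(is.list))

names(no_list_cols)
```

The set of remaining columns depends on the dplyr version, since the starwars data has gained columns over time.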