
R dplyr drop column that may or may not exist select(-name)

Tags: select, r, dplyr

library(ggplot2)
library(dplyr)

diamonds <- diamonds %>% select(-clarity)

# this works fine
# but doing it again gives me an error
diamonds %>% select(-clarity)

Error in is_character(x) : object 'clarity' not found

How do I do a safe drop/deselect?

16 revs, 12 users 31% asked Nov 19 '19 13:11



3 Answers

You can do:

diamonds %>% 
 select(-one_of("clarity"))

If one of the named columns does not exist:

diamonds %>% 
 select(-one_of("clarity", "clearness"))

the call still succeeds, but it emits a warning:

Warning message:
Unknown columns: `clearness` 

From dplyr 1.0.0 onward, any_of() can be used instead. It ignores names that are not present and does not warn:

diamonds %>% 
 select(-any_of(c("clarity", "clearness")))
tmfmnk answered Nov 09 '22 19:11
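
To illustrate the behaviour described above, here is a minimal sketch. It uses a small stand-in data frame, `gems` (hypothetical, not part of the original question), in place of `diamonds`, so it should run with only dplyr 1.0.0 or later installed. It suggests that `any_of()` makes the drop repeatable, while the stricter `all_of()` raises an error when a name is missing.

```r
library(dplyr)

# Small stand-in for ggplot2::diamonds (hypothetical data, for illustration only)
gems <- data.frame(
  carat   = c(0.23, 0.21),
  cut     = c("Ideal", "Premium"),
  clarity = c("SI2", "SI1")
)

# Columns we want gone, whether or not they exist
unwanted <- c("clarity", "clearness")

# any_of() skips names that are not present
once  <- gems %>% select(-any_of(unwanted))
# Running the same drop again changes nothing and does not fail
twice <- once %>% select(-any_of(unwanted))

# all_of() is the strict counterpart: a missing name is an error,
# which the handler below turns into NULL
strict <- tryCatch(
  gems %>% select(-all_of(unwanted)),
  error = function(e) NULL
)
```

If a missing column should be treated as a bug rather than silently skipped, `all_of()` is probably the better choice.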


Here's a slight twist using dplyr::select_if(). It does not raise an "Unknown columns" warning when a column name does not exist ('bad_column' in this case):

diamonds %>% 
  select_if(!names(.) %in% c('carat', 'cut', 'bad_column'))
sbha answered Nov 09 '22 19:11
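
This approach never warns because the column names are compared as plain strings before any selection happens, so a name that is absent simply matches nothing. A minimal base R sketch of the same idea follows. The data frame `gems` is a hypothetical stand-in for `diamonds`, and no packages are needed.

```r
# Small stand-in for ggplot2::diamonds (hypothetical data, for illustration only)
gems <- data.frame(
  carat   = c(0.23, 0.21),
  cut     = c("Ideal", "Premium"),
  clarity = c("SI2", "SI1")
)

unwanted <- c("carat", "cut", "bad_column")

# %in% binds tighter than !, so this reads as !(names(gems) %in% unwanted):
# TRUE for every column that should be kept
keep <- !names(gems) %in% unwanted

# drop = FALSE keeps the result a data frame even when one column remains
kept <- gems[, keep, drop = FALSE]
```

The logical vector `keep` plays the role of the predicate passed to `select_if()` in the answer above.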


Here's a simple modification of the one_of() method shown by tmfmnk so that it accepts bare column names, the way select() does. The arguments are captured as quosures and then converted to character strings.

library(tidyverse) # or just dplyr and purrr

drop_cols <- function(df, ...){
  df %>% 
    select(-one_of(map_chr(enquos(...), quo_name)))
}

diamonds %>% 
  drop_cols(clarity, color, zebra)

# # A tibble: 53,940 x 8
#    carat cut       depth table price     x     y     z
#    <dbl> <ord>     <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#  1 0.23  Ideal      61.5    55   326  3.95  3.98  2.43
#  2 0.21  Premium    59.8    61   326  3.89  3.84  2.31
#  3 0.23  Good       56.9    65   327  4.05  4.07  2.31
#  4 0.290 Premium    62.4    58   334  4.2   4.23  2.63
#  5 0.31  Good       63.3    58   335  4.34  4.35  2.75
#  6 0.24  Very Good  62.8    57   336  3.94  3.96  2.48
#  7 0.24  Very Good  62.3    57   336  3.95  3.98  2.47
#  8 0.26  Very Good  61.9    55   337  4.07  4.11  2.53
#  9 0.22  Fair       65.1    61   337  3.87  3.78  2.49
# 10 0.23  Very Good  59.4    61   338  4     4.05  2.39
# # ... with 53,930 more rows
# Warning message:
# Unknown columns: `zebra`
IceCreamToucan answered Nov 09 '22 17:11
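
If the warning is not wanted, the same wrapper idea can be combined with `any_of()`. The sketch below is one possible variant, not the answerer's code: `drop_cols_quiet` and the `gems` data frame are hypothetical names introduced here, and it assumes dplyr 1.0.0 or later plus rlang (which is installed alongside dplyr).

```r
library(dplyr)

# Hypothetical helper: drop the named columns if present, silently skip the rest
drop_cols_quiet <- function(df, ...) {
  # Capture the bare names as symbols, then turn each one into a string
  cols <- vapply(rlang::ensyms(...), rlang::as_string,
                 character(1), USE.NAMES = FALSE)
  select(df, -any_of(cols))
}

# Small stand-in for ggplot2::diamonds (hypothetical data, for illustration only)
gems <- data.frame(
  carat   = c(0.23, 0.21),
  cut     = c("Ideal", "Premium"),
  clarity = c("SI2", "SI1")
)

# 'zebra' does not exist; it is ignored without a warning
out <- drop_cols_quiet(gems, clarity, zebra)
```

`USE.NAMES = FALSE` keeps the character vector unnamed, so `any_of()` is not at risk of treating names as rename targets.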