I want to make a subset of my data by selecting columns, like this:
select(df, col1, col2, col3, col4) 
But sometimes I have a slightly different data set, with only col1, col2 and col4.
How can I use select() so that, if a column doesn't exist, it simply skips that column instead of raising an error?
The result would then contain col1, col2 and col4 (skipping col3). If I just run the select() line above, I get this error:
Error in overscope_eval_next(overscope, expr) : object 'col3' not found
df[, names(df) %in% c('col1', 'col2', 'col3', 'col4')]
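A possible variant of the base R approach, sketched under a few assumptions: the data frame `df` and the vector `wanted` below are hypothetical stand-ins for your own data. Indexing by name with intersect() returns the columns in the order you asked for them, and drop = FALSE keeps the result a data frame even if only one column matches.

```r
# Hypothetical example data: col3 is deliberately missing.
df <- data.frame(col1 = 1:3, col2 = letters[1:3], col4 = c(TRUE, FALSE, TRUE))

# Columns we would like to have, whether or not they all exist.
wanted <- c("col1", "col2", "col3", "col4")

# intersect() keeps only the requested names that are present in df,
# preserving the order of `wanted`.
present <- intersect(wanted, names(df))

# drop = FALSE prevents a single matching column from collapsing to a vector.
subset_df <- df[, present, drop = FALSE]
names(subset_df)
```

Unlike the logical index above, which returns columns in the order they appear in df, this version follows the order of `wanted`.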
You can use the one_of() select helper from dplyr and pass the column names as strings. It will just issue a warning for columns that don't exist.
library(dplyr)
select(mtcars, one_of(c("mpg", "disp", "foo")))
#> Warning: Unknown variables: `foo`
#>                      mpg  disp
#> Mazda RX4           21.0 160.0
#> Mazda RX4 Wag       21.0 160.0
#> Datsun 710          22.8 108.0
#> Hornet 4 Drive      21.4 258.0
#> Hornet Sportabout   18.7 360.0
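In more recent versions of dplyr (1.0.0 and later, as far as I know), one_of() is superseded by any_of(), which skips missing columns silently instead of warning. A minimal sketch, assuming such a version is installed; the `wanted` vector is just an illustrative name:

```r
library(dplyr)

# Column names to request; "foo" does not exist in mtcars.
wanted <- c("mpg", "disp", "foo")

# any_of() selects the names that exist and ignores the rest without a warning.
result <- select(mtcars, any_of(wanted))
head(result, 3)
```

If you would rather get an error when a column is missing, all_of() is the strict counterpart.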