I want to make a subset of my data, by selecting columns, like below:
select(df, col1, col2, col3, col4)
But sometimes I have a slightly different data set, with only col1, col2 and col4.
How can I use select() so that, if a column doesn't exist, it just continues without giving an error?
So it would give a dataset with col1, col2 and col4 (and skip col3). If I just run the above select() line, I get this error:
Error in overscope_eval_next(overscope, expr) : object 'col3' not found
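In base R, you can keep only the columns that actually exist by matching on the names (drop = FALSE keeps the result a data frame even if only one column matches):
df[, names(df) %in% c('col1', 'col2', 'col3', 'col4'), drop = FALSE]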
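As a rough illustration of the base R approach, the sketch below uses intersect() instead of %in%, which returns the surviving columns in the order they were requested. The data frame and the names `wanted`, `keep`, and `result` are made up for the example and are not from the question.

```r
# Hypothetical data set that is missing col3
df <- data.frame(col1 = 1:3, col2 = 4:6, col4 = 7:9)

# Columns we would like to have, whether or not they exist
wanted <- c("col1", "col2", "col3", "col4")

# Keep only the requested names that are present in df
keep <- intersect(wanted, names(df))

# drop = FALSE keeps a data frame even if only one column survives
result <- df[, keep, drop = FALSE]

names(result)
```

No warning or error is raised for col3 here; it is simply left out of `keep`.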
You can use the one_of() select helper from dplyr and pass the column names as strings. It will just issue a warning for columns that don't exist.
library(dplyr)
select(mtcars, one_of(c("mpg", "disp", "foo")))
#> Warning: Unknown variables: `foo`
#>                    mpg  disp
#> Mazda RX4         21.0 160.0
#> Mazda RX4 Wag     21.0 160.0
#> Datsun 710        22.8 108.0
#> Hornet 4 Drive    21.4 258.0
#> Hornet Sportabout 18.7 360.0
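If you are on dplyr 1.0.0 or later, one_of() is superseded by any_of(), which skips missing columns silently instead of warning. A minimal sketch, assuming a recent dplyr is installed; the `wanted` vector and the `result` name are just for illustration:

```r
library(dplyr)

# Columns we would like to keep; "foo" does not exist in mtcars
wanted <- c("mpg", "disp", "foo")

# any_of() selects the names that exist and ignores the rest
result <- select(mtcars, any_of(wanted))

names(result)
```

If you would rather get an error when a column is missing, all_of() is the strict counterpart.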