Here's a trivial example of what I'm trying to do:
iris %>%
mutate(Species2 = ifelse(Species %in% c("setosa", "virginica"), "other", as.character(Species)) %>% as.factor) %>%
str
# 'data.frame': 150 obs. of 6 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ Species2 : Factor w/ 2 levels "other","versicolor": 1 1 1 1 1 1 1 1 1 1 ...
However, if I want to do multiple merges, I'd end up with deeply nested ifelse statements, which I'm trying to avoid. What's the most elegant way to do this? Preferably, the solution should fit into a dplyr pipeline.
You can use match:
species.keep <- c("setosa", "virginica", "other")
iris %>% mutate(Species2 = species.keep[match(Species, species.keep, nomatch=3)])
We use the nomatch argument of match to map any species that does not appear at an earlier position in species.keep to "other", which sits at the last position of the vector. Note that this assumes "other" is not a valid species. You'll still have to add the as.factor etc., but this should get you what you want. match is the basic lookup function in base R.
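As a rough sketch of how the match idea might be completed, the block below wraps it in a small helper that also does the factor conversion, and then shows one way to extend it to several merges at once. The names collapse_to_other, from and to, and the label "group_a", are hypothetical and not part of the original answer; only base R is used.

```r
# Hypothetical helper (not from the original answer): keep the listed
# levels and map every other value to a fallback label.
collapse_to_other <- function(x, keep, other = "other") {
  lookup <- c(keep, other)
  # match() gives each value's position in `lookup`; anything not found
  # (including NA) gets the position of the fallback label, the last element.
  idx <- match(as.character(x), lookup, nomatch = length(lookup))
  factor(lookup[idx], levels = lookup)
}

species2 <- collapse_to_other(iris$Species, keep = c("setosa", "virginica"))
table(species2)

# Several merges at once: a from/to table (assumed names) instead of
# nested ifelse(). Each entry of `from` is relabelled to the same
# position in `to`.
from <- c("setosa", "virginica", "versicolor")
to   <- c("group_a", "group_a", "versicolor")
merged <- factor(to[match(as.character(iris$Species), from)])
table(merged)
```

Inside a dplyr pipeline the helper can be called from mutate, for example iris %>% mutate(Species2 = collapse_to_other(Species, c("setosa", "virginica"))). In the from/to variant, any value missing from from becomes NA, so the table should list every level you expect to see.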