I feel like there should be an efficient way to mutate new columns with dplyr using `case_when` and `contains`, but cannot get it to work.
I understand using `case_when` within `mutate` is "somewhat experimental" (as in this post), but would be grateful for any suggestions.
Doesn't work:
library(tidyverse)
set.seed(1234)
x <- c("Black", "Blue", "Green", "Red")
df <- data.frame(a = 1:20,
b = sample(x,20, replace=TRUE))
df <- df %>%
mutate(group = case_when(.$b(contains("Bl")) ~ "Group1",
case_when(.$b(contains("re", ignore.case=TRUE)) ~ "Group2")
)
`case_when()` allows you to vectorise multiple `if_else()` statements. It is an R equivalent of the SQL CASE WHEN statement.
With a single case, you call `case_when()` and put one expression inside the parentheses: a "left hand side" and a "right hand side", separated by a tilde (`~`).
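As a minimal sketch of that left-hand side / right-hand side form (the vector `x` is made up for illustration and is not part of the question):

```r
library(dplyr)

# Hypothetical input vector, for illustration only
x <- c(1, 5, 10)

# Single case: where the condition left of ~ is TRUE, the value on the
# right is returned; elements that match no case become NA
res <- case_when(x > 4 ~ "big")
res
```

Elements that satisfy no condition come back as `NA`, which is why a final catch-all case is often added.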
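We can use `grepl`. `contains()` is a selection helper for picking columns by name, so it cannot test the values inside a column: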
df %>%
mutate(group = case_when(grepl("Bl", b) ~ "Group1",
grepl("re", b, ignore.case = TRUE) ~"Group2"))
# a b group
#1 1 Black Group1
#2 2 Green Group2
#3 3 Green Group2
#4 4 Green Group2
#5 5 Red Group2
#6 6 Green Group2
#7 7 Black Group1
#8 8 Black Group1
#9 9 Green Group2
#10 10 Green Group2
#11 1 Green Group2
#12 2 Green Group2
#13 3 Blue Group1
#14 4 Red Group2
#15 5 Blue Group1
#16 6 Red Group2
#17 7 Blue Group1
#18 8 Blue Group1
#19 9 Black Group1
#20 10 Black Group1
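Every value in the sampled data matches one of the two patterns. If some rows matched neither, they would get `NA`, so a fallback case may be worth adding. A small sketch on a hand-made data frame (`colours` and the "Other" label are assumptions for illustration):

```r
library(dplyr)

# Hypothetical data frame, not the sampled data from the question
colours <- data.frame(b = c("Black", "Blue", "Green", "Red", "Pink"))

out <- colours %>%
  mutate(group = case_when(
    grepl("Bl", b)                     ~ "Group1",
    grepl("re", b, ignore.case = TRUE) ~ "Group2",
    TRUE                               ~ "Other"  # fallback for unmatched rows
  ))
out$group
```

Cases are evaluated in order, so the first matching condition decides the group.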
Wanted to add some examples using `str_detect` with a `paste0` function that would also make concatenating common groups a cinch. Say you're working with gapminder or another country df.
interest <- c("Austria", "Belgium", "Bulgaria", "Croatia", "Cyprus",
"Czech Republic", "Denmark", "Estonia", "Finland",
"France", "Germany", "Greece", "Hungary", "Ireland",
"Italy", "Latvia", "Lithuania", "Luxembourg","Malta",
"The Netherlands", "Poland","Portugal", "Romania",
"Slovakia", "Slovenia","Spain", "Sweden","United Kingdom")
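EU <- paste0(countrycode::countryname(
  sourcevar = interest, destination = "iso2c"),
  collapse = "|")  # joins the codes with "|" and leaves no trailing "|"
# The literal pattern is split with paste0() so no line break ends up inside the regex
df %<>% mutate(Region = case_when(
  str_detect(Country, paste0("AT|BE|BG|HR|CY|CZ|DK|EE|FI|FR|DE|GR|HU|IE|",
                             "IT|LV|LT|LU|MT|NL|PL|PT|RO|SK|SI|ES|SE|GB|UK|G8")) ~ "EU",
  TRUE ~ "Not EU"))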
You'll need to load `library(magrittr)` to get the compound pipe `%<>%` to work; `df %<>% f()` is basically an abbreviation of `df <- df %>% f()`.
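To make the `%<>%` shorthand concrete, here is a small sketch comparing it with plain assignment. The `countries` data frame and the three-code list are assumptions for illustration, much shorter than the full EU list:

```r
library(dplyr)
library(stringr)
library(magrittr)

# Hypothetical data: a column of ISO2 country codes
countries <- data.frame(Country = c("AT", "US", "FR", "JP"))

# Build an alternation pattern from a vector of codes
codes <- c("AT", "BE", "FR")
pattern <- paste0(codes, collapse = "|")

# Plain assignment form: result stored in a new object
plain <- countries %>%
  mutate(Region = case_when(str_detect(Country, pattern) ~ "EU",
                            TRUE ~ "Not EU"))

# Compound pipe form: pipes `countries` in and assigns the result back to it
countries %<>%
  mutate(Region = case_when(str_detect(Country, pattern) ~ "EU",
                            TRUE ~ "Not EU"))
```

Both forms produce the same data frame; the compound pipe only saves repeating the name on the left.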