I'm trying to create a new column based on another, using case_when
to give different outputs based on the value of each row.
I start with df <- data.frame(a=c("abc", "123", "abc", "123")) and want to generate a new column b, like so:
#> a b
#> 1 abc letter
#> 2 123 number
#> 3 abc letter
#> 4 123 number
I've tried df %>% mutate(b = case_when(startsWith(a, "a") ~ "letter", startsWith(a, "1") ~ "number"))
but it only gives an error. Can someone show me how to get different values for column b based on the first character of each value in column a?
According to ?startsWith, the argument x is a "vector of character string whose 'starts' are considered". So startsWith expects its input to be of class character, and here column a is of class factor. Converting it to character solves the issue:
library(dplyr)
df %>%
  mutate(b = case_when(startsWith(as.character(a), "a") ~ "letter",
                       TRUE ~ "number"))
# a b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
In R versions before 4.0.0, data.frame defaults to stringsAsFactors = TRUE, which is why column a ends up as a factor (since R 4.0.0 the default is FALSE). If we specify stringsAsFactors = FALSE, the 'a' column will be of class character.
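As a minimal sketch of the stringsAsFactors = FALSE route, assuming dplyr is installed, the original two-branch case_when from the question should work unchanged once the column is character. The name out is only used here for illustration.

```r
library(dplyr)

# Keep 'a' as character instead of letting it become a factor
df <- data.frame(a = c("abc", "123", "abc", "123"), stringsAsFactors = FALSE)

# startsWith() now receives a character vector, so as.character() is not needed
out <- df %>%
  mutate(b = case_when(startsWith(a, "a") ~ "letter",
                       startsWith(a, "1") ~ "number"))

print(out)
```

Note that with two explicit conditions and no TRUE ~ ... fallback, any value starting with something other than "a" or "1" would get NA in column b.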
Another option is str_detect, which creates a logical vector by checking whether the character at the start (^) of the string is a digit ([0-9]). Adding 1 to that logical vector gives an index of 1 or 2 into c("letter", "number"):
library(stringr)
library(dplyr)
df %>%
  mutate(b = c("letter", "number")[1 + str_detect(a, "^[0-9]")])
# a b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
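To make the indexing trick easier to follow, here is a step-by-step sketch in base R. It swaps in grepl for str_detect (an assumption made only to keep the example free of package dependencies; both return a logical vector for this pattern), and the intermediate names is_digit and idx are illustrative.

```r
a <- c("abc", "123", "abc", "123")

# TRUE where the string starts with a digit
is_digit <- grepl("^[0-9]", a)    # FALSE TRUE FALSE TRUE

# Logicals are coerced to 0/1 in arithmetic, so this yields positions 1 or 2
idx <- 1 + is_digit               # 1 2 1 2

# Position 1 picks "letter", position 2 picks "number"
b <- c("letter", "number")[idx]
print(b)
```

This approach treats every value that does not start with a digit as "letter", so it is only a fit when the data really has just those two cases.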