I'm trying to create a new column based on another, using case_when to give different outputs based on the value of each row.
I start with df <- data.frame(a=c("abc", "123", "abc", "123"))
I want to generate a new column b like so:
#> a b
#> 1 abc letter
#> 2 123 number
#> 3 abc letter
#> 4 123 number
I've tried df %>% mutate(b = case_when(startsWith(a, "a") ~ "letter", startsWith(a, "1") ~ "number")) but it only gives an error. Can someone show me how to get different values for column b based on the first character of each value in column a?
According to ?startsWith:
x: vector of character string whose “starts” are considered.
So startsWith expects a character vector, but here column a is a factor. Converting it to character solves the issue:
library(dplyr)
df %>%
  mutate(b = case_when(startsWith(as.character(a), "a") ~ "letter",
                       TRUE ~ "number"))
# a b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
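Before R 4.0.0, data.frame defaulted to stringsAsFactors = TRUE, which is why column a is a factor here; from R 4.0.0 onward the default is FALSE. If we specify stringsAsFactors = FALSE explicitly, the 'a' column will be of class character and the as.character() call is no longer needed.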
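As a minimal sketch of that second fix, assuming dplyr is installed: with stringsAsFactors = FALSE set explicitly, column a is character on any R version, so the case_when call from the question should run unchanged. The names df2 and res are only illustrative.

```r
library(dplyr)

# Build the data frame with a character (not factor) column,
# independent of the R version's default.
df2 <- data.frame(a = c("abc", "123", "abc", "123"),
                  stringsAsFactors = FALSE)

# The call from the question, applied to the character column.
res <- df2 %>%
  mutate(b = case_when(startsWith(a, "a") ~ "letter",
                       startsWith(a, "1") ~ "number"))
res
```

Rows matching neither condition would get NA in b, which is why the answer above uses TRUE ~ "number" as a catch-all.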
Another option is str_detect, which returns a logical vector by checking whether the character at the start (^) of each string is a digit ([0-9]):
library(stringr)
library(dplyr)
df %>%
  mutate(b = c("letter", "number")[1 + str_detect(a, "^[0-9]")])
# a b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
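The indexing in the str_detect version works because a logical value is coerced to 0 or 1 when added to a number, so 1 + str_detect(...) gives position 1 ("letter") or 2 ("number"). Here is a small sketch of just that step in base R, using grepl in place of str_detect (my substitution, so no packages are needed); the names x, is_digit, idx and labels are illustrative.

```r
x <- c("abc", "123", "abc", "123")

# TRUE where the string starts with a digit
is_digit <- grepl("^[0-9]", x)

# Logical values coerce to 0/1, so this yields index 1 or 2
idx <- 1 + is_digit

# Pick "letter" for index 1 and "number" for index 2
labels <- c("letter", "number")[idx]
labels
```

This only distinguishes two cases; with more than two categories, case_when is probably the clearer choice.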