Use case_when and startsWith to selectively mutate by row

Tags: r, dplyr

I'm trying to create a new column based on another, using case_when to give different outputs based on the value of each row.

I start with df <- data.frame(a=c("abc", "123", "abc", "123"))

And I want to generate a new column b, like so:

#>     a      b
#> 1 abc letter
#> 2 123 number
#> 3 abc letter
#> 4 123 number

I've tried df %>% mutate(b = case_when(startsWith(a, "a") ~ "letter", startsWith(a, "1") ~ "number")) but it only gives an error. Can someone show me how to get different values for column b based on the first letter of the row in column a?

pgcudahy asked Oct 17 '25 02:10


1 Answer

According to ?startsWith

x: vector of character string whose “starts” are considered.

So startsWith expects a character vector, but here column a is a factor. Converting it to character with as.character solves the issue:

library(dplyr)
df %>%
  mutate(b = case_when(startsWith(as.character(a), "a") ~ "letter",
                       TRUE ~ "number"))
#    a      b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
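
To confirm the diagnosis, it can help to inspect the column class directly. The following is a minimal base R sketch; it assumes the data frame was built with factor columns (forced here with stringsAsFactors = TRUE so the behavior is the same in any R version), and the variable names res and starts_a are only illustrative.

```r
# Recreate the question's data with a factor column (forced explicitly here)
df <- data.frame(a = c("abc", "123", "abc", "123"), stringsAsFactors = TRUE)

# Inspect the class of the column that startsWith() will receive
print(class(df$a))

# startsWith() rejects non-character input; capture the error message
# instead of stopping the script
res <- tryCatch(startsWith(df$a, "a"), error = function(e) conditionMessage(e))
print(res)

# After converting to character, the test works element-wise
starts_a <- startsWith(as.character(df$a), "a")
print(starts_a)
```

If class(df$a) reports "factor", the as.character() conversion above is the piece that makes case_when work.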

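In R versions before 4.0.0, data.frame defaults to stringsAsFactors = TRUE, so the 'a' column is created as a factor. If we specify stringsAsFactors = FALSE (the default since R 4.0.0), the 'a' column will be of character class and startsWith can be applied to it directly.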
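
As a sketch of that alternative, the data frame can be created with character columns up front, after which the case_when call from the question should run without any conversion. The name result is only illustrative.

```r
library(dplyr)

# Build the data frame with character (not factor) columns
df <- data.frame(a = c("abc", "123", "abc", "123"), stringsAsFactors = FALSE)

# The column is now character, so startsWith() accepts it as-is
print(class(df$a))

# Same logic as in the question, with one condition per prefix
result <- df %>%
  mutate(b = case_when(startsWith(a, "a") ~ "letter",
                       startsWith(a, "1") ~ "number"))
print(result)
```

Rows matching neither condition would get NA in b, since no fallback branch is given here.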


Another option is to use str_detect to build a logical vector that checks whether each string starts (^) with a digit ([0-9]), and then use that vector (plus 1) to index into c("letter", "number")

library(stringr)
library(dplyr)
df %>% 
    mutate(b = c("letter", "number")[1+str_detect(a, "^[0-9]")])
#    a      b
#1 abc letter
#2 123 number
#3 abc letter
#4 123 number
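
The same regex test can also be written with case_when, which is closer to what the question asked for. This is only a sketch: the second pattern (^[A-Za-z]) and the NA fallback are assumptions added for illustration, not part of the original answer.

```r
library(dplyr)
library(stringr)

df <- data.frame(a = c("abc", "123", "abc", "123"), stringsAsFactors = FALSE)

out <- df %>%
  mutate(b = case_when(
    str_detect(a, "^[0-9]")    ~ "number",      # starts with a digit
    str_detect(a, "^[A-Za-z]") ~ "letter",      # starts with an ASCII letter
    TRUE                       ~ NA_character_  # anything else stays missing
  ))
print(out)
```

Compared with the indexing trick, this form is longer but makes it easy to add further categories later.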

akrun answered Oct 18 '25 15:10