
Using mutate and starts_with

Tags: r, dplyr

I would like to change the value of certain variables depending on whether they start with a certain string-sequence.

Example:

df <- data.frame(var1 = c("12345", "12345", "12345", "23456", "23456"))
df %>% mutate(var2 = ifelse(starts_with("123"), "ok", "not ok"))

All values starting with "123" should be changed into "ok". How can I combine starts_with() with mutate()?

Thanks!

D. Studer asked Sep 05 '25 05:09


2 Answers

We can use case_when together with str_detect from stringr:

library(dplyr)
library(stringr)
df %>% 
  mutate(var2 = case_when(str_detect(var1, '^123') ~ 'ok',
                TRUE ~ 'not ok'))
#   var1   var2
#1 12345     ok
#2 12345     ok
#3 12345     ok
#4 23456 not ok
#5 23456 not ok

Or with ifelse in base R

ifelse(grepl('^123', df$var1), 'ok', 'not ok')
#[1] "ok"     "ok"     "ok"     "not ok" "not ok"
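
As a small illustration of why the pattern is anchored with ^: without the anchor, grepl matches "123" anywhere in the string. The value "91234" below is made up for the example and is not in the original data.

```r
# "91234" is a hypothetical value added only for illustration
x <- c("12345", "91234", "23456")

# Unanchored: matches "123" anywhere in the string
anywhere <- grepl("123", x)

# Anchored with ^: matches only at the start of the string
at_start <- grepl("^123", x)

ifelse(at_start, "ok", "not ok")
```

Here "91234" is matched by the unanchored pattern but not by the anchored one, so only the anchored version gives the "starts with" behaviour asked for.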

data

df <- data.frame(var1 = c("12345", "12345", "12345", "23456", "23456"), 
      stringsAsFactors = FALSE)
akrun answered Sep 07 '25 19:09


starts_with is a selection helper: it picks columns whose names start with a given prefix, and it does not look at the values inside a column. To test the values, you can use base R startsWith instead.

library(dplyr)
df %>% mutate(var2 = ifelse(startsWith(var1, "123"), "ok", "not ok"))

#   var1   var2
#1 12345     ok
#2 12345     ok
#3 12345     ok
#4 23456 not ok
#5 23456 not ok
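
To make the difference between the two functions concrete, here is a minimal sketch. The extra columns var2 and other, and the name selected, are invented for the example.

```r
library(dplyr)

# var2 and other are hypothetical columns, added only to show column selection
df2 <- data.frame(var1 = c("12345", "23456"),
                  var2 = c("a", "b"),
                  other = c(1, 2),
                  stringsAsFactors = FALSE)

# starts_with() matches column *names*: keeps var1 and var2, drops other
selected <- df2 %>% select(starts_with("var"))
names(selected)

# startsWith() tests the *values* of a character vector
startsWith(df2$var1, "123")
```

So starts_with belongs inside select() (or across()), while startsWith returns a logical vector that can go straight into ifelse or case_when.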

However, we can also do this in base R and without ifelse.

df$var2 <- c('not ok', 'ok')[startsWith(df$var1, '123') + 1]

Or with grepl

df$var2 <- c('not ok', 'ok')[grepl('^123', df$var1) + 1]
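
The indexing trick works because adding 1 to a logical vector coerces FALSE/TRUE to 0/1, which gives positions 1 and 2 in the lookup vector. A step-by-step sketch follows; the intermediate names (labels, hit, idx, res_sw, res_gr) and the NA value are only there for illustration. It also shows that the two variants differ on missing values.

```r
var1 <- c("12345", "23456", NA)   # NA added to show missing-value behaviour
labels <- c("not ok", "ok")

hit <- startsWith(var1, "123")    # logical vector; NA stays NA
idx <- hit + 1                    # FALSE -> 1, TRUE -> 2
res_sw <- labels[idx]             # NA index yields NA

# grepl() returns FALSE for NA input, so NA ends up as "not ok"
res_gr <- labels[grepl("^123", var1) + 1]
```

If var1 can contain NA, pick the variant whose result for missing values is the one you want.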

data

startsWith needs character data. In R versions before 4.0.0, data.frame() converts strings to factors by default, so use stringsAsFactors = FALSE.

df <- data.frame(var1 = c("12345", "12345", "12345", "23456", "23456"), 
      stringsAsFactors = FALSE)
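
To illustrate the note about character data: if var1 is a factor, startsWith stops with an error, and converting with as.character() first avoids that. A minimal sketch, where df_f and err are hypothetical names:

```r
# Force a factor column to reproduce the problem
df_f <- data.frame(var1 = c("12345", "23456"), stringsAsFactors = TRUE)

# startsWith() on a factor raises an error, caught here for demonstration
err <- tryCatch(startsWith(df_f$var1, "123"), error = function(e) "error")

# Converting to character first works
df_f$var2 <- ifelse(startsWith(as.character(df_f$var1), "123"), "ok", "not ok")
```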
Ronak Shah answered Sep 07 '25 21:09