
dplyr mutate replace value(s) in a single column based on condition(s) in an efficient way

Tags: r, dplyr

I would like to replace the values 'c t' and 'o p' in column b with 'c_t' and 'o_p', respectively. I have achieved this using the following approaches.

library(dplyr)

d <- data.frame(a = c(5,6,3,7,4,3,8,3,2,7),
                b = c('c t','c_t','d','o p','o_p','c m','c_t','d','o t','o_p'))
# Way-1
d %>% 
    mutate(b = replace(b, b == 'c t', 'c_t')) %>% 
    mutate(b = replace(b, b == 'o p', 'o_p'))

# Way-2
d %>% mutate(b = replace(b, b == 'c t', 'c_t'), 
             b = replace(b, b == 'o p', 'o_p'))

Output:

#    a   b
# 1  5 c_t
# 2  6 c_t
# 3  3   d
# 4  7 o_p
# 5  4 o_p
# 6  3 c m
# 7  8 c_t
# 8  3   d
# 9  2 o t
# 10 7 o_p

However, are there more efficient approaches to achieve this? I only need to do this for selected values, not for every value that contains a space.

Prradep asked Jun 20 '17 14:06



2 Answers

dplyr::recode is a quick way to change particular values:

library(dplyr)

d <- data.frame(a = c(5,6,3,7,4,3,8,3,2,7),
                b = c('c t','c_t','d','o p','o_p','c m','c_t','d','o t','o_p'))


d %>% mutate(b = recode(b, 'c t' = 'c_t', 'o p' = 'o_p'))
#>    a   b
#> 1  5 c_t
#> 2  6 c_t
#> 3  3   d
#> 4  7 o_p
#> 5  4 o_p
#> 6  3 c m
#> 7  8 c_t
#> 8  3   d
#> 9  2 o t
#> 10 7 o_p
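
When the list of replacements grows, one option is to keep the mapping in a named vector and splice it into recode with !!!. This is only a minimal sketch: the name lookup is made up for illustration, and b is created as a character column explicitly so the result does not depend on the R version's stringsAsFactors default. Note that recode is superseded as of dplyr 1.1.0 in favour of case_match, although it still works.

```r
library(dplyr)

d <- data.frame(a = c(5,6,3,7,4,3,8,3,2,7),
                b = c('c t','c_t','d','o p','o_p','c m','c_t','d','o t','o_p'),
                stringsAsFactors = FALSE)

# Hypothetical lookup table: names are the old values, elements the replacements
lookup <- c('c t' = 'c_t', 'o p' = 'o_p')

# !!! splices the named vector into recode() as old = new arguments;
# values that are not listed in lookup are left unchanged
res <- d %>% mutate(b = recode(b, !!!lookup))
res$b
```

Adding another replacement then only means adding one element to lookup.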
alistaire answered Oct 10 '22 23:10


We can use sub to match the first space (" ") in each value of column 'b' and replace it with _:

d %>%
    mutate(b = sub(" ", "_", b))
#   a   b
#1  5 c_t
#2  6 c_t
#3  3   d
#4  7 o_p
#5  4 o_p
#6  3 c_m
#7  8 c_t
#8  3   d
#9  2 o_t
#10 7 o_p
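
As a small aside on the sub call above: sub replaces only the first match in each element, while gsub replaces every match. The vector x below is made-up illustration data, not part of the question.

```r
# Hypothetical example strings; "o p q" contains two spaces
x <- c("c t", "o p q", "d")

# sub() replaces only the first match in each element
first_only <- sub(" ", "_", x)

# gsub() replaces every match in each element
all_matches <- gsub(" ", "_", x)

first_only
all_matches
```

For the data in the question each value has at most one space, so both functions would give the same result there.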

Based on the OP's update, to replace only the selected values:

d %>% 
   mutate(b = as.character(b), 
          b = ifelse(b %in% c('c t', 'o p'), sub(" ", "_", b), b) )
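
The same selective replacement could also be sketched in base R with a named lookup vector, which avoids spelling out each value in the pipeline. This is an alternative under assumptions, not the approach above: the names lookup and hit are hypothetical, and b is assumed to be a character column.

```r
d <- data.frame(a = c(5,6,3,7,4,3,8,3,2,7),
                b = c('c t','c_t','d','o p','o_p','c m','c_t','d','o t','o_p'),
                stringsAsFactors = FALSE)

# Hypothetical lookup table: names are the old values, elements the replacements
lookup <- c('c t' = 'c_t', 'o p' = 'o_p')

# Logical index of the rows whose value has an entry in the lookup table
hit <- d$b %in% names(lookup)

# Overwrite only those rows; every other value stays as it was
d$b[hit] <- unname(lookup[d$b[hit]])
d
```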
akrun answered Oct 11 '22 00:10