
Replace entire string anywhere in dataframe based on partial match with dplyr

Tags:

r

dplyr

I'm struggling to find the right dplyr code to use grepl or an equivalent to replace values throughout an entire data frame.

That is, any cell that contains 'Mazda' should have its entire content replaced with the new string 'A car'.

After lots of searching online, the closest I came was:

The emphasis is on applying it to ALL columns.

library(dplyr)
mtcars$carnames <- rownames(mtcars)  # dummy data to test on

This line does the trick when the entire string is an exact match:

mtcars %>% replace(., (.)=='Mazda RX4', "A car")

But my grepl attempt replaces the entire column with "A car" for some reason:

mtcars %>% replace(., grepl('Mazda', (.)), "A car")
Asked by Mark, Mar 03 '23 15:03


1 Answer

library(dplyr)
mtcars %>% mutate_if(grepl('Mazda',.), ~replace(., grepl('Mazda', .), "A car"))

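To understand why your first replace attempt failed, compare 'Mazda RX4'==mtcars with grepl('Mazda', mtcars). The comparison returns a logical matrix with one entry per cell, whereas grepl coerces the data frame to a character vector and returns only one logical value per column. replace then uses that per-column vector as its index, so the whole matching column is overwritten. From ?replace:

replace replaces the values in x with indices given in list by those given in values. If necessary, the values in values are recycled.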
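The difference between the two index objects can be checked directly. This is a small base-R sketch; the names df, exact and partial are only illustrative and do not appear in the question.

```r
# Same dummy data as in the question
df <- mtcars
df$carnames <- rownames(df)

# Comparing a data frame with a string yields a logical matrix (one entry per cell)
exact <- df == "Mazda RX4"

# grepl() coerces the data frame to a character vector, one element per column
partial <- grepl("Mazda", df)

dim(exact)      # 32 12
length(partial) # 12
which(partial)  # 12, i.e. only the carnames column is flagged
```

Because partial has one element per column, replace() treats it as a column index and overwrites the whole carnames column.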

Your first method also works if the index is a per-cell logical matrix, which sapply can build, for example:

mtcars %>% replace(., sapply(mtcars, function(col) grepl('Mazda', col)), "A car")
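In dplyr 1.0.0 and later, mutate_if is superseded by across(), so the same idea might be written as below. This is a sketch rather than part of the original answer; it assumes dplyr >= 1.0.0 and restricts the replacement to character columns, since numeric columns cannot contain 'Mazda'. The names df and out are illustrative.

```r
library(dplyr)

# Same dummy data as in the question
df <- mtcars
df$carnames <- rownames(df)

# For every character column, overwrite cells that contain "Mazda"
out <- df %>%
  mutate(across(where(is.character),
                ~ replace(.x, grepl("Mazda", .x), "A car")))

head(out$carnames)
```

Limiting the call to where(is.character) also avoids any risk of numeric columns being coerced to character by the replacement value.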

Update:

To replace multiple patterns, we can use stringr::str_replace_all:

library(stringr)
library(dplyr)
mtcars %>% mutate_if(str_detect(., 'Mazda|Merc'), 
                    ~str_replace_all(., c("Mazda.*" = "A car", "Merc.*" = "B car")))
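Note that a pattern such as "Mazda.*" only rewrites the text from the match onward, so a cell with text before 'Mazda' would keep that prefix. One possible alternative, sketched here under the assumption of dplyr >= 1.0.0 with stringr installed, is to detect each pattern and overwrite the whole cell with case_when. The names df and out are illustrative.

```r
library(dplyr)
library(stringr)

# Same dummy data as in the question
df <- mtcars
df$carnames <- rownames(df)

# Replace the whole cell depending on which pattern it contains;
# cells matching neither pattern are left unchanged
out <- df %>%
  mutate(across(where(is.character),
                ~ case_when(
                  str_detect(.x, "Mazda") ~ "A car",
                  str_detect(.x, "Merc")  ~ "B car",
                  TRUE                    ~ .x
                )))

table(out$carnames %in% c("A car", "B car"))
```

The conditions are evaluated in order, so a cell containing both patterns would get the first matching label.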
Answered by A. Suliman, Apr 12 '23 22:04