I would like to use purrr to iteratively run several string replacements on a dataframe column with the gsub() function.
This is the example dataframe:
df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5),
                          rep("a bb", 3),
                          rep("a cc", 2)))
> df
   Year Text
1  2019 a aa
2  2019 a aa
3  2019 a aa
4  2019 a aa
5  2019 a aa
6  2019 a bb
7  2019 a bb
8  2019 a bb
9  2019 a cc
10 2019 a cc
This is how I would normally run the string replacement, and the desired result.
df$Text <- gsub("aa", "One", df$Text, fixed = TRUE)
df$Text <- gsub("bb", "Two", df$Text, fixed = TRUE)
df$Text <- gsub("cc", "Three", df$Text, fixed = TRUE)
> df
   Year    Text
1  2019   a One
2  2019   a One
3  2019   a One
4  2019   a One
5  2019   a One
6  2019   a Two
7  2019   a Two
8  2019   a Two
9  2019 a Three
10 2019 a Three
However, this becomes impractical as the list of replacements grows, so I tried to use purrr to iterate over a vector of patterns and a vector of replacements, but I have only managed to produce warnings. I expect the code to step through text_pattern and text_replacement in parallel and run gsub() on df$Text for each pattern/replacement pair. My attempt and the resulting warnings are below.
text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")
walk2(text_pattern, text_replacement, function(...) {
  gsub(text_pattern, text_replacement, df$Text, fixed = F)
})
Warning messages:
1: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
2: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
3: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
4: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
5: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
6: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
Is it possible to accomplish this using functions from purrr? Or alternatively am I trying to use the wrong tool and is there a different function I should be using?
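Two things go wrong in the walk2() attempt. The anonymous function is declared with ... and never uses the arguments it receives; its body refers to the full text_pattern and text_replacement vectors, so every gsub() call gets length-3 vectors and warns. In addition, walk2() is meant for side effects and discards what the function returns, so df$Text would never be updated. Since each replacement has to be applied to the result of the previous one, this is a fold, and we can use reduce2() with df$Text as the starting value: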
library(purrr)
library(stringr)
df$Text <- reduce2(text_pattern, text_replacement, ~ str_replace(..1, ..2, ..3),
                   .init = df$Text)
df$Text
#[1] "a One" "a One" "a One" "a One" "a One" "a Two" "a Two" "a Two" "a Three" "a Three"
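Note that str_replace() replaces only the first match in each string and treats the pattern as a regular expression, whereas gsub(..., fixed = TRUE) replaces every literal match. For this data the result is the same.
Or, without an anonymous function:
df$Text <- reduce2(text_pattern, text_replacement, .init = df$Text, str_replace)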
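If the replacements need to behave exactly like the gsub(..., fixed = TRUE) calls in the question (every occurrence replaced, pattern taken literally), the same fold can be sketched in base R with Reduce(). This is only an illustration: the helper name replace_all_pairs and the demo vector text_demo are made up for the example and are not part of purrr or stringr.

```r
# Hypothetical helper: apply each pattern/replacement pair in turn,
# feeding the result of one gsub() call into the next.
replace_all_pairs <- function(text, patterns, replacements) {
  stopifnot(length(patterns) == length(replacements))
  Reduce(
    function(acc, i) gsub(patterns[i], replacements[i], acc, fixed = TRUE),
    seq_along(patterns),  # iterate over pair indices
    text                  # starting value of the accumulator
  )
}

text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")

# Made-up demo data with a repeated match and a regex metacharacter
text_demo <- c("a aa aa", "a bb", "a.cc")
result <- replace_all_pairs(text_demo, text_pattern, text_replacement)
print(result)
# [1] "a One One" "a Two"     "a.Three"
```

With the question's data this would be called as df$Text <- replace_all_pairs(df$Text, text_pattern, text_replacement). Staying within purrr and stringr, passing a function that calls str_replace_all() with the pattern wrapped in fixed() to reduce2() should give the same literal, replace-everything behaviour.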