I have a vector of strings and I want to replace one common substring in all the strings with different substrings. I'm doing this in R. For example:
input=c("I like fruits","I like you","I like dudes")
# I need to do something like this
newStrings=c("You","We","She")
gsub("I",newStrings,input)
so that the output should look like:
"You like fruits"
"We like you"
"She like dudes"
However, gsub uses only the first string in newStrings. Any suggestions? Thanks
You can use stringr:
stringr::str_replace_all(input, "I", newStrings)
[1] "You like fruits" "We like you"
[3] "She like dudes"
Or, as suggested by @David Arenburg:
stringi::stri_replace_all_fixed(input, "I", newStrings)
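If you would rather not add a package dependency, the same element-wise replacement can be sketched in base R with mapply, which pairs each string with its own replacement. This is a minimal sketch using the vectors from the question. The variable name result is introduced here only for illustration, and fixed = TRUE assumes the pattern is a literal string rather than a regular expression.

```r
input <- c("I like fruits", "I like you", "I like dudes")
newStrings <- c("You", "We", "She")

# gsub() only uses the first replacement, so call it once per
# (replacement, string) pair; MoreArgs holds the arguments shared by every call.
result <- mapply(gsub, replacement = newStrings, x = input,
                 MoreArgs = list(pattern = "I", fixed = TRUE),
                 USE.NAMES = FALSE)
result
# [1] "You like fruits" "We like you"     "She like dudes"
```

Note that "I" matches every capital I in a string, not just the leading word, so a stricter pattern may be needed for real data.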
Benchmark
library(stringi)
library(stringr)
library(microbenchmark)
set.seed(123)
x <- stri_rand_strings(1e3, 10)
y <- stri_rand_strings(1e3, 1)
identical(stringi::stri_replace_all_fixed(x, "I", y), stringr::str_replace_all(x, fixed("I"), y))
# [1] TRUE
identical(stringi::stri_replace_all_fixed(x, "I", y), diag(sapply(y, gsub, pattern = "I", x = x, fixed = TRUE)))
# [1] TRUE
identical(stringi::stri_replace_all_fixed(x, "I", y), mapply(gsub, "I", y, x, USE.NAMES = FALSE, fixed = TRUE))
# [1] TRUE
microbenchmark("stringi: " = stringi::stri_replace_all_fixed(x, "I", y),
               "stringr (optimized): " = stringr::str_replace_all(x, fixed("I"), y),
               "base::mapply (optimized): " = mapply(gsub, "I", y, x, USE.NAMES = FALSE, fixed = TRUE),
               "base::sapply (optimized): " = diag(sapply(y, gsub, pattern = "I", x = x, fixed = TRUE)))
# Unit: microseconds
#                      expr        min          lq        mean      median          uq        max neval cld
#                  stringi:    132.156    137.1165    171.5822    150.3960    194.2345    460.145   100   a
#      stringr (optimized):    801.894    828.7730    947.1813    912.6095    968.7680   2716.708   100   a
# base::mapply (optimized):   2827.104   2946.9400   3211.9614   3031.7375   3123.8940   8216.360   100   a
# base::sapply (optimized): 402349.424 476545.9245 491665.8576 483410.3290 513184.3490 549489.667   100   b
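One plausible reading of the last row: diag(sapply(...)) applies every replacement to every string and then keeps only the diagonal, so it performs n x n substitutions where the other approaches perform n. A small sketch with the question's vectors makes the intermediate matrix visible (the variable name m is only for illustration):

```r
input <- c("I like fruits", "I like you", "I like dudes")
newStrings <- c("You", "We", "She")

# Each column holds all of `input` rewritten with one replacement.
m <- sapply(newStrings, gsub, pattern = "I", x = input, fixed = TRUE)
dim(m)
# [1] 3 3

# Only the diagonal pairs string i with replacement i.
unname(diag(m))
# [1] "You like fruits" "We like you"     "She like dudes"
```

With 1e3 strings, as in the benchmark, that intermediate matrix has a million entries.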