I've got data, a character vector (eventually I'll collapse it, so I don't care if it stays a vector or if it's treated as a single string), a vector of patterns, and a vector of replacements. I want each pattern in the data to be replaced by its respective replacement. I got it done with stringr and a for loop, but is there a more R-like way to do it?
require(stringr)
start_string <- sample(letters[1:10], 10)
my_pattern <- c("a", "b", "c", "z")
my_replacement <- c("[this was an a]", "[this was a b]", "[this was a c]", "[no z!]")
str_replace(start_string, pattern = my_pattern, replacement = my_replacement)
# bad lengths, doesn't work
str_replace(paste0(start_string, collapse = ""),
            pattern = my_pattern, replacement = my_replacement)
# vector output, not what I want in this case
my_result <- start_string
for (i in seq_along(my_pattern)) {
  my_result <- str_replace(my_result,
                           pattern = my_pattern[i], replacement = my_replacement[i])
}
> my_result
[1] "[this was a c]" "[this was an a]" "e" "g" "h" "[this was a b]"
[7] "d" "j" "f" "i"
# This is what I want, but is there a better way?
In my case, I know each pattern will occur at most once, but not every pattern will occur. I know I could use str_replace_all if patterns might occur more than once; I hope a solution would also provide that option. I'd also like a solution that uses my_pattern and my_replacement so that it could be part of a function with those vectors as arguments.
I'll bet there's another way to do this, but my first thought was gsubfn:
my_repl <- function(x) {
  switch(x,
         a = "[this was an a]",
         b = "[this was a b]",
         c = "[this was a c]",
         z = "[this was a z]")
}
library(gsubfn)
start_string <- sample(letters[1:10], 10)
gsubfn("a|b|c|z",my_repl,x = start_string)
If the patterns you are searching for are valid names for list elements, this will also work:
names(my_replacement) <- my_pattern
gsubfn("a|b|c|z",as.list(my_replacement),start_string)
Edit
But frankly, if I really had to do this a lot in my own code, I would probably just do the for loop thing, wrapped in a function. Here's a simple version using sub and gsub rather than the functions from stringr:
vsub <- function(pattern, replacement, x, all = TRUE, ...) {
  # gsub replaces every occurrence, sub only the first
  FUN <- if (all) gsub else sub
  for (i in seq_len(min(length(pattern), length(replacement)))) {
    x <- FUN(pattern = pattern[i], replacement = replacement[i], x, ...)
  }
  x
}
vsub(my_pattern,my_replacement,start_string)
But of course, one of the reasons that there isn't a well-known built-in function for this is probably that sequential replacements like this can be pretty fragile, because they are so order dependent:
vsub(rev(my_pattern),rev(my_replacement),start_string)
[1] "i" "[this w[this was an a]s [this was an a] c]"
[3] "[this was an a]" "g"
[5] "j" "d"
[7] "f" "[this w[this was an a]s [this was an a] b]"
[9] "h" "e"
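To see the order dependence in isolation, here is a smaller sketch in base R. The toy string and the two replacement rules are made up for illustration and are not from the question. Each later pass sees the output of the earlier one, so swapping the order changes the result:

```r
x <- "ab"

# Rule order 1: a -> b first, then b -> c.
# The "b" produced by the first pass is rewritten by the second pass.
forward <- gsub("b", "c", gsub("a", "b", x))

# Rule order 2: b -> c first, then a -> b.
# The "b" produced by the second pass is never revisited.
backward <- gsub("a", "b", gsub("b", "c", x))

forward   # "cc"
backward  # "bc"
```

The same thing happens in the reversed vsub call above: the "a" rule runs last and rewrites every "a" inside the replacement text that the "b" and "c" rules already inserted.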
Here's an option based on gregexpr, regmatches, and regmatches<-. Do be aware that there are limits to the length of regular expressions that can be matched, so this won't work if you try to match too many long patterns with it.
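replaceSubstrings <- function(patterns, replacements, X) {
  # One alternation regex, so all patterns are matched in a single pass
  pat <- paste(patterns, collapse = "|")
  m <- gregexpr(pat, X)
  # Look up each matched substring in patterns to pick its replacement
  regmatches(X, m) <-
    lapply(regmatches(X, m),
           function(XX) replacements[match(XX, patterns)])
  X
}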
## Try it out
patterns <- c("cat", "dog")
replacements <- c("tiger", "coyote")
sentences <- c("A cat", "Two dogs", "Raining cats and dogs")
replaceSubstrings(patterns, replacements, sentences)
## [1] "A tiger" "Two coyotes"
## [3] "Raining tigers and coyotes"
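One thing to watch with this approach: match() looks up the matched text literally, so it relies on the patterns being plain strings rather than regular expressions. If a pattern contains a metacharacter such as ".", the regex can match text that match() then fails to find. A possible workaround, sketched below, is to quote each pattern with \Q...\E, which works with perl = TRUE. The function name replaceLiteral and the test inputs are my own and hypothetical, and the sketch assumes no pattern contains "\E" itself:

```r
# Hypothetical variant of replaceSubstrings() that treats patterns as literal text.
replaceLiteral <- function(patterns, replacements, X) {
  # \Q...\E quotes everything in between; requires perl = TRUE.
  # Assumption: no pattern itself contains the sequence "\E".
  pat <- paste0("\\Q", patterns, "\\E", collapse = "|")
  m <- gregexpr(pat, X, perl = TRUE)
  # Matched text is now always identical to one of the patterns,
  # so the literal lookup with match() cannot miss.
  regmatches(X, m) <- lapply(regmatches(X, m),
                             function(XX) replacements[match(XX, patterns)])
  X
}

literal_result <- replaceLiteral(c("a.c", "dog"), c("X", "coyote"),
                                 c("a.c abc", "Two dogs"))
literal_result
## [1] "X abc"       "Two coyotes"
```

Here "abc" is left alone because "a.c" is matched as literal text, not as "a, any character, c".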