
str_replace_all replacing named vector elements iteratively not all at once

Let's say I have a long character string: pneumonoultramicroscopicsilicovolcanoconiosis. I'd like to use stringr::str_replace_all to replace certain letters with others. According to the documentation, str_replace_all can take a named vector and replace each name with its value. That works fine for one replacement, but with several it seems to apply them iteratively, so each replacement acts on the output of the previous one rather than on the original string. I'm not sure this is the intended behaviour.

library(tidyverse)
text_string = "developer"
text_string %>% 
  str_replace_all(c(e = "X")) # this works fine
#> [1] "dXvXlopXr"
text_string %>% 
  str_replace_all(c(e = "p", p = "e")) # not intended behaviour
#> [1] "develoeer"

Desired result:

[1] "dpvploepr"

Which I get by introducing a new character:

text_string %>% 
  str_replace_all(c(e ="X", p = "e", X = "p"))

It's a usable workaround but hardly generalisable. Is this a bug or are my expectations wrong?

I'd like to also be able to replace n letters with n other letters simultaneously, preferably using either two vectors (like "old" and "new") or a named vector as input.

reprex edited for easier human reading

biomiha asked Jan 09 '18 13:01


2 Answers

I'm working on a package to deal with this type of problem. It is safer than the qdap::mgsub function because it does not rely on placeholders. It fully supports regular expressions in both the pattern and the replacement. You provide a named list where the names are the strings to match and the values are their replacements.

devtools::install_github("bmewing/mgsub")
library(mgsub)
mgsub("developer",list("e" ="p", "p" = "e"))
#> [1] "dpvploepr"

qdap::mgsub(c("e","p"),c("p","e"),"developer")
#> [1] "dpvploppr"
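
The underlying idea, matching every pattern against the original string rather than against the output of an earlier replacement, can be sketched in base R. This is not how the mgsub package is implemented. `swap_fixed` is a hypothetical helper written for illustration, and it assumes the names in `map` are plain strings without regex metacharacters.

```r
# Hypothetical helper for illustration; not part of mgsub, qdap or stringr.
# `map` is a named character vector: names are matched, values replace them.
# Assumption: the names contain no regex metacharacters.
swap_fixed <- function(x, map) {
  # Put longer keys first so that "ab" is tried before "a" at the same position.
  keys <- names(map)[order(nchar(names(map)), decreasing = TRUE)]
  pattern <- paste(keys, collapse = "|")
  # Locate every match in the original string in a single pass ...
  hits <- gregexpr(pattern, x, perl = TRUE)
  # ... then replace each matched piece by looking it up in `map`.
  regmatches(x, hits) <- lapply(regmatches(x, hits), function(m) unname(map[m]))
  x
}

swap_fixed("developer", c(e = "p", p = "e"))
#> [1] "dpvploepr"
```

Because all positions are found before anything is substituted, no placeholder character is needed and a replacement can never be picked up by a later pattern.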
Mark answered Sep 22 '22 20:09


My workaround would be to take advantage of the fact that str_replace_all accepts a function as the replacement.

library(stringr)
text_string <- "developer"
pattern <- "p|e"
# Lookup table: names are the matched strings, values are their replacements.
lookup <- c(e = "p", p = "e")
# Vectorised, so it works whether stringr passes one match or all matches at once.
fun <- function(query) unname(lookup[query])

str_replace_all(text_string, pattern, fun)
#> [1] "dpvploepr"

Of course, if you need to scale up, extend the lookup table and the pattern accordingly.
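
For the specific case in the question, where every "old" and "new" item is a single character, base R's `chartr()` already translates all characters in one pass. A minimal sketch using two vectors as input (the names `old` and `new` are only illustrative):

```r
old <- c("e", "p")   # characters to replace
new <- c("p", "e")   # their replacements, in the same order

# chartr() expects two strings of equal length, so collapse the vectors first.
result <- chartr(paste(old, collapse = ""), paste(new, collapse = ""), "developer")
result
#> [1] "dpvploepr"
```

This only covers one-to-one character swaps. For multi-character strings or regular expressions, a function-based replacement like the one above is still needed.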

Benjamin Schwetz answered Sep 23 '22 20:09