
Replace multiple letters with accents with gsub

Tags: regex, r, gsub

Of course, I could replace specific characters like this:

    mydata=c("á","é","ó")
    mydata=gsub("á","a",mydata)
    mydata=gsub("é","e",mydata)
    mydata=gsub("ó","o",mydata)
    mydata

But surely there is an easier way to do this all in one line, right? I don't find the gsub help very comprehensive on this.

Joschi asked Mar 06 '13 17:03


4 Answers

Use the character translation function, chartr():

chartr("áéó", "aeo", mydata)
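As a slightly larger sketch of the same idea: chartr() replaces each character of its first argument with the character at the same position in the second, so the second string must be at least as long as the first. The sample words and variable names below are made up for illustration, and the example assumes a UTF-8 session.

```r
# Each character in `accented` maps to the character at the same
# position in `plain`.
accented <- "áéíóúÁÉÍÓÚ"
plain    <- "aeiouAEIOU"

# Hypothetical sample data, not from the question.
words <- c("canción", "José", "Ávila")

result <- chartr(accented, plain, words)
print(result)
```

Because the mapping is strictly one character to one character, this approach cannot turn a single letter into several (for example "ß" into "ss").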
kith answered Oct 13 '22 15:10


An interesting question! I think the simplest option is to devise a special function, something like a "multi" gsub():

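mgsub <- function(pattern, replacement, x, ...) {
  # pattern and replacement are parallel vectors: pattern[i] becomes replacement[i]
  if (length(pattern) != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result <- x
  # seq_along() is safe for zero-length input, unlike 1:length(pattern)
  for (i in seq_along(pattern)) {
    # extra arguments (e.g. fixed = TRUE) are passed on to gsub()
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}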

Which gives me:

> mydata <- c("á","é","ó")
> mgsub(c("á","é","ó"), c("a","e","o"), mydata)
[1] "a" "e" "o"
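One place where looping over gsub() may be preferable to a one-to-one character mapping is when a replacement is longer than the character it replaces. The following is only a sketch of that case: `replace_all` and `map` are hypothetical names, the sample strings are invented, and `fixed = TRUE` is assumed so that patterns are treated as literal text rather than regular expressions.

```r
# Hypothetical helper: apply several literal replacements in order.
# `map` is a named character vector: names are the patterns,
# values are the replacements.
replace_all <- function(x, map) {
  Reduce(
    function(acc, i) gsub(names(map)[i], map[[i]], acc, fixed = TRUE),
    seq_along(map),
    x  # initial value of the accumulator
  )
}

map <- c("ß" = "ss", "æ" = "ae", "é" = "e")
cleaned <- replace_all(c("straße", "æther", "café"), map)
print(cleaned)
```

Replacements are applied one after another, so an earlier replacement's output can be matched by a later pattern; order the map accordingly.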
Theodore Lytras answered Oct 13 '22 14:10


Maybe this can be useful:

iconv('áéóÁÉÓçã', to="ASCII//TRANSLIT")
[1] "aeoAEOca"
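Note that the result of "ASCII//TRANSLIT" depends on the platform's iconv implementation, so the output above is not guaranteed everywhere. A cautious sketch, assuming the input is UTF-8 and that falling back to the original string is acceptable when conversion fails (the variable names are illustrative):

```r
x <- c("áéó", "ÁÉÓ", "çã")

# State the source encoding explicitly instead of relying on the locale.
ascii <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")

# iconv() returns NA for elements it could not convert;
# keep the original text for those.
failed <- is.na(ascii)
ascii[failed] <- x[failed]

print(ascii)
```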
Rcoster answered Oct 13 '22 16:10


You can use the stringi package to replace these characters.

> library(stringi)
> stri_trans_general(c("á","é","ó"), "latin-ascii")

[1] "a" "e" "o"
Maciej answered Oct 13 '22 16:10