R's equivalent of string.replace() in python

I need to replace certain values of a character vector:

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x
   Strings
1      one
2      two
3    three
4     four
5     five
6     four
7     five
8     four
9     five
10     two
11    thre
12     two
13   three
14     two
15   three

In Python (pandas), I would do:

x["Strings"].replace(["one", "two", "thre","three"], ["One","Two","Three","Three"], inplace=True)

But in R, the replace() function doesn't work in the same straightforward way. There are plenty of solutions for string replacement on Stack Overflow, but none with this simplicity. Is this possible in R?

Chris asked Feb 20 '19 15:02



1 Answer

If all you want is to capitalize the first letter of each string, you can use sub() with perl = TRUE, which enables the \U case-conversion escape:

x$new <- sub('^([a-z])', '\\U\\1', x$Strings, perl = TRUE)

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  Four
5     five  Five
6     four  Four
7     five  Five
8     four  Four
9     five  Five
10     two   Two
11    thre  Thre
12     two   Two
13   three Three
14     two   Two
15   three Three
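
The same capitalization can also be sketched in base R without the Perl-style escape, by splitting off the first character with substring() and upper-casing it with toupper(). The helper name capitalize_first is hypothetical and introduced only for illustration. Unlike the regex above, it upper-cases any leading letter, not just a-z.

```r
# Hypothetical helper: upper-case the first character of each string.
capitalize_first <- function(s) {
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}

strings <- c("one", "two", "thre", "four")
capitalized <- capitalize_first(strings)
print(capitalized)
# [1] "One"  "Two"  "Thre" "Four"
```

Note that "thre" becomes "Thre", not "Three": capitalization alone does not fix the misspelling.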

If you already have vectors of old and new words, you can use str_replace_all() from stringr, which is similar in style to the Python example in the question:

library(stringr)

pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

named_vec <- setNames(replacements, paste0("\\b", pattern, "\\b"))

x$new <- str_replace_all(x$Strings, named_vec)
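
The \\b word boundaries matter because the pattern "thre" is also a prefix of "three". The sketch below illustrates this with base R only, assuming the pairs are applied one after another in the order given; the helper replace_in_order is hypothetical and merely emulates that behaviour with gsub() so the example needs no packages.

```r
# Hypothetical helper: apply each pattern/replacement pair in turn with gsub().
replace_in_order <- function(s, patterns, replacements) {
  for (i in seq_along(patterns)) {
    s <- gsub(patterns[i], replacements[i], s, perl = TRUE)
  }
  s
}

pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")
input <- c("thre", "three", "four")

# Without boundaries, "thre" also matches inside "three".
unbounded <- replace_in_order(input, pattern, replacements)
print(unbounded)
# [1] "Three"  "Threee" "four"

# With \\b on both sides, only whole words match.
bounded <- replace_in_order(input, paste0("\\b", pattern, "\\b"), replacements)
print(bounded)
# [1] "Three" "Three" "four"
```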

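or with match() or hashmap, falling back to the original string where there is no match:

library(dplyr)

# match() returns NA for strings not found in pattern;
# coalesce() fills those positions with the original value
x$new <- coalesce(replacements[match(x$Strings, pattern)], x$Strings)


library(hashmap)

# Lookups for keys that are not in the map return NA, so the same fallback applies
hash_lookup <- hashmap(pattern, replacements)
x$new <- coalesce(hash_lookup[[x$Strings]], x$Strings)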

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  four
5     five  five
6     four  four
7     five  five
8     four  four
9     five  five
10     two   Two
11    thre Three
12     two   Two
13   three Three
14     two   Two
15   three Three
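
For completeness, the match() lookup can also be sketched in base R alone, with ifelse() standing in for coalesce(). This is a minimal variant under the assumption that whole-string replacement is what is wanted; the variable names strings, idx and result are illustrative.

```r
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")
strings <- c("one", "two", "thre", "three", "four")

# Position of each string in pattern; NA where there is no exact match.
idx <- match(strings, pattern)

# Keep the original string where no replacement is defined.
result <- ifelse(is.na(idx), strings, replacements[idx])
print(result)
# [1] "One"   "Two"   "Three" "Three" "four"
```

Because match() compares whole strings, no word-boundary handling is needed here.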
acylam answered Oct 13 '22 20:10