I need to replace certain values of a character vector:
x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x
Strings
1 one
2 two
3 three
4 four
5 five
6 four
7 five
8 four
9 five
10 two
11 thre
12 two
13 three
14 two
15 three
In Python, I would do:
x["Strings"].replace(["one", "two", "thre","three"], ["One","Two","Three","Three"], inplace=True)
But in R, the replace() function doesn't work in the same easy way. There are plenty of solutions for string replacement on Stack Overflow, but none with this simplicity. Is this possible in R?
If all you want to do is capitalize the first letter of every word, we can use sub:
x$new <- sub('^([a-z])', '\\U\\1', x$Strings, perl = TRUE)
Output:
Strings new
1 one One
2 two Two
3 three Three
4 four Four
5 five Five
6 four Four
7 five Five
8 four Four
9 five Five
10 two Two
11 thre Thre
12 two Two
13 three Three
14 two Two
15 three Three
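The same capitalization can be sketched in base R without the Perl-style \\U escape, in case that reads more easily. The helper name cap_first is hypothetical (it does not come from any package), and the sketch does not treat NA values specially:

```r
# Hypothetical helper: upper-case the first character of each string
# and paste the rest of the string back on unchanged.
cap_first <- function(s) {
  paste0(toupper(substring(s, 1, 1)), substring(s, 2))
}

words <- c("one", "two", "thre", "four")
capitalized <- cap_first(words)
print(capitalized)
```

Like the sub call, this only changes the case of the first letter, so the misspelled "thre" becomes "Thre" rather than "Three".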
If there is already a list of old and new words for replacement, we can use str_replace_all, which has a (kind of) similar style to the Python example the OP posted:
library(stringr)
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")
named_vec <- setNames(replacements, paste0("\\b", pattern, "\\b"))
x$new <- str_replace_all(x$Strings, named_vec)
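The \\b word boundaries around each pattern matter because "thre" is also the beginning of "three". A minimal base R sketch of the difference, using gsub rather than stringr purely for illustration:

```r
# Without word boundaries, "thre" matches the first four letters of
# "three", so the trailing "e" is left behind after the replacement.
no_boundary <- gsub("thre", "Three", "three")

# With \\b on both sides, "thre" only matches when it is a whole word.
with_boundary <- gsub("\\bthre\\b", "Three", c("thre", "three"), perl = TRUE)

print(no_boundary)    # "Threee"
print(with_boundary)  # "Three" "three"
```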
Alternatively, with match or hashmap:
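library(dplyr)
# match() gives the position of each string in pattern (NA if absent);
# coalesce() falls back to the original string where there is no match
x$new <- coalesce(replacements[match(x$Strings, pattern)], x$Strings)
library(hashmap)
# same idea with a hash lookup: missing keys come back as NA
hash_lookup <- hashmap(pattern, replacements)
x$new <- coalesce(hash_lookup[[x$Strings]], x$Strings)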
Output:
Strings new
1 one One
2 two Two
3 three Three
4 four four
5 five five
6 four four
7 five five
8 four four
9 five five
10 two Two
11 thre Three
12 two Two
13 three Three
14 two Two
15 three Three
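If no extra packages are wanted, the match approach can be wrapped in a small base R function that is called much like the Python replace in the question. This is only a sketch: replace_values is a hypothetical name, and it replaces exact whole-string matches only, leaving everything else untouched:

```r
# Hypothetical helper: swap each element of x that equals old[i] for new[i].
replace_values <- function(x, old, new) {
  stopifnot(length(old) == length(new))
  idx <- match(x, old)      # position of each element in `old`, NA if absent
  hit <- !is.na(idx)        # TRUE where a replacement exists
  x[hit] <- new[idx[hit]]
  x
}

x <- data.frame(Strings = c("one", "two", "three", "four", "thre"),
                stringsAsFactors = FALSE)
x$Strings <- replace_values(x$Strings,
                            c("one", "two", "thre", "three"),
                            c("One", "Two", "Three", "Three"))
print(x$Strings)
```

Because match compares whole strings, no word-boundary handling is needed here: "thre" and "three" are simply two different keys.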