
Using gsub for a specific occurrence in a string in R?

Tags: r, gsub

I have two strings:

mystring1 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  dogs are the best animal.  not cats!")

mystring2 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  but i have a cat friend that is a cat.")

I want to change the third occurrence of the word cat in both strings to dog.

Ideally, mystring1 and mystring2 would read:

mystring1
[1] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  dogs are the best animal.  not cats!"

mystring2
[1] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  but i have a cat friend that is a cat."

What is the best way of doing this? Up until now I have only used gsub to replace every match of a pattern, and I don't know whether it can be used to replace one specific occurrence.

icedcoffee asked Jun 15 '20 13:06


2 Answers

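You could use sub() with a Perl-compatible regular expression. Here mystring2 is extended with a few extra occurrences to show that only the third one changes: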

mystring1 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  dogs are the best animal.  not cats!")
mystring2 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  but i have a cat friend that is a cat who knows a cat knowing a cat.")

sub("((cat.*?){2})\\bcat\\b", "\\1dog", c(mystring1, mystring2), perl=TRUE)

The group ((cat.*?){2}) lazily consumes the first two occurrences and the text up to the third one, and \\1 puts that text back in front of dog. This gives

[1] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  dogs are the best animal.  not cats!"                                
[2] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  but i have a cat friend that is a cat who knows a cat knowing a cat."
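
One caveat about the pattern above: only the third cat is wrapped in \\b, so the first two occurrences can also match inside longer words such as cats. As a minimal sketch (the helper name replace_nth_word and its arguments are hypothetical, not part of the answer), the same sub() idea can be generalized to the n-th whole-word occurrence:

```r
# Hypothetical helper: replace the n-th whole-word occurrence of `word`.
# Assumes `word` contains no regex metacharacters and `replacement`
# contains no backslashes.
replace_nth_word <- function(x, word, replacement, n) {
  stopifnot(n >= 1)
  w <- paste0("\\b", word, "\\b")
  # (?s) lets . match newlines; the capture group lazily spans the first
  # n - 1 whole-word occurrences and the text leading up to the n-th one.
  pattern <- sprintf("(?s)^((?:.*?%s){%d}.*?)%s", w, as.integer(n - 1), w)
  sub(pattern, paste0("\\1", replacement), x, perl = TRUE)
}

x <- "cats chase a cat, a cat, and a cat"
replace_nth_word(x, "cat", "dog", 3)
# [1] "cats chase a cat, a cat, and a dog"
```

Strings with fewer than n whole-word occurrences are returned unchanged, because sub() leaves its input alone when the pattern does not match.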
Martin Gal answered Sep 30 '22 16:09


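You can use gsubfn with a proto object as the replacement. gsubfn maintains a match counter, count, inside the object for each input string, so fun can replace only the third match and return every other match unchanged: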

library(gsubfn)
p <- proto(fun = function(this, x) if(count == 3) 'dog' else x)
gsubfn('cat', p, c(mystring1, mystring2))

# [1] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  dogs are the best animal.  not cats!"  
# [2] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  but i have a cat friend that is a cat."

Or, if cat should only match as a whole word (surrounded by word boundaries),

gsubfn('\\bcat\\b', p, c(mystring1, mystring2), perl = TRUE)

# [1] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  dogs are the best animal.  not cats!"  
# [2] "hello i am a cat.  just kidding, i'm not a cat i'm a dog.  but i have a cat friend that is a cat."
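
If adding the gsubfn dependency is not an option, the same count-the-matches idea can be sketched in base R with gregexpr() and the replacement form of regmatches(). The helper name replace_nth is hypothetical and this is only one possible way to write it:

```r
# Hypothetical base-R helper: replace only the n-th match of `pattern`
# in each element of `x`. Extra arguments (e.g. perl = TRUE) are passed
# on to gregexpr().
replace_nth <- function(x, pattern, replacement, n, ...) {
  m <- gregexpr(pattern, x, ...)      # positions of all matches, per string
  hits <- regmatches(x, m)            # list of matched substrings
  hits <- lapply(hits, function(h) {
    if (length(h) >= n) h[n] <- replacement  # swap only the n-th match
    h
  })
  regmatches(x, m) <- hits            # write the matches back into x
  x
}

mystring1 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  dogs are the best animal.  not cats!")
mystring2 <- c("hello i am a cat.  just kidding, i'm not a cat i'm a cat.  but i have a cat friend that is a cat.")

replace_nth(c(mystring1, mystring2), "\\bcat\\b", "dog", 3, perl = TRUE)
```

Because the matches are counted per string, elements with fewer than n matches come back unchanged.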
IceCreamToucan answered Sep 30 '22 16:09