
How to gsub string with any partially matched string

Tags: regex, r

I have a character vector:

cc <- c("Bacter;httyh;ttyyyt", "Bacteria;hhhdh;hhgt;hhhg", "Bacter;hhhhdj;gg;dd", "Bactr;hhhg;ggj", "Bctg;hhgg;hhj")

I would like to replace any text that starts with Bact, up to the first ;, with Bctr.

I tried: gsub("[Bact*]+;", "Bctr", cc)

So, the result I would like is

Bctr;httyh;ttyyyt, Bctr;hhhdh;hhgt;hhhg, Bctr;hhhhdj;gg;dd, Bctr;hhhg;ggj, Bctg;hhgg;hhj

Any suggestions on what I am missing here?

Yamuna_dhungana asked Oct 15 '22 11:10


2 Answers

We can use sub to replace everything from "Bact" up to and including the first semicolon with "Bctr;":

sub("Bact.*?;", "Bctr;", cc)
#[1] "Bctr;httyh;ttyyyt"    "Bctr;hhhdh;hhgt;hhhg" "Bctr;hhhhdj;gg;dd"   
#[4] "Bctr;hhhg;ggj"        "Bctg;hhgg;hhj"

*? makes the match lazy, so it consumes as few characters as possible. Here it stops at the first semicolon after "Bact".

The difference becomes clear if we remove the ? from the pattern.

sub("Bact.*;", "Bctr;", cc)
#[1] "Bctr;ttyyyt"   "Bctr;hhhg"     "Bctr;dd"       "Bctr;ggj"     
#[5] "Bctg;hhgg;hhj"

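Now the match is greedy and runs to the last semicolon in each element of cc. In both cases the last element is unchanged because it does not contain "Bact".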
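
As for what the pattern in the question was missing: inside square brackets, [Bact*] is a character class, so it matches any single one of the characters B, a, c, t or a literal *, not the word "Bact" followed by anything. The sketch below illustrates this and also shows a possible variant that avoids the lazy quantifier by using a negated class. It assumes, as in the sample data, that "Bact" always sits at the start of the string; the names attempt and anchored are only for illustration.

```r
cc <- c("Bacter;httyh;ttyyyt", "Bacteria;hhhdh;hhgt;hhhg",
        "Bacter;hhhhdj;gg;dd", "Bactr;hhhg;ggj", "Bctg;hhgg;hhj")

# [Bact*]+ is a run of the single characters B, a, c, t or *, so this only
# fires where such a character sits directly before a semicolon
# (for example the "a;" in "Bacteria;"). Most elements are left untouched.
attempt <- gsub("[Bact*]+;", "Bctr", cc)
print(attempt)

# Anchored variant (assumption: the prefix starts the string):
# "Bact", then any run of non-semicolon characters, then the first semicolon.
anchored <- sub("^Bact[^;]*;", "Bctr;", cc)
print(anchored)
```

Because [^;]* cannot cross a semicolon, this variant stops at the first one whether or not the quantifier is greedy.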

Ronak Shah answered Nov 01 '22 12:11


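# For elements containing "Bact", keep the text from the first ";" onward
# and prepend "Bctr"; this relies on "Bact" starting at position 1.
ifelse(grepl("Bact", cc),
       paste0("Bctr", substring(cc,
                                attr(regexpr("Bact.*?;", cc), "match.length"),
                                nchar(cc))),
       cc)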
#[1] "Bctr;httyh;ttyyyt"    "Bctr;hhhdh;hhgt;hhhg" "Bctr;hhhhdj;gg;dd"   
#[4] "Bctr;hhhg;ggj"        "Bctg;hhgg;hhj"  
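
To see why this works, it may help to look at the intermediate values. The following is a minimal sketch, assuming (as in the sample data) that "Bact" always starts at position 1, so the match length equals the position of the first semicolon. The names m, len, tail_part and res are illustrative only.

```r
cc <- c("Bacter;httyh;ttyyyt", "Bacteria;hhhdh;hhgt;hhhg",
        "Bacter;hhhhdj;gg;dd", "Bactr;hhhg;ggj", "Bctg;hhgg;hhj")

# regexpr() gives the start of the first match in each element (-1 if none);
# the length of each match is stored in the "match.length" attribute.
m <- regexpr("Bact.*?;", cc)
len <- attr(m, "match.length")
print(len)

# Keep the text from the first semicolon onward ...
tail_part <- substring(cc, len, nchar(cc))

# ... and prepend "Bctr" only where the pattern matched.
res <- ifelse(len > 0, paste0("Bctr", tail_part), cc)
print(res)
```

If "Bact" could appear later in the string, the match length would no longer line up with the semicolon's position, and the start position in m would have to be taken into account as well.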

d.b answered Nov 01 '22 12:11