
Replace entire expression that contains a specific string

I have a data frame with a column containing a large number of file names like:

d <- c("harry11_scott80_norm.avi","harry11_norm.avi","harry11_scott80_lpf.avi", 
       "joel51_lpf.avi","rich82_joel51_lpf.avi")

I want R to replace every file name that contains two people's names, like harry11_scott80_norm.avi, with the string incongruent, and every file name that contains a single person's name, like harry11_norm.avi, with congruent. I could use gsub to do that:

dd <- gsub("harry11_scott80_norm.avi", "incongruent", d) 

but I have a lot of those names, so that would be a very clunky solution. Ideally I want to replace the ENTIRE string whenever it contains a substring like _scott80_ with "incongruent". I thought gsub could do this, but when I run:

dd <- gsub("_scott80_", "incongruent", d)

it returns harry11incongruentnorm.avi, obviously because it replaces only the part of the string that matched. I reckon there is some way to tell gsub to replace the entire string that contains the selected substring, but I can't find it.

There was a related question, "In R, how do I replace a string that contains a certain pattern with another string?", but I am not sure how to use agrep in this context.


EDIT: Side bonus question - based on @GSee's answer, is there any function that lets you pass a list of strings that you want to replace? For example, gsub(c(".*_scott80_.*", ".*_harry11_.*"), "incongruent", d) won't work.

Geek On Acid asked Nov 07 '12 18:11



1 Answer

Here's one way. The .* on each side makes the pattern match the whole string, so the whole string gets replaced:

> gsub(".*_scott80_.*", "incongruent", d)
[1] "incongruent"           "harry11_norm.avi"      "incongruent"          
[4] "joel51_lpf.avi"        "rich82_joel51_lpf.avi"

Or with grep, which returns the indices of the matching elements:

> d[grep("_scott80_", d)] <- "incongruent"
> d
[1] "incongruent"           "harry11_norm.avi"      "incongruent"          
[4] "joel51_lpf.avi"        "rich82_joel51_lpf.avi"
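The same idea can label every file name in one pass, which is what the question ultimately asks for. This is only a sketch under an assumption that is not stated in the question: that each file name has the layout name_condition.ext or name_name_condition.ext, and that a person's name never contains an underscore. The variable names two_names and label are made up for illustration.

```r
d <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
       "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# TRUE when the string starts with two underscore-terminated chunks,
# i.e. two names come before the condition part (assumed layout).
two_names <- grepl("^[^_]+_[^_]+_", d)

# Label each element by its own test result instead of naming every person.
label <- ifelse(two_names, "incongruent", "congruent")
label
```

If the real file names break the assumed layout (for example, a name with an underscore in it), the pattern would need adjusting.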

To address your edit, I believe this will do it (using | to mean "or")

gsub(".*(_scott80_|_harry11_).*", "incongruent", d)

Of course, you don't have any strings in d that match "_harry11_", since harry11 only appears at the start of a file name, with no underscore before it.
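If the strings to look for live in a vector, the alternation can be built with paste rather than typed by hand. A minimal sketch, assuming the targets contain no regex metacharacters (otherwise they would need escaping first); targets and pattern are illustrative names, and "_joel51_" is used here only because it actually occurs in d:

```r
d <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
       "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# Hypothetical list of substrings that should trigger the replacement.
targets <- c("_scott80_", "_joel51_")

# Collapse into ".*(_scott80_|_joel51_).*" so any one of them matches.
pattern <- paste0(".*(", paste(targets, collapse = "|"), ").*")

result <- gsub(pattern, "incongruent", d)
result
```

Note that "joel51_lpf.avi" is left alone here, because it contains "joel51_" but not "_joel51_".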

GSee answered Oct 29 '22 16:10
