I have a data frame that has a column with a large number of file names like:
d <- c("harry11_scott80_norm.avi","harry11_norm.avi","harry11_scott80_lpf.avi",
"joel51_lpf.avi","rich82_joel51_lpf.avi")
I want R to replace all expressions with two people's names, like harry11_scott80_norm.avi, with the expression incongruent, and all the ones with a single person's name, like harry11_norm.avi, with congruent. I could use gsub to do that:
dd <- gsub("harry11_scott80_norm.avi", "incongruent", d)
but I have a lot of those names, so that would be a very clunky solution. Ideally, I want to replace the ENTIRE expression that contains a string like _scott80_ with "incongruent". I thought that gsub could do this, but when I run it:
dd <- gsub("_scott80_", "incongruent", d)
it returns harry11incongruentnorm.avi, which is obviously because it simply replaces the exact string match. I reckon there is some way to tell gsub to replace the entire expression that contains the selected string, but I can't find it.
There was a question, "In R, how do I replace a string that contains a certain pattern with another string?", but I am not sure how to use agrep in this context.
EDIT: Side bonus question: based on @GSee's answer, is there any function that allows you to pass a list of strings that you want to replace? For example, gsub(c(".*_scott80_.*", ".*_harry11_.*"), "incongruent", d) won't work.
Here's one way:
> gsub(".*_scott80_.*", "incongruent", d)
[1] "incongruent" "harry11_norm.avi" "incongruent"
[4] "joel51_lpf.avi" "rich82_joel51_lpf.avi"
Or with grep:
> d[grep("_scott80_", d)] <- "incongruent"
> d
[1] "incongruent" "harry11_norm.avi" "incongruent"
[4] "joel51_lpf.avi" "rich82_joel51_lpf.avi"
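Both calls above only label the files that contain _scott80_. If the goal is to label every file as either congruent or incongruent in one pass, something like the following might work. It is only a sketch, and it assumes that every person name is lowercase letters followed by digits (as in harry11 or scott80) and that names come first in the file name; the names two_names and label are made up for illustration.

```r
d <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
       "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# Assumption: a person name is lowercase letters followed by digits.
# The pattern matches file names that start with two such names in a row.
two_names <- "^[a-z]+[0-9]+_[a-z]+[0-9]+_"

# grepl() returns TRUE/FALSE per element; ifelse() picks the label.
label <- ifelse(grepl(two_names, d), "incongruent", "congruent")
label
```

Because grepl returns a logical vector rather than positions, it plugs directly into ifelse, and the original d is left untouched.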
To address your edit, I believe this will do it (using | to mean "or"):
gsub(".*(_scott80_|_harry11_).*", "incongruent", d)
Of course, you don't have any strings in d that match "_harry11_", because harry11 only appears at the start of a file name, without a leading underscore.
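If the strings to look for are stored in a vector, the same alternation can probably be built with paste instead of being typed by hand. This is a sketch rather than a definitive solution: the vector targets and its contents are hypothetical, and it assumes the strings contain no regex metacharacters that would need escaping.

```r
d <- c("harry11_scott80_norm.avi", "harry11_norm.avi", "harry11_scott80_lpf.avi",
       "joel51_lpf.avi", "rich82_joel51_lpf.avi")

# Hypothetical list of substrings that mark a file name as incongruent.
targets <- c("_scott80_", "_joel51_")

# Collapse the vector into one pattern: ".*(_scott80_|_joel51_).*"
pattern <- paste0(".*(", paste(targets, collapse = "|"), ").*")

# Any element containing one of the targets is replaced in full.
dd <- gsub(pattern, "incongruent", d)
dd
```

Note that "joel51_lpf.avi" is left alone here, because it contains joel51_ but not _joel51_ with the leading underscore.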