I just started using R again, and I was wondering whether there is a way to replace part of a string using wildcards.
For example:
say I have
S1 <- "aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa"
and I want to replace everything within square brackets with 'x', such that the new string is
"aaaaaaaaa[x]aaaa[x]aaaa"
Is this possible to do in R?
Please note what is in the square bracket can be of variable length.
The sub() and gsub() functions in R replace text that matches a pattern in a character vector: sub() replaces only the first match in each element, while gsub() replaces every match.
A simple regex would be like
\\[.+?\\]
Example http://regex101.com/r/xE1rL1/1
Example Usage
s1 <- 'aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa'
gsub("\\[.+?\\]", "[x]", s1)
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
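Note that the example uses gsub() rather than sub(). A minimal sketch of the difference, reusing the string and pattern from the answer (the variable names first_only and all_matches are made up for illustration):

```r
s1 <- "aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa"

# sub() replaces only the first match in the string
first_only <- sub("\\[.+?\\]", "[x]", s1)
first_only
## [1] "aaaaaaaaa[x]aaaa[bbbbbbb]aaaa"

# gsub() replaces every match
all_matches <- gsub("\\[.+?\\]", "[x]", s1)
all_matches
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
```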
Regular expression
\\[ matches the opening [
.+? matches one or more of any character, non-greedily (as few as possible)
\\] matches the closing ]
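To see why the non-greedy ? matters, here is a small sketch comparing the greedy and non-greedy forms on the same string (the names greedy and lazy are just illustrative). The greedy .+ runs from the first [ to the last ], so both bracketed groups and the text between them collapse into a single replacement:

```r
s1 <- "aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa"

# Greedy: .+ grabs as much as possible, up to the LAST closing bracket
greedy <- gsub("\\[.+\\]", "[x]", s1)
greedy
## [1] "aaaaaaaaa[x]aaaa"

# Non-greedy: .+? stops at the FIRST closing bracket it can
lazy <- gsub("\\[.+?\\]", "[x]", s1)
lazy
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
```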
EDIT
For safety, if the brackets may be empty ([]), the regex can be modified slightly to use * instead of +:
s1 <- 'aaaaaaaaa[]aaaa[bbbbbbb]aaaa'
gsub("\\[.*?\\]", "[x]", s1)
##[1] "aaaaaaaaa[x]aaaa[x]aaaa"
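An alternative that avoids lazy quantifiers altogether is a negated character class, [^]]*, which matches any run of characters other than ]. This is a sketch of a different technique, not the pattern from the answer above, and it assumes the brackets are never nested:

```r
with_text  <- "aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa"
with_empty <- "aaaaaaaaa[]aaaa[bbbbbbb]aaaa"

# \\[     a literal opening bracket
# [^]]*   zero or more characters that are not a closing bracket
# \\]     a literal closing bracket
pattern <- "\\[[^]]*\\]"

res_text  <- gsub(pattern, "[x]", with_text)
res_empty <- gsub(pattern, "[x]", with_empty)

res_text
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
res_empty
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
```

Because the class can never cross a ], the match cannot spill over into the next bracketed group, whether or not the brackets are empty.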
You could also try the qdapRegex package, which has a dedicated function for such problems: rm_square
library(qdapRegex)
S1 <- "aaaaaaaaa[aaaaa]aaaa[bbbbbbb]aaaa"
rm_square(S1, replacement = "[x]")
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"
It works the same for empty brackets:
S1 <- "aaaaaaaaa[]aaaa[bbbbbbb]aaaa"
rm_square(S1, replacement = "[x]")
## [1] "aaaaaaaaa[x]aaaa[x]aaaa"