I need to keep the words that are directly enclosed in brackets and delete all the others in the following string:
(a(b(c)d)(e)f)
So the result I expect is (((c))(e)). To delete a, b, d and f, I tried a 'not followed by' (negative lookahead) regex:
str <- "(a(b(c)d)(e)f)"
gsub("([a-z]+)(?!\\))", "", str)  # remove any letters that are not followed by ")"
The error message says my regex is invalid. As far as I can see, the brackets in the second part of the regex, "(?!\))", don't pair up properly: my editor matches the first "(" with the immediately following ")", which is not meant to be the closing bracket (the one to its right is). That is the only error I can make out in my regex. Can you please tell me what is actually wrong? Is there any other way to do this?
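R's regex functions only support lookaheads such as (?!...) and (?=...) when perl=TRUE is set, which is why your pattern is reported as invalid. With that set, you can do it in two steps using positive lookaheads: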
str1 <- gsub("\\([a-z](?=\\()", "\\(", str, perl=TRUE)
str1
# [1] "(((c)d)(e)f)"
str2 <- gsub("\\)[a-z](?=\\))", "\\)", str1, perl=TRUE)
str2
# [1] "(((c))(e))"
Edit: it turns out you can even do it in one step, using a backreference to whichever bracket was matched:
gsub("([\\(\\)])[a-z](?=\\1)", "\\1", str, perl=TRUE)
# [1] "(((c))(e))"
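The patterns above match a single letter ([a-z]), while the question talks about words. A minimal sketch of one way to extend the idea to multi-letter words, assuming words contain only lowercase letters: delete any run of letters that is followed by "(" or preceded by ")". The function name keep_bracketed is hypothetical and used only for illustration. The first call shows that the pattern from the question does compile once perl = TRUE is set, but still does not give the expected result.

```r
str <- "(a(b(c)d)(e)f)"

# The pattern from the question compiles with perl = TRUE, but it only removes
# letters that are NOT followed by ")", so "d", "e" and "f" are left in place.
gsub("([a-z]+)(?!\\))", "", str, perl = TRUE)
# [1] "(((c)d)(e)f)"

# Hypothetical helper (not from the original answer): remove every run of
# lowercase letters that is followed by "(" or preceded by ")".
# Runs sitting directly between "(" and ")" match neither branch and are kept.
keep_bracketed <- function(x) {
  gsub("[a-z]+(?=\\()|(?<=\\))[a-z]+", "", x, perl = TRUE)
}

keep_bracketed(str)
# [1] "(((c))(e))"

keep_bracketed("(ab(cd)ef)")
# [1] "((cd))"
```

Because both the lookahead and the lookbehind are zero-width, the brackets themselves are never consumed, so the replacement can simply be the empty string.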
I agree with @Dason's comment:
st <- "(a(b(c)d)(e)f)"
while(grepl("\\([a-z]+\\(",st)) {
  st <- sub("\\([a-z]+(\\(.+\\))[a-z]+\\)","\\1",st)
}
st
# [1] "(c)(e)"
Written on my iPad :-)