I am looking for a regex to replace all terms in parentheses unless the parentheses are within square brackets.
e.g.
(matches) #match
[(do not match)] #should not match
[[does (not match)]] #should not match
I current have:
[^\]]\([^()]*\) #Not a square bracket, an opening bracket, any non-bracket character and a closing bracket.
However this is still matching words within the square brackets.
I have also created a rubular page of my progress so far: http://rubular.com/r/gG22pFk2Ld
A regex is not going to cut it for you if you can nest the square brackets (see this related question).
I think you can only do this with a regex if (a) you only allow one level of square brackets and (b) you assume all square brackets are properly matched. In that case
\([^()]*\)(?![^\[]*])
is sufficient - it matches any parenthesised expression not followed by an unpaired ]
. You need (b) because of the limitations of negative lookbehind (only fixed length strings in 1.9, and not allowed at all in 1.8), which mean you are stuck matching (match)]
even if you don't want to.
So basically if you need to nest, or to allow unmatched brackets, you should ditch the regex and look at the answer to the question I linked to above.
This is a type of expression you cannot parse using a pure-regex approach, because you need to keep track of the current nesting/state_if_in_square_bracket (so you don't have a type 3 language anymore).
However, depending on the exact circumstances, you can parse it with multiple regexes or simple parsers. Example approaches:
[
/[[
or ]
/]]
, change the state
when such a square bracket is
encountered, replace () in a
sub-string if in
"not_in_square_bracket" stateThe complexity of your solution also depends on the detail if escaping ] is allowed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With