Assuming we have a string containing different parentheses (by parentheses in this context I mean (
, [
, and {
) the string could also contain other content like so {[balanced(parenthesis)]}
but the rest of the content will largely be ignored. Would it be possible to use regular expression to control that all different parenthesis are:
It turns out this problem has been solved by Kobi in this blog post of his. For simply balancing different types of brackets, the solution is quite simple and elegant, and not as crazy as the answers in this question, which has a more complex grammar.
Let me quote the essential part of the blog post below:
[...] who says I must push what I matched to the stack? What if wanted to push an arbitrary string to the stack? When I see an open bracket I really want to push a closing bracket – but how can I?
The trick is to use a look-ahead: find the next occurrence of the character I want to push, and push it:
{(?=.*?(<Stack>}))
Next, when I want to match a closing brace, I already have the right one in stack. Using this approach, here’s a regex that matches tokens with matching balanced parentheses of 3 different typed:
( [^(){}\[\]]+ | \( (?=[^)]* (?<Stack> \) ) ) | \[ (?=[^\]]* (?<Stack> \] ) ) | \{ (?=[^}]* (?<Stack> \} ) ) | \k<Stack> (?<-Stack>) )+? (?(Stack) (?!))
Of course, this approach does have limitations – you may not find the character you’d like to push (which might be a good thing – it allows you to fail early). It also gets much trickier if you want to balance something more complicated than constant strings, but that’s another subject.
It is possible to do this with the regex below, albeit a bit ugly.
This is a matching regex rather than a validation regex, but you can add anchor to turn it into a validation one by adding anchors at the beginning and the end.
(?:
(?>[^(){}\[\]]+)
|
(?<open>
(?=(?<mr>\())(?<mc>)(?<ms>)\(|
(?=(?<mc>\{))(?<mr>)(?<ms>)\{|
(?=(?<ms>\[))(?<mc>)(?<mr>)\[
)
|
(?:
(?<-open>
(?!\k<mc>)\}
|
(?!\k<mr>)\)
|
(?!\k<ms>)\]
)
(?<-mr>)(?<-mc>)(?<-ms>)
)
)+(?(open)(?!))
Since we can't read the top stack, we would have to emulate it by the 3 capturing groups mr
, mc
and ms
. The number of items in mr
, mc
, ms
and open
are always the same. When the stack open
is not empty, only one out of the 3 capturing groups contains the corresponding opening bracket, the other 2 capture an empty string. The non-empty string capturing group is always the type of the bracket at the top of the stack.
This allows us to match the corresponding closing bracket by asserting that the corresponding captured group can't be matched, e.g. (?!\k<mc>)\}
. We know that the group can't be empty (since it has to pass the check on (?<-open>)
first). There are only 2 cases left:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With