In a PHP script, what regex should I use to check for mismatched parentheses in a string? Things that I want to allow include:
Things I want to prevent:
Thanks!
Update: You guys all rock. Doing this with a regex seemed trickier than it should have, and these kinds of 2nd level answers are what makes stackoverflow beautiful. Thanks for the links and the pseudocode. I'm not sure who to give the answer to, so I apologize to everyone whose answers I can't accept.
Regex is not the right tool for the job. Scan a string manually.
Pseudo-code:
depth = 0
for character in some_string:
depth += character == '('
depth -= character == ')'
if depth < 0:
break
if depth != 0:
print "unmatched parentheses"
You can do this with a regular expression -- PCRE, as used by PHP, allows recursive patterns. The PHP Manual gives an example that is almost exactly what you want:
\(((?>[^()]+)|(?R))*\)
This matches any correctly parenthesised substring as long as it begins and ends with parentheses. If you want to ensure the entire string is balanced, allowing strings like "wiggedy(wiggedy)(wiggedy(wack))", here's what I came up with:
^((?:[^()]|\((?1)\))*+)$
Here's an explanation of the pattern, which may be more illuminating than obfuscatory:
^ Beginning of the string ( Start the "balanced substring" group (to be called recursively) (?: Start the "minimal balanced substring" group [^()] Minimal balanced substring is either a non-paren character | or \((?1)\) a set of parens containing a balanced substring ) Finish the "minimal balanced substring" group * Our balanced substring is a maximal sequence of minimal balanced substrings + Don't backtrack once we've matched a maximal sequence ) Finish the "balanced substring" pattern $ End of the string
There are lots of considerations of efficiency and correctness that come up with these sorts of regexes. Be careful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With