I need a regular expression to select all the text between two outer brackets.
Example: some text(text here(possible text)text(possible text(more text)))end text
Result: (text here(possible text)text(possible text(more text)))
The way we solve this problem—i.e., the way we match a literal open parenthesis '(' or close parenthesis ')' using a regular expression—is to put backslash-open parenthesis '\(' or backslash-close parenthesis '\)' in the RE. This is another example of an escape sequence.
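As a small illustration of the escaping rule above, here is a sketch using Python's standard `re` module. The sample string and variable names are made up for this example.

```python
import re

text = "some text(text here)end text"

# Escaped parentheses match the literal '(' and ')' characters.
literal_parens = re.findall(r"\(|\)", text)

# With escapes, the match includes the parentheses themselves.
with_parens = re.search(r"\(text here\)", text).group(0)

# Unescaped parentheses only create a capturing group; they match no characters.
without_parens = re.search(r"(text here)", text).group(0)

# re.escape adds the backslash for you when the delimiter comes from a variable.
escaped_open = re.escape("(")
```

Here `with_parens` keeps the surrounding parentheses while `without_parens` does not, which is the difference between the escaped and unescaped forms.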
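Balanced parentheses do not form a regular language, so a pure regular expression cannot match arbitrarily deep nesting. Matching them takes engine extensions such as recursion or balancing groups, or a small hand-written scanner.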
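Because nesting depth has to be counted, a plain loop is often the simplest alternative to a regex. The following is a minimal sketch in Python, not a method taken from any of the answers. The function name `outermost_balanced` and its parameters are hypothetical.

```python
from typing import List


def outermost_balanced(text: str, open_ch: str = "(", close_ch: str = ")") -> List[str]:
    """Return every outermost balanced group, delimiters included."""
    groups = []
    depth = 0    # how many delimiters are currently open
    start = -1   # index where the current outermost group began
    for i, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                start = i
            depth += 1
        elif ch == close_ch and depth > 0:  # stray closers are ignored
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups


example = "some text(text here(possible text)text(possible text(more text)))end text"
result = outermost_balanced(example)
```

For the example from the question, `result` holds the single string `(text here(possible text)text(possible text(more text)))`. A group that is opened but never closed is not returned at all in this sketch.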
[] denotes a character class and () denotes a capturing group. (a-z0-9) captures the literal text a-z0-9; ranges are only interpreted inside a character class.
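The difference is easy to see side by side. This is a short sketch with Python's standard `re` module, and the sample string is invented for the demonstration.

```python
import re

sample = "x9 a-z0-9 q"

# Character class: each match is ONE character from the ranges a-z or 0-9.
class_hits = re.findall(r"[a-z0-9]", sample)

# Capturing group: matches the literal seven-character text "a-z0-9".
group_hits = re.findall(r"(a-z0-9)", sample)
```

`class_hits` contains single letters and digits, while `group_hits` contains only the literal string `a-z0-9`.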
I want to add this answer for quick reference. Feel free to update.
.NET Regex using balancing groups.
\((?>\((?<c>)|[^()]+|\)(?<-c>))*(?(c)(?!))\)
Where c is used as the depth counter.
Demo at Regexstorm.com
PCRE using a recursive pattern.
\((?:[^)(]+|(?R))*+\)
Demo at regex101; Or without alternation:
\((?:[^)(]*(?R)?)*+\)
Demo at regex101; Or unrolled for performance:
\([^)(]*+(?:(?R)[^)(]*)*+\)
Demo at regex101; The pattern is pasted at (?R), which represents (?0).
Perl, PHP, Notepad++, R: perl=TRUE, Python: Regex package with (?V1) for Perl behaviour.
Ruby using subexpression calls.
With Ruby 2.0, \g<0> can be used to call the full pattern.
\((?>[^)(]+|\g<0>)*\)
Demo at Rubular; Ruby 1.9 only supports capturing group recursion:
(\((?>[^)(]+|\g<1>)*\))
Demo at Rubular (atomic grouping since Ruby 1.9.3)
JavaScript API :: XRegExp.matchRecursive
XRegExp.matchRecursive(str, '\\(', '\\)', 'g');
JS, Java, and other regex flavors without recursion, supporting up to 2 levels of nesting:
\((?:[^)(]+|\((?:[^)(]+|\([^)(]*\))*\))*\)
Demo at regex101. Deeper nesting needs to be added to pattern.
To fail faster on unbalanced parentheses, drop the + quantifier.
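Since deeper nesting means repeating the same fragment inside itself, the recursion-free pattern can be generated instead of typed by hand. Below is a sketch in Python with the standard `re` module. `nested_paren_pattern` is a hypothetical helper, not part of any library, and it keeps the + quantifier from the pattern above.

```python
import re


def nested_paren_pattern(levels: int) -> str:
    """Build a recursion-free pattern allowing `levels` levels of nesting
    inside the outer pair of parentheses."""
    pattern = r"\([^)(]*\)"  # innermost pair: no parentheses inside
    for _ in range(levels):
        # Wrap the current pattern in one more pair of parentheses.
        pattern = r"\((?:[^)(]+|" + pattern + r")*\)"
    return pattern


example = "some text(text here(possible text)text(possible text(more text)))end text"
two_levels = nested_paren_pattern(2)
match = re.search(two_levels, example)
outer = match.group(0) if match else None
```

With `levels=2` the helper reproduces the pattern shown above, and `outer` is the full outer group from the question's example. Input nested more deeply than `levels` is not matched from its outermost parenthesis, so the depth still has to be chosen up front.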
Java: An interesting idea using forward references by @jaytea.
Reference - What does this regex mean?