
Using RegEx to match balanced parentheses

Tags: c#, .net, regex

I am trying to create a .NET regular expression that properly balances parentheses. I have the following expression:

func([a-zA-Z_][a-zA-Z0-9_]*)\(.*\) 

The string I am trying to match is this:

"test -> funcPow((3),2) * (9+1)" 

The regex should match everything from funcPow through the second closing parenthesis and stop there. Instead, it matches all the way to the very last closing parenthesis. It returns this:

"funcPow((3),2) * (9+1)" 

It should return this:

"funcPow((3),2)" 

Any help on this would be appreciated.

Icemanind asked Oct 26 '11 03:10




1 Answer

.NET regular expressions can definitely match balanced parentheses. It can be tricky, and it requires a couple of the more advanced regex features (balancing groups and conditional expressions), but it's not too hard.

Example:

var r = new Regex(@"
    func([a-zA-Z_][a-zA-Z0-9_]*) # The func name
    \(                      # First '('
        (?:
        [^()]               # Match all non-braces
        |
        (?<open> \( )       # Match '(', and capture into 'open'
        |
        (?<-open> \) )      # Match ')', and delete the 'open' capture
        )+
        (?(open)(?!))       # Fails if 'open' stack isn't empty!
    \)                      # Last ')'
", RegexOptions.IgnorePatternWhitespace);

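Balancing groups have a couple of features, but for this example, we're only using the capture-deleting feature. The line (?<-open> \) ) matches a ) and deletes the most recent "open" capture. If there is no "open" capture left to delete, that alternative fails to match, which is why the loop stops before an unmatched ).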
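As a rough sketch of how this plays out on the string from the question, the pattern can be wrapped in a small helper. The class name BalancedCall and the method FirstCall are hypothetical names chosen for illustration; only Regex, Match, and RegexOptions.IgnorePatternWhitespace come from System.Text.RegularExpressions.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class BalancedCall
{
    // Same pattern as in the answer. Each '#' comment runs to the end of its
    // line, so the pattern must stay on multiple lines in the source.
    private static readonly Regex CallPattern = new Regex(@"
        func([a-zA-Z_][a-zA-Z0-9_]*) # The func name
        \(                      # First '('
            (?:
            [^()]               # Any character that is not a parenthesis
            |
            (?<open> \( )       # Match '(' and push a capture onto 'open'
            |
            (?<-open> \) )      # Match ')' and pop one capture from 'open'
            )+
            (?(open)(?!))       # Fail if 'open' still holds a capture
        \)                      # Last ')'
    ", RegexOptions.IgnorePatternWhitespace);

    // Returns the first matched call, or an empty string if nothing matches.
    public static string FirstCall(string input)
    {
        Match m = CallPattern.Match(input);
        return m.Success ? m.Value : string.Empty;
    }
}
```

Under these assumptions, BalancedCall.FirstCall("test -> funcPow((3),2) * (9+1)") should return funcPow((3),2). After consuming "(3),2" the "open" stack is empty, so the pop alternative cannot take the next ")" and the loop ends there, leaving that ")" for the final \).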

The trickiest line is (?(open)(?!)), so let me explain it. (?(open) ... ) is a conditional expression: the part after (open) is only attempted if there is an "open" capture. (?!) is an empty negative lookahead, which always fails. Therefore, (?(open)(?!)) says "if there is an open capture, then fail".
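To see what the conditional contributes, here is a minimal sketch comparing the pattern with and without it on an input that has an unclosed inner parenthesis. The class name ConditionalDemo, its members, and the test input funcA((1) are all made up for illustration. The patterns are written compactly, without IgnorePatternWhitespace.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class for illustration; not part of any library.
public static class ConditionalDemo
{
    // Pattern from the answer, including the (?(open)(?!)) check.
    private static readonly Regex Strict = new Regex(
        @"func([a-zA-Z_][a-zA-Z0-9_]*)\((?:[^()]|(?<open>\()|(?<-open>\)))+(?(open)(?!))\)");

    // Same pattern with the conditional removed.
    private static readonly Regex Loose = new Regex(
        @"func([a-zA-Z_][a-zA-Z0-9_]*)\((?:[^()]|(?<open>\()|(?<-open>\)))+\)");

    // True if the strict pattern finds a call in the input.
    public static bool StrictMatches(string input)
    {
        return Strict.IsMatch(input);
    }

    // Text matched by the loose pattern, or an empty string if none.
    public static string LooseMatch(string input)
    {
        Match m = Loose.Match(input);
        return m.Success ? m.Value : string.Empty;
    }
}
```

For the input funcA((1), the loose pattern backtracks until the inner "(" is still on the "open" stack and then lets the final \) consume the only ")", so LooseMatch should return the unbalanced text funcA((1). With the conditional in place that backtracking position is rejected, so StrictMatches should return false.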

Microsoft's documentation was pretty helpful too.

7 revs, 2 users 94% answered Oct 22 '22 17:10