Regex matching closing bracket not in quotes

Tags: .net, regex

I am trying to build a regex for matching strings like

1.) $(Something)
2.) $(SomethingElse, ")")
3.) $(SomethingElse, $(SomethingMore), Bla)
4.) $$(NoMatch) <-- should not match
5.) $$$(ShouldMatch) <-- so basically $$ will produce $

in a text.

EDIT: The words Something, SomethingElse, NoMatch, and ShouldMatch could be any other words; they are names of macros. The strings I am trying to match are "macro calls", which can occur anywhere in a text and should be replaced by their result. I need the regex only for syntax highlighting: a complete macro call should be highlighted. Number 3 is currently not so important. Numbers 1 and 2 are required to work. It's fine if numbers 4 and 5 do not work as written above, as long as any $( after a $ does not match.

Currently I have

(?<!\$)+\$\(([^)]*)\)

This matches any $( that has no leading $, which would be acceptable if I cannot find another way to handle the $$ structure.

The next step I would like to get done is to ignore the closing bracket if it is in quotes. How could I achieve this?

EDIT: So if I have an input like

Some text, doesn't matter what. And a $(MyMacro, ")") which will be replaced.

the complete '$(MyMacro, ")")' should get highlighted.

I already have this expression

"(?:\\\\|\\"|[^"])*"

for quoted strings, including escaped quotes, but I don't know how to apply it so that everything between the quotes is ignored.

P.S. I am using .NET to apply the regular expressions, so balancing groups are supported. I just don't know how to put all of this together.

Daniel Bişar asked Mar 06 '13 16:03


1 Answer

You can use an expression like this (written in extended mode, so whitespace and # comments in the pattern are ignored):

(?<! \$ )                     # not preceded by $
\$ (?: \$\$ )?                # $ or $$$
\(                            # opening (

(?>                           # non-backtracking atomic group
  (?>                         # non-backtracking atomic group
    [^"'()]+                  # literals, spaces, etc
  | " (?: [^"\\]+ | \\. )* "  # double quoted string with escapes
  | ' (?: [^'\\]+ | \\. )* '  # single quoted string with escapes
  | (?<open>       \( )       # open += 1
  | (?<close-open> \) )       # open -= 1, only if open > 0 (balancing group)
  )*
)

(?(open) (?!) )               # fail if open > 0

\)                            # final )

The pattern can be used with its whitespace and comments as shown above. For example, in C#, as a verbatim string with the double quotes doubled:

var regex = new Regex(@"(?x)    # enable eXtended mode (ignore spaces, comments)
(?<! \$ )                       # not preceded by $
\$ (?: \$\$ )?                  # $ or $$$
\(                              # opening (

(?>                             # non-backtracking atomic group
  (?>                           # non-backtracking atomic group
    [^""'()]+                   # literals, spaces, etc
  | "" (?: [^""\\]+ | \\. )* "" # double quoted string with escapes
  | '  (?: [^'\\]+ | \\. )*  '  # single quoted string with escapes
  | (?<open>       \( )         # open += 1
  | (?<close-open> \) )         # open -= 1, only if open > 0 (balancing group)
  )*
)

(?(open) (?!) )                 # fail if open > 0

\)                              # final )
");
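To check the expression against the examples from the question, a small console program along these lines could be used. This is only a sketch: the class name MacroHighlighter, the field name MacroCall, and the sample array are made up for illustration, and the pattern assumes the optional "$$" prefix (that is, with the trailing ? on that group).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class MacroHighlighter
{
    public static readonly Regex MacroCall = new Regex(@"(?x)
(?<! \$ )                       # not preceded by $
\$ (?: \$\$ )?                  # $ or $$$
\(                              # opening (
(?>
  (?>
    [^""'()]+                   # literals, spaces, etc
  | "" (?: [^""\\]+ | \\. )* "" # double quoted string with escapes
  | '  (?: [^'\\]+ | \\. )*  '  # single quoted string with escapes
  | (?<open>       \( )         # push a nested opening bracket
  | (?<close-open> \) )         # pop it again, only if one is open
  )*
)
(?(open) (?!) )                 # fail if a nested bracket is still open
\)                              # final )
");

    public static void Main()
    {
        // The five examples from the question, plus the sentence from the EDIT.
        string[] samples =
        {
            "$(Something)",
            "$(SomethingElse, \")\")",
            "$(SomethingElse, $(SomethingMore), Bla)",
            "$$(NoMatch)",
            "$$$(ShouldMatch)",
            "Some text, doesn't matter what. And a $(MyMacro, \")\") which will be replaced."
        };

        foreach (string sample in samples)
        {
            Match m = MacroCall.Match(sample);
            // Print the highlighted span, or a marker when nothing matches.
            Console.WriteLine("{0} => {1}", sample, m.Success ? m.Value : "(no match)");
        }
    }
}
```

With the pattern as written, examples 1, 2, 3 and 5 should match in full, example 4 should not match at all, and in the last sample only the $(MyMacro, ")") part should be reported. For highlighting a whole document, MacroCall.Matches(text) gives every call along with its Index and Length.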
Qtax answered Oct 24 '22 14:10