Solving regular expression recursive strings

The Problem

I could match this string

(xx)

using this regex

\([^()]*\)

But it wouldn't match

(x(xx)x)

So, this regex would

\([^()]*\([^()]*\)[^()]*\)

However, this would fail to match

(x(x(xx)x)x)

But again, this new regex would

\([^()]*\([^()]*\([^()]*\)[^()]*\)[^()]*\)

This is where the replication becomes visible: everything in the second regex between the first \( and the last \) is copied and substituted for the centermost [^()]*. Of course, this last regex wouldn't match

(x(x(x(xx)x)x)x)

But you could always replace the centermost [^()]* with [^()]*\([^()]*\)[^()]* again, as we did for the last regex, and it will handle one more level of nesting. The more you add to the regex, the deeper it can go, but it will always be limited to the depth you wrote out by hand.
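The copy-and-replace step described above can be automated, which makes the limitation easy to see. This is only a minimal sketch: buildNestedParenPattern is a hypothetical helper name, not part of any library, and it assumes exactly one nested pair per level, as in the examples above.

```javascript
// Hypothetical helper: builds the hand-expanded pattern for a fixed nesting depth.
function buildNestedParenPattern(depth) {
  const filler = "[^()]*";
  // Depth 1 corresponds to the original pattern \([^()]*\).
  let inner = filler;
  for (let level = 1; level < depth; level++) {
    // Each pass adds one more \( ... \) pair around the previous inner pattern,
    // which yields the same string as replacing the centermost filler by hand.
    inner = filler + "\\(" + inner + "\\)" + filler;
  }
  return new RegExp("\\(" + inner + "\\)");
}

const depth3 = buildNestedParenPattern(3);
const whole = "(x(x(xx)x)x)";
const matchesWhole = depth3.exec(whole)[0] === whole; // true

const deeper = "(x(x(x(xx)x)x)x)";
// The depth-3 pattern only finds an inner part of the deeper string.
const matchesDeeper = depth3.exec(deeper)[0] === deeper; // false
```

Whatever depth is passed in, a string nested one level deeper will not be matched as a whole, so this only moves the limit rather than removing it.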

So, how do you get around this limitation and match a pair of parentheses (or any two delimiter characters, for that matter) that can contain further pairs nested within it?

Falsely Assumed Solutions

I know you might think to just use

\(.*\)

But this will match all of

(xx)xx)

when it should only match the sub-string (xx).

Even this

\([^)]*\)

will not correctly match a pair of parentheses that has another pair nested inside it, like

(xx(xx)xx)

Here it matches only the prefix (xx(xx).
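Both failures are easy to reproduce in JavaScript. The snippet below is just an illustration of the two patterns above; the variable names are made up for the example.

```javascript
// The greedy pattern runs to the LAST closing parenthesis in the string.
const greedy = /\(.*\)/;
const greedyResult = "(xx)xx)".match(greedy)[0];
// greedyResult is "(xx)xx)", not "(xx)"

// Excluding only ")" stops at the FIRST closing parenthesis,
// even though an inner "(" was opened before it.
const stopAtFirstClose = /\([^)]*\)/;
const firstCloseResult = "(xx(xx)xx)".match(stopAtFirstClose)[0];
// firstCloseResult is "(xx(xx)"
```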

Is it possible?

So is it possible to write a regex that can match groups of parentheses? Or is this something that must be handled by a routine?

Edit

The solution must work in the JavaScript implementation of regular expressions.

Sam asked Dec 15 '12 04:12


1 Answer

If you want to match only when the parentheses are balanced, you cannot do it with a regex alone.

A better way would be to:

1> Match the string using \(.*\)

2> Count the number of ( and ) characters in the match and check whether they are equal. If they are, you have your match.

3> If they are not equal, use \([^()]*\) to match the required string.
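The three steps above could be sketched in JavaScript roughly as follows. The function name matchParens is an assumption made for this example. Note that equal counts are a heuristic rather than a proof of balance: a string such as ")(" has one of each and is still not balanced.

```javascript
// Hypothetical helper following the three steps above.
function matchParens(text) {
  // Step 1: greedy match from the first "(" to the last ")".
  const greedy = text.match(/\(.*\)/);
  if (greedy === null) return null;
  const candidate = greedy[0];

  // Step 2: count opening and closing parentheses in the candidate.
  const opens = (candidate.match(/\(/g) || []).length;
  const closes = (candidate.match(/\)/g) || []).length;
  if (opens === closes) return candidate;

  // Step 3: counts differ, so fall back to an innermost, non-nested pair.
  const flat = text.match(/\([^()]*\)/);
  return flat === null ? null : flat[0];
}

const nested = matchParens("(x(xx)x)");   // "(x(xx)x)"
const unbalanced = matchParens("(xx)xx)"); // "(xx)"
```

Since . does not match line breaks by default in JavaScript, this sketch assumes the input is a single line.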

Anirudha answered Oct 19 '22 03:10