Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Recursive nested matching pairs of curly braces in Ruby regex

Tags:

regex

ruby

I have the following string:

The {quick} brown fox {jumps {over {deep} the} {sfsdf0} lazy} dog {sdfsdf1 {sdfsdf2}

And PHP Regular Expression:

/(?=\{((?:[^{}]+|\{(?1)\})+)\})/g

It yields the following matches:

[5-10]  `quick`
[23-60] `jumps {over {deep} the} {sfsdf} lazy`
[30-45] `over {deep} the`
[36-40] `deep`
[48-54] `sfsdf0`
[76-83] `sdfsdf2`

See: http://regex101.com/r/fD3iZ2.

I'm trying to get the equivalent working in Ruby, but I'm having a problem with (?1)… resulting in an undefined group option error:

str = "The {quick} brown fox {jumps {over {deep} the} {sfsdf} lazy} dog {sdfsdf {sdfsdf}"
str.scan /(?=\{((?:[^{}]+|\{(?1)\})+)\})/

SyntaxError: undefined group option: /(?=\{((?:[^{}]+|\{(?1)\})+)\})/

See: http://fiddle.re/n6w4n.

Coincidently, I get the same sort of error in Javascript and Python.

My regex foo is nearly exhausted today, any help very much appreciated.

like image 479
Dom Avatar asked Oct 21 '13 05:10

Dom


1 Answers

Ruby uses a different syntax for recursion: \g<1> replaces (?1). So try

(?=\{((?:[^{}]++|\{\g<1>\})++)\})

I also made the quantifiers possessive to avoid excessive backtracking in case of unbalanced braces.

irb(main):003:0> result = str.scan(/(?=\{((?:[^{}]++|\{\g<1>\})++)\})/)
=> [["quick"], ["jumps {over {deep} the} {sfsdf} lazy"], ["over {deep} the"], 
["deep"], ["sfsdf"], ["sdfsdf"]]
like image 103
Tim Pietzcker Avatar answered Oct 13 '22 00:10

Tim Pietzcker