
Get all nested curly braces

Is it possible to get all the content inside nested curly braces from a string? For example:

The {quick} brown fox {jumps {over the} lazy} dog

So I need:

  • quick
  • over the
  • jumps {over the} lazy

Preferably in this order, starting from the most deeply nested.

droptheplot asked Apr 27 '13 23:04


2 Answers

Solution

The regex below will allow you to grab the content of all the nested curly braces. Note that this assumes that the nested curly braces are balanced; otherwise, it is hard to define what the answer should be.

(?=\{((?:[^{}]++|\{(?1)\})++)\})

The result will be in capturing group 1.


Note that the order is not the one requested in the question, though. Matches are returned in the order in which the opening curly braces { appear, which means that the content of the outermost pair comes before the content of the pairs nested inside it.
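As a rough illustration of how the pattern might be used from PHP, here is a minimal sketch. It assumes PCRE via preg_match_all and "/" as the delimiter; the variable names are only for illustration and the input is the sample string from the question.

```php
<?php
// Sample input taken from the question.
$str = 'The {quick} brown fox {jumps {over the} lazy} dog';

// The look-ahead pattern from this answer, wrapped in "/" delimiters (an assumption).
$pattern = '/(?=\{((?:[^{}]++|\{(?1)\})++)\})/';

preg_match_all($pattern, $str, $matches);

// $matches[1] holds capturing group 1 of every match,
// ordered by the position of the opening brace.
print_r($matches[1]);
```

For the sample string, $matches[1] should contain "quick", "jumps {over the} lazy" and "over the", in that order.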

Explanation

Ignore the zero-width positive look-ahead (?=pattern) for now and focus on the pattern inside it, which is:

\{((?:[^{}]++|\{(?1)\})++)\}

The part between the two literal curly braces, ((?:[^{}]++|\{(?1)\})++), matches one or more instances of either:

  • a non-empty sequence of non-curly-brace characters, [^{}]++, or
  • a block enclosed by {}, matched recursively through (?1), which may itself contain further non-curly-brace sequences or other blocks.

On its own, that inner part can also match text that contains no {} at all, which we don't need. Therefore, we require a literal curly brace at each end, so that every match is a block enclosed by {}: \{((?:[^{}]++|\{(?1)\})++)\}.

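Since we want the content inside all the nested curly braces, we need to prevent the engine from consuming the text. That's where the zero-width positive look-ahead comes into play: because a look-ahead consumes nothing, the engine can try again at every later opening brace, including those inside a block it has already matched.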
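To see why consuming the text is a problem, here is a small sketch that runs the inner pattern without the look-ahead. It assumes PHP's preg_match_all and "/" delimiters; the variable names are hypothetical.

```php
<?php
// Sample input taken from the question.
$str = 'The {quick} brown fox {jumps {over the} lazy} dog';

// Same pattern as above, but without the surrounding (?=...) look-ahead.
$consuming = '/\{((?:[^{}]++|\{(?1)\})++)\}/';

preg_match_all($consuming, $str, $matches);

// Each match swallows its whole block, so the search resumes after the
// closing brace and never looks inside "{jumps {over the} lazy}".
print_r($matches[1]);
```

Here $matches[1] should contain only "quick" and "jumps {over the} lazy"; the inner "over the" is skipped because it lies inside text that was already consumed.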

It is not very efficient, since the text inside nested braces is matched again for each enclosing level, but I doubt there is any other general regex solution that can handle it efficiently.

Plain code without regex can handle everything efficiently in one pass, and is recommended if you are going to extend your requirements in the future.

nhahtdh answered Oct 03 '22 03:10


A simple solution without regular expressions, in one pass:

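$str = 'The {quick} brown fox {jumps {over the} lazy} dog';

$result = parseCurlyBrace($str);

echo '<pre>' . print_r($result, true) . '</pre>';

// Returns the content of every balanced {...} pair, in the order the closing
// braces appear, so inner blocks come before the blocks that enclose them.
function parseCurlyBrace($str) {

  $length = strlen($str);
  $stack  = array();   // positions of '{' that have not been closed yet
  $result = array();

  for ($i = 0; $i < $length; $i++) {

     if ($str[$i] == '{') {
        $stack[] = $i;
     } elseif ($str[$i] == '}' && !empty($stack)) {
        // Pair this '}' with the most recent unmatched '{'.
        // A stray '}' with nothing open is ignored.
        $open = array_pop($stack);
        $result[] = substr($str, $open + 1, $i - $open - 1);
     }
  }

  return $result;
}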
Frédéric Clausset answered Oct 03 '22 02:10