The string I am working with looks like this:
abc {def ghi {jkl mno} pqr stv} xy z
I need to wrap whatever the curly braces contain in tags, so that the result looks like this:
abc <tag>def ghi <tag>jkl mno</tag> pqr stv</tag> xy z
I’ve tried
'#(?<!\pL)\{ ( ([^{}]+) | (?R) )* \}(?!\pL)#xu'
but all I get is <tag>xy z</tag>. Please help: what am I doing wrong?
Nested structures are, by definition, beyond what classical regular expressions can describe (yes, PCRE supports recursion, but that does not help with this replacement problem). If you want to use regular expressions anyway, there are two options. First, you could simply replace every opening brace with an opening tag and every closing brace with a closing tag. This, however, will convert unmatched braces as well:
$str = preg_replace('/\{/', '<tag>', $str);
$str = preg_replace('/\}/', '</tag>', $str);
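To make the caveat about unmatched braces concrete, here is a minimal sketch of the two calls above wrapped in a function. The name replace_all_braces is made up for this illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (name chosen for this example): replaces every brace,
// whether or not it has a partner.
function replace_all_braces(string $str): string
{
    $str = preg_replace('/\{/', '<tag>', $str);
    return preg_replace('/\}/', '</tag>', $str);
}

echo replace_all_braces('abc {def ghi {jkl mno} pqr stv} xy z'), "\n";
// abc <tag>def ghi <tag>jkl mno</tag> pqr stv</tag> xy z

echo replace_all_braces('a } b { c'), "\n";
// a </tag> b <tag> c   (unmatched braces are converted too)
```

If the input is guaranteed to be balanced, this is probably the simplest approach; otherwise the stray tags in the second output are the price.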
Another option is to replace only matching { and } pairs, but then you have to do it repeatedly, because a single call to preg_replace cannot replace multiple nested levels:
do
{
    // [^{}]* matches only brace-free content, so each pass wraps the innermost pairs
    $str = preg_replace('/\{([^{}]*)\}/', '<tag>$1</tag>', $str, -1, $count);
}
while ($count > 0);
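The loop can be packaged as a small function to see how it behaves on both balanced and unbalanced input. This is only a sketch; wrap_balanced_braces is a name invented for the example.

```php
<?php
// Hypothetical helper (name chosen for this example): wraps balanced {...}
// pairs from the inside out and leaves unmatched braces untouched.
function wrap_balanced_braces(string $str): string
{
    do {
        // Each pass replaces every pair whose content has no braces left in it.
        $str = preg_replace('/\{([^{}]*)\}/', '<tag>$1</tag>', $str, -1, $count);
    } while ($count > 0);

    return $str;
}

echo wrap_balanced_braces('abc {def ghi {jkl mno} pqr stv} xy z'), "\n";
// abc <tag>def ghi <tag>jkl mno</tag> pqr stv</tag> xy z

echo wrap_balanced_braces('a } b {c} d {'), "\n";
// a } b <tag>c</tag> d {
```

The number of passes is the nesting depth plus one (the final pass finds nothing and ends the loop).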
EDIT: While PCRE supports recursion with (?R), this will most likely not help with a replacement. The reason is that, if a capturing group is repeated, its reference will only contain the last captured value (for example, when matching /(a|b)+/ against aaaab, $1 will contain b). I suppose that the same holds for recursion. That is why you can only replace the innermost match: it is the last match of the capturing group within the recursion. Likewise, you could not capture { and } with recursion and replace those, because they might also be matched an arbitrary number of times, and only the last match would be replaced.
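The behaviour of a repeated capturing group is easy to check directly. A short sketch, where $m is just the usual matches array filled by preg_match:

```php
<?php
// A repeated group keeps only the text from its final iteration.
preg_match('/(a|b)+/', 'aaaab', $m);

echo $m[0], "\n"; // aaaab  (the whole match)
echo $m[1], "\n"; // b      (only the last iteration of the group)
```

This only demonstrates the repetition case; whether recursion behaves the same way remains the assumption stated above.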
Just matching a correctly nested structure and then replacing the innermost or outermost matching braces will not help either (with a single preg_replace call), because matches never overlap (so if three nested pairs of braces have been matched as one unit, the two inner pairs are not considered for further matches).
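The non-overlapping behaviour can be seen with a recursive pattern and preg_match_all. The pattern below is a sketch written for this illustration, not the one from the question; the sample string has an extra {q} appended to show a second top-level match.

```php
<?php
// Matches one balanced {...} group; (?R) re-enters the whole pattern for nested groups.
// The possessive [^{}]++ keeps backtracking in check.
$pattern = '/\{(?:[^{}]++|(?R))*\}/';

preg_match_all($pattern, 'abc {def ghi {jkl mno} pqr stv} xy z {q}', $m);

print_r($m[0]);
// Two matches: "{def ghi {jkl mno} pqr stv}" and "{q}".
// The inner "{jkl mno}" is consumed by the first match and never reported on its own.
```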
How about two steps:
s!{!<tag>!g;
s!}!</tag>!g;
(Perl syntax; translate to your language as appropriate.)
or maybe this:
1 while s!{([^{}]*)}!<tag>$1</tag>!g;