I have strings with placeholders like "{variant 1|variant 2}", where "|" means "or", and I want to get every variant of the string with the placeholders resolved. For example, the string "{a|b{c|d}}" should give the strings "a", "bc" and "bd".
I tried to do this recursively with the regular expression \{([^{}])\} (it matches the innermost level, in my case {c|d}), but after the first step I have two strings, {a|bc} and {a|bd}, which produce "a", "bc", "a", "bd". Maybe I need to build some kind of graph or tree structure?
I also want to ask about (?<postfix>[^{}|$]*)
Why is there a "$"? I removed it and it had no effect.
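On the "$": inside a character class it is a literal dollar sign, not an end-of-string anchor, so removing it only matters when the text itself contains a "$". A small check, with pattern and variable names made up for illustration:

```php
<?php
// Inside [...] the "$" is a literal dollar sign, not an anchor.
// Both patterns are anchored with ^ and a trailing $ (outside the class).
$withDollar    = '/^[^{}|$]*$/'; // rejects "{", "}", "|" and a literal "$"
$withoutDollar = '/^[^{}|]*$/';  // rejects only "{", "}" and "|"

// Without a "$" in the subject, both patterns behave the same.
var_dump(preg_match($withDollar, 'abc'));    // int(1)
var_dump(preg_match($withoutDollar, 'abc')); // int(1)

// With a "$" in the subject, only the second pattern still matches.
var_dump(preg_match($withDollar, 'a$c'));    // int(0)
var_dump(preg_match($withoutDollar, 'a$c')); // int(1)
```

So with test strings that contain no dollar sign, dropping the "$" makes no visible difference.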
Assuming that |{} are reserved characters (not allowed as content of your variants), the following is a regex approach to the problem. Please note that writing a simple state machine parser would be the better choice.
<?php // Using PHP5.3 syntax
// PCRE Recursive Pattern
// http://php.net/manual/en/regexp.reference.recursive.php
$string = "This test can be {very {cool|bad} in random order|or be just text} ddd {a|b{c|d}} bar {a|b{c{d|e|f}}} lala {b|c} baz";
if (preg_match_all('#\{((?>[^{}]+)|(?R))+\}#', $string, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        // $match[0] is e.g. "{a|b{c|d}}", "{a|b{c{d|e|f}}}" or "{b|c}"
        // have some fun splitting them up
        // I'd suggest walking the characters and building a tree
        // a simpler (slower, uglier) approach:
        // remove the outer {}
        $set = substr($match[0], 1, -1);
        while (strpos($set, '{') !== false) {
            // explode and replace nested {}
            // reserved characters: "{" and "}" and "|"
            // (?<=^|\{|\|) -- the match must start at the beginning of the string or right after "{" or "|",
            //   "?<=" is a positive lookbehind assertion - its content is not captured
            // (?<prefix>[^{}|]*) -- the prefix, a preceding literal string (anything but reserved characters)
            // \{(?<inner>[^{}]+)\} -- the content of an innermost {} group, excluding the "{" and "}"
            // (?<postfix>[^{}|$]*) -- the postfix, a trailing literal string (anything but reserved characters;
            //   inside a character class "$" is a literal dollar sign, not an anchor)
            // readable: <begin-delimiter><possible-prefix>{<nested-group>}<possible-postfix>
            $set = preg_replace_callback('#(?<=^|\{|\|)(?<prefix>[^{}|]*)\{(?<inner>[^{}]+)\}(?<postfix>[^{}|$]*)#', function($m) {
                $inner = explode('|', $m['inner']);
                // implode() takes the separator first; the legacy (array, separator) order was removed in PHP 8
                return $m['prefix'] . implode($m['postfix'] . '|' . $m['prefix'], $inner) . $m['postfix'];
            }, $set);
        }
        // $items = explode('|', $set);
        echo "$match[0] expands to {{$set}}\n";
    }
}
/*
OUTPUT:
{very {cool|bad} in random order|or be just text} expands to {very cool in random order|very bad in random order|or be just text}
{a|b{c|d}} expands to {a|bc|bd}
{a|b{c{d|e|f}}} expands to {a|bcd|bce|bcf}
{b|c} expands to {b|c}
*/
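For the "walk the characters" suggestion, here is a minimal sketch of a small recursive parser. It is one possible reading of that advice, not the code from the answer above, and the function names (expandVariants, parseSequence, parseGroup) are made up for illustration. It assumes "{", "}" and "|" are reserved and that the input is a single string whose groups are balanced.

```php
<?php
// Hypothetical helpers: a sketch under the assumptions stated above.

/** Expand every {a|b} group in $input and return all resulting strings. */
function expandVariants(string $input): array
{
    $pos = 0;
    $result = parseSequence($input, $pos);
    if ($pos < strlen($input)) {
        // A stray "|" or "}" outside of any group ends up here.
        throw new InvalidArgumentException("Unexpected '{$input[$pos]}' at offset $pos");
    }
    return $result;
}

/** Parse literals and groups until "|", "}" or the end of the input. */
function parseSequence(string $s, int &$pos): array
{
    $variants = [''];
    $len = strlen($s);
    while ($pos < $len && $s[$pos] !== '|' && $s[$pos] !== '}') {
        if ($s[$pos] === '{') {
            $options = parseGroup($s, $pos);
        } else {
            // Consume a run of non-reserved characters as one literal.
            $n = strcspn($s, '{|}', $pos);
            $options = [substr($s, $pos, $n)];
            $pos += $n;
        }
        // Append every option to every variant built so far.
        $next = [];
        foreach ($variants as $head) {
            foreach ($options as $tail) {
                $next[] = $head . $tail;
            }
        }
        $variants = $next;
    }
    return $variants;
}

/** Parse "{alt|alt|...}" and return the expansions of all alternatives. */
function parseGroup(string $s, int &$pos): array
{
    $pos++; // skip "{"
    $options = [];
    do {
        $options = array_merge($options, parseSequence($s, $pos));
        $more = $pos < strlen($s) && $s[$pos] === '|';
        if ($more) {
            $pos++; // skip "|"
        }
    } while ($more);
    if ($pos >= strlen($s) || $s[$pos] !== '}') {
        throw new InvalidArgumentException("Missing '}' at offset $pos");
    }
    $pos++; // skip "}"
    return $options;
}

print_r(expandVariants('{a|b{c|d}}')); // a, bc, bd
```

Because each alternative is expanded where it is parsed, "a" is produced only once, which avoids the duplicated "a" from the question's substitute-and-repeat approach. Text outside the braces is kept, so 'x{b|c}y' yields "xby" and "xcy".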