I started working on a code parser two days ago and I'm stuck.
How can I split a string on commas that are not inside brackets? Let me show you what I mean.
I have this string to parse:
one, two, three, (four, (five, six), (ten)), seven
I would like to get this result:
array(
"one";
"two";
"three";
"(four, (five, six), (ten))";
"seven"
)
but instead I get:
array(
"one";
"two";
"three";
"(four";
"(five";
"six)";
"(ten))";
"seven"
)
How can I do this with a regular expression in PHP?
Thank you in advance!
You can do that more easily, as long as the parentheses are not nested:
preg_match_all('/[^(,\s]+|\([^)]+\)/', $str, $matches);
But it would be better to use a real parser, which also handles nesting. Maybe something like this:
$str = 'one, two, three, (four, (five, six), (ten)), seven';
$buffer = '';      // characters of the item being built
$stack = array();  // completed items
$depth = 0;        // current parenthesis nesting level
$len = strlen($str);
for ($i = 0; $i < $len; $i++) {
    $char = $str[$i];
    switch ($char) {
        case '(':
            $depth++;
            break;
        case ',':
            // A comma at depth 0 ends the current item.
            if (!$depth) {
                if ($buffer !== '') {
                    $stack[] = $buffer;
                    $buffer = '';
                }
                continue 2;
            }
            break;
        case ' ':
            // Drop spaces outside parentheses.
            if (!$depth) {
                continue 2;
            }
            break;
        case ')':
            if ($depth) {
                $depth--;
            } else {
                // Unmatched ')': close the current item here.
                $stack[] = $buffer.$char;
                $buffer = '';
                continue 2;
            }
            break;
    }
    $buffer .= $char;
}
if ($buffer !== '') {
    $stack[] = $buffer;
}
var_dump($stack);
Hm... OK, this is already marked as answered, but since you asked for an easy solution I will try nevertheless (this handles non-nested parentheses only):
$test = "one, two, three, , , ,(four, five, six), seven, (eight, nine)";
$split = "/([(].*?[)])|(\w)+/";
preg_match_all($split, $test, $out);
print_r($out[0]);
Output:
Array
(
[0] => one
[1] => two
[2] => three
[3] => (four, five, six)
[4] => seven
[5] => (eight, nine)
)
You can't, directly. You'd need, at minimum, variable-width lookbehind, and last I knew PHP's PCRE only has fixed-width lookbehind.
My first recommendation would be to first extract parenthesized expressions from the string. I don't know anything about your actual problem, though, so I don't know if that will be feasible.
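As a hedged counterpoint to the lookbehind remark: PCRE, which backs PHP's preg_* functions, supports recursive subpattern calls such as (?1), and these can match balanced parentheses without any lookbehind. The sketch below is only an illustration. The function name split_top_level_match is hypothetical, and it assumes the parentheses are balanced and that items outside parentheses contain no spaces.

```php
<?php
// Hypothetical helper (not from the answers above): collects top-level items
// by matching either a balanced parenthesized group or a plain token.
function split_top_level_match(string $str): array
{
    // Group 1 matches "(", then any mix of non-parenthesis runs or a
    // recursive call to group 1 itself, then ")".
    // The second alternative matches a run of characters that are not
    // parentheses, commas, or whitespace.
    $pattern = '/(\((?:[^()]++|(?1))*\))|[^(),\s]+/';
    preg_match_all($pattern, $str, $matches);
    return $matches[0];
}

print_r(split_top_level_match('one, two, three, (four, (five, six), (ten)), seven'));
```

Because the second alternative stops at whitespace, an item such as "some text" outside parentheses would come back as two tokens, so this fits the question's input better than free-form text.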
I can't think of a way to do it using a single regex, but it's quite easy to hack together something that works:
function process($data)
{
    $entries = array();
    $filteredData = $data;
    // Pull out each (non-nested) parenthesized group and replace it with a placeholder.
    if (preg_match_all("/\(([^)]*)\)/", $data, $matches)) {
        $entries = $matches[0];
        $filteredData = preg_replace("/\(([^)]*)\)/", "-placeholder-", $data);
    }
    // With the groups hidden, a plain comma split is safe.
    $arr = array_map("trim", explode(",", $filteredData));
    if (!$entries) {
        return $arr;
    }
    // Put the original groups back in order.
    $j = 0;
    foreach ($arr as $i => $entry) {
        if ($entry != "-placeholder-") {
            continue;
        }
        $arr[$i] = $entries[$j];
        $j++;
    }
    return $arr;
}
If you invoke it like this:
$data = "one, two, three, (four, five, six), seven, (eight, nine)";
print_r(process($data));
It outputs:
Array
(
[0] => one
[1] => two
[2] => three
[3] => (four, five, six)
[4] => seven
[5] => (eight, nine)
)
Clumsy, but it does the job...
<?php
function split_by_commas($string) {
    // Find each parenthesized group (lazy match, so nesting is not handled).
    preg_match_all("/\(.+?\)/", $string, $result);
    $problem_children = $result[0];
    $i = 0;
    $temp = array();
    // Swap each group for a unique marker such as __0__.
    foreach ($problem_children as $submatch) {
        $marker = '__'.$i++.'__';
        $temp[$marker] = $submatch;
        $string = str_replace($submatch, $marker, $string);
    }
    $result = explode(",", $string);
    // Trim each piece and restore the group where a marker stands.
    foreach ($result as $key => $item) {
        $item = trim($item);
        $result[$key] = isset($temp[$item]) ? $temp[$item] : $item;
    }
    return $result;
}
$test = "one, two, three, (four, five, six), seven, (eight, nine), ten";
print_r(split_by_commas($test));
?>
I feel it's worth noting that you should avoid regular expressions whenever you possibly can. To that end, you should know that in PHP 5.3+ you can use str_getcsv(). However, if you're working with files (or file streams), such as CSV files, then fgetcsv() might be what you need, and it has been available since PHP 4.
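A minimal sketch of what str_getcsv() does, for reference. Note that it takes a single enclosure character that acts as both opener and closer, so it suits input where grouped items are quoted rather than wrapped in parentheses. The sample line below is an assumption, not the asker's data.

```php
<?php
// Assumed sample input: the grouped item is double-quoted, not parenthesized.
$line = 'one,two,"four,five",seven';

// Separator, enclosure, and escape characters are passed explicitly.
$fields = str_getcsv($line, ',', '"', '\\');

print_r($fields);
```

The comma inside the quoted field is kept, and the surrounding quotes are removed from the result.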
Lastly, I'm surprised nobody used preg_split(), or did it not work as needed?
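On the preg_split() question, here is one possible approach, sketched under assumptions: let the pattern first consume any balanced parenthesized group and discard it with the PCRE backtracking verbs (*SKIP)(*FAIL), so that only commas outside parentheses are left to act as delimiters. The function name split_outside_parens is hypothetical, and the sketch assumes balanced parentheses.

```php
<?php
// Hypothetical helper: splits on commas that sit outside balanced parentheses.
function split_outside_parens(string $str): array
{
    // First alternative: match a balanced group (recursing via (?1)), then
    // (*SKIP)(*FAIL) rejects it and resumes scanning after the group, so no
    // comma inside it is ever used as a delimiter.
    // Second alternative: the actual delimiter, a comma with optional
    // surrounding whitespace.
    $pattern = '/(\((?:[^()]++|(?1))*\))(*SKIP)(*FAIL)|\s*,\s*/';
    $parts = preg_split($pattern, $str);
    // preg_split() returns false on error; fall back to an empty array.
    return $parts === false ? [] : $parts;
}

print_r(split_outside_parens('one, two, three, (four, (five, six), (ten)), seven'));
```

Unlike a token-matching pattern, this keeps spaces inside items, so "Some other text with ((95,3%) MSC)" stays in one piece.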
Maybe a bit late, but I've made a solution without regex which also supports nesting inside brackets. Let me know what you think:
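$str = "Some text, Some other text with ((95,3%) MSC)";
$arr = explode(",", $str);
$parts = [];
$currentPart = "";
$bracketsOpened = 0;
foreach ($arr as $part) {
    // Re-insert the comma that explode() removed when we are inside brackets.
    $currentPart .= ($bracketsOpened > 0 ? ',' : '').$part;
    // Count every bracket in the piece, not just whether one is present,
    // so that pieces such as "((95" adjust the depth correctly.
    $bracketsOpened += substr_count($part, "(") - substr_count($part, ")");
    if ($bracketsOpened <= 0) {
        $parts[] = trim($currentPart);
        $currentPart = '';
        $bracketsOpened = 0;
    }
}
print_r($parts);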
Gives me the output:
Array
(
[0] => Some text
[1] => Some other text with ((95,3%) MSC)
)