
Regex to match comma not between grouping symbols

I need a regular expression that will match a comma that is NOT between either a '[' and ']' or '(' and ')' or '{' and '}'. Other grouping symbols do not matter. I have tried to figure it out but I cannot come up with anything that accomplishes this.

The regex is to be used with the PHP preg_split function to split a string on the matched commas.

An example string containing commas and grouping symbols:

<div>Hello<div>,@func[opt1,opt2],{,test},blahblah

The string should split up as follows:

1: '<div>Hello<div>'
2: '@func[opt1,opt2]'
3: '{,test}'
4: 'blahblah'

I just thought of this: at this point all grouping symbols are guaranteed to have matching symbols, in case that helps.

Any help would be GREATLY appreciated =)

GotCake asked May 26 '11 01:05


2 Answers

Actually, it is possible to do this split with a regular expression. Consider this code:

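$str = '<div>Hello<div>,(foo,bar),@func[opt1,opt2],{,test},blahblah';
// Any field containing a bracketed group is matched as a delimiter and captured; bare commas are plain delimiters.
// PREG_SPLIT_DELIM_CAPTURE returns the captured fields; PREG_SPLIT_NO_EMPTY drops the empty pieces between them.
$arr = preg_split('~([^,]*(?:{[^}]*}|\([^)]*\)|\[[^]]*])[^,]*)+|,~', $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
var_dump($arr);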

OUTPUT:

array(5) {
  [0]=>
  string(15) "<div>Hello<div>"
  [1]=>
  string(9) "(foo,bar)"
  [2]=>
  string(16) "@func[opt1,opt2]"
  [3]=>
  string(7) "{,test}"
  [4]=>
  string(8) "blahblah"
}
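
For readers trying to follow how the pattern works, here is the same expression written out with the `x` (extended) modifier so that each part can carry a comment. This is only a sketch: the layout and the `$pattern` and `$parts` names are illustrative, not the answerer's, and it is run here against the string from the question. Like the one-line version, it does not attempt to handle nested groups.

```php
<?php
// Same expression as the one-liner, spread out under the x modifier.
// The layout and variable names are illustrative assumptions.
$pattern = '~
    (                     # group 1: a whole field that contains a bracketed group
        [^,]*             # any non-comma text before the group
        (?:
            \{[^}]*\}     # a {...} group, commas allowed inside
          | \([^)]*\)     # a (...) group
          | \[[^\]]*\]    # a [...] group
        )
        [^,]*             # any non-comma text after the group
    )+
    |
    ,                     # otherwise, a bare top-level comma
~x';

$str = '<div>Hello<div>,@func[opt1,opt2],{,test},blahblah';

// Fields matched by group 1 come back through PREG_SPLIT_DELIM_CAPTURE;
// fields without brackets come back as ordinary split pieces.
$parts = preg_split($pattern, $str, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($parts);
```

The key idea is that a field with brackets is itself treated as a "delimiter" and captured, so the commas inside it are never seen by the bare-comma alternative.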
anubhava answered Sep 29 '22 00:09


I don't think it can be done in a regular expression. The basic problem is that this requires variable-length negative look-behinds (to rule out any [, ( or { that has not yet been followed by its ], ) or }), and that isn't a capability that regular expressions currently have.
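
If a single pattern proves awkward, one regex-free alternative is to walk the string once and keep a nesting counter, splitting only on commas seen at depth zero. The sketch below is an illustration, not part of either answer: `splitTopLevelCommas` is a hypothetical helper name, and it relies on the question's guarantee that every grouping symbol has a match. It tracks depth only and does not check that a closer is the same type as its opener.

```php
<?php
// Hypothetical helper (not from the original answers): split on commas
// that are outside every (), [] and {} group. Assumes balanced symbols.
function splitTopLevelCommas(string $str): array
{
    $parts = [];
    $current = '';
    $depth = 0;
    $len = strlen($str);

    for ($i = 0; $i < $len; $i++) {
        $ch = $str[$i];

        if (strpos('([{', $ch) !== false) {
            $depth++;                      // entering a group
        } elseif (strpos(')]}', $ch) !== false && $depth > 0) {
            $depth--;                      // leaving a group
        }

        if ($ch === ',' && $depth === 0) {
            $parts[] = $current;           // top-level comma: end of field
            $current = '';
            continue;
        }

        $current .= $ch;                   // everything else stays in the field
    }

    $parts[] = $current;                   // last field (may be empty)
    return $parts;
}

print_r(splitTopLevelCommas('<div>Hello<div>,@func[opt1,opt2],{,test},blahblah'));
```

Because it counts depth rather than matching one level of brackets, this approach also copes with nested groups such as `a,(b,[c,d]),e`. Angle brackets are deliberately ignored, since the question says other grouping symbols do not matter.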

Seth Robertson answered Sep 29 '22 01:09