
Split on comma, but only when not in parentheses

Tags: split, perl

I am trying to split a string on a comma delimiter:

my $string='ab,12,20100401,xyz(A,B)';
my @array=split(',',$string);

If I split as above, the array will have these values:

ab
12
20100401
xyz(A,
B)

I need the values as below:

ab
12
20100401
xyz(A,B) 

(It should not split xyz(A,B) into two values.) How do I do that?

asha asked Feb 19 '11 06:02


3 Answers

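use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

my $string = "ab,12,20100401,xyz(A,B(a,d))";
my @params = ();
# Test the length, so that a remaining field of "0" is not treated as false.
while (length $string) {
    if ($string =~ /^([^(]*?),/) {
        # Plain field: no opening parenthesis before the next comma.
        push @params, $1;
        $string =~ s/^\Q$1\E\s*,?\s*//;
    } else {
        # Field with a (possibly nested) parenthesized group; the prefix
        # pattern may match nothing, so a field can start with "(".
        my ($ext, $pre);
        ($ext, $string, $pre) = extract_bracketed($string, '()', '[^()]*');
        if (defined $ext) {
            push @params, "$pre$ext";
            $string =~ s/^\s*,\s*//;
        } else {
            # Nothing bracketed left (e.g. a plain last field): take the
            # rest as the final field so that the loop terminates.
            push @params, $string;
            $string = '';
        }
    }
}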

This one supports:

  • nested parentheses;
  • empty fields;
  • strings of any length.

Alessandro answered Oct 12 '22 23:10


Here is one way that should work.

use Regexp::Common;

my $string = 'ab,12,20100401,xyz(A,B)';
my @array = ($string =~ /(?:$RE{balanced}{-parens=>'()'}|[^,])+/g);

Regexp::Common can be installed from CPAN.

There is a bug in this code, coming from the depths of Regexp::Common. Be warned that it will (unfortunately) not return an empty element for an empty field, that is, when two commas are adjacent (,,).
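If empty fields matter, one alternative is to skip the regex and scan the string character by character while tracking parenthesis depth. The following is only a minimal sketch of that different technique, not the Regexp::Common approach above; the helper name split_outside_parens and the sample string are made up for illustration, and it assumes the input fits in memory and that an unmatched ")" should simply be kept as text.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): split on commas that are
# outside all parentheses, keeping empty fields.
sub split_outside_parens {
    my ($text) = @_;
    my @fields = ('');
    my $depth  = 0;
    for my $char (split //, $text) {
        if ($char eq ',' && $depth == 0) {
            push @fields, '';      # a top-level comma starts a new field
            next;
        }
        $depth++ if $char eq '(';
        $depth-- if $char eq ')' && $depth > 0;   # ignore unmatched ")"
        $fields[-1] .= $char;
    }
    return @fields;
}

my @array = split_outside_parens('ab,,20100401,xyz(A,B(a,d))');
print join('|', @array), "\n";     # prints ab||20100401|xyz(A,B(a,d))
```

Note that an empty input string yields a single empty field here, which may or may not be what you want.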

btilly answered Oct 13 '22 00:10


Well, old question, but I just happened to wrestle with this all night, and the question was never marked answered, so in case anyone arrives here by Google as I did, here's what I finally got. It's a very short answer using only built-in Perl regex features:

my $string='ab,12,20100401,xyz(A,B)';
$string =~ s/((\((?>[^)(]*(?2)?)*\))|[^,()]*)(*SKIP),/$1\n/g;
my @array=split('\n',$string);

Commas that are not inside parentheses are changed to newlines and then the array is split on them. This will ignore commas inside any level of nested parentheses, as long as they're properly balanced with a matching number of open and close parens.
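To make the pattern easier to follow, here is the same substitution spread out with the /x flag and commented. It is meant as a reading aid rather than a different method; the sample string with nested parentheses is made up for illustration.

```perl
use strict;
use warnings;

my $string = 'ab,12,xyz(A,B(a,d)),end';
$string =~ s/
    (                           # group 1: text right before a top-level comma
        (                       # group 2: one balanced parenthesized block
            \(
            (?> [^)(]* (?2)? )* # non-paren text, then optionally a nested block
            \)
        )
      |
        [^,()]*                 # or a run with no commas and no parentheses
    )
    (*SKIP)                     # on failure, resume after what was consumed
    ,                           # the comma that gets replaced
/$1\n/gx;
my @array = split /\n/, $string;
print join('|', @array), "\n";  # prints ab|12|xyz(A,B(a,d))|end
```

The (*SKIP) verb is what keeps the engine from retrying at a position inside a parenthesized block, which is where the commas to be ignored live.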

This assumes you won't have newline \n characters in the initial value of $string. If it might contain them, either temporarily replace them with something else before the substitution line and then use a loop to replace them back after the split, or just pick a different delimiter for both the substitution and the split.
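As a rough sketch of the "different delimiter" option, the version below uses a NUL character (\x00) instead of a newline. That choice is only an assumption for illustration: it works as long as NUL never occurs in your data, and any other character with that property would do.

```perl
use strict;
use warnings;

# Sample data with an embedded newline inside one field (made up).
my $string = "ab,line one\nline two,xyz(A,B)";

# Same substitution as above, but top-level commas become NUL characters.
$string =~ s/((\((?>[^)(]*(?2)?)*\))|[^,()]*)(*SKIP),/$1\x00/g;
my @array = split /\x00/, $string;

print scalar(@array), " fields\n";   # prints 3 fields
```

The embedded newline stays inside the second field because it is no longer the separator.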

John Smith answered Oct 13 '22 01:10