
How do I capture match-groups of alternation of a regular expression with split?

Tags:

regex

perl

I have a string

my $foo = 'one#two#three!four#five#six';

from which I want to extract the parts that are separated by either a # or a !. This is easy enough with split:

my @parts = split /#|!/, $foo;

An additional requirement is that I also need to capture the exclamation marks. So I tried

my @parts = split /#|(!)/, $foo;

However, after every separator this also returns the value of the capture group: the exclamation mark where ! matched, or undef where # matched (which is also clearly stated in the specification of split).
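To make the extra elements visible, here is a minimal sketch that prints the raw result of the split above. The variable name @raw and the formatting with map are only illustrative choices, not part of the original code.

```perl
use strict;
use warnings;

my $foo = 'one#two#three!four#five#six';

# With a capture group in the pattern, split returns the group's value
# after every separator. It is undef wherever '#' matched instead of '!'.
my @raw = split /#|(!)/, $foo;

# Render undef explicitly so the extra elements can be seen.
print join(', ', map { defined $_ ? "'$_'" : 'undef' } @raw), "\n";
# prints: 'one', undef, 'two', undef, 'three', '!', 'four', undef, 'five', undef, 'six'
```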

So, I weed out the unwanted undef values with grep:

my @parts = grep { defined } split /#|(!)/, $foo;

This does what I want.

Yet I was wondering if I can change the regular expression in a way so that I don't have to also invoke grep.

René Nyffenegger asked Mar 27 '17 09:03



1 Answer

When you use split, you cannot omit the empty captures once a match is found: every match yields as many captures as there are capture groups in the regular expression, and a group that did not take part in the match is returned as undef. You can use a matching approach here instead:

my @parts = $foo =~ /[^!#]+|!/g;

This way, you match either one or more characters other than ! and # (the [^!#]+ alternative) or a single exclamation mark, repeatedly across the whole string (/g).
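As a quick check, the following sketch runs the matching approach on the string from the question and prints the result. It assumes nothing beyond core Perl.

```perl
use strict;
use warnings;

my $foo = 'one#two#three!four#five#six';

# In list context, a /g match returns every match in order: either a run
# of non-separator characters or a single '!'.
my @parts = $foo =~ /[^!#]+|!/g;

print join(' ', @parts), "\n";
# prints: one two three ! four five six
```

One difference to keep in mind: for input with adjacent separators, such as 'a##b', the split and grep version keeps the empty field between them, while this match skips it, because [^!#]+ requires at least one character.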

Wiktor Stribiżew answered Sep 30 '22 23:09
