Perl 5 - longest token matching in regexp (using alternation)

Tags: regex, perl

Is it possible to force a Perl 5 regexp to match the longest possible string when the regexp is, for example:

a|aa|aaa

I found that this is probably the default in Perl 6, but how can I get this behavior in Perl 5?

EXAMPLE pattern:

[0-9]|[0-9][0-9]|[0-9][0-9][0-9][0-9]

If I have the string 2.10.2014, the first match will be 2, which is OK; but the next match will be 1, which is not OK because it should be 10. Then 2014 will be matched as four successive matches (2, 0, 1, 4), but it should be matched as 2014 by [0-9][0-9][0-9][0-9]. I know I could use [0-9]+, but I can't.

Krab asked Jan 22 '26 20:01


1 Answer

General solution: Put the longest one first.

my ($longest) = /(aaa|aa|a)/;
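Perl 5 tries alternatives from left to right and keeps the first one that lets the match succeed, which is why the order matters. For literal tokens, the reordering can be done programmatically; the following is a minimal sketch, where `@tokens` is a hypothetical token list and not something from the question:

```perl
use strict;
use warnings;

# Hypothetical token list; substitute your own literal tokens.
my @tokens = ('a', 'aa', 'aaa');

# Sort longest first, escape each literal, and join into one alternation.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @tokens;
my $re = qr/($alternation)/;

my ($first_try) = 'aaa' =~ /(a|aa|aaa)/;   # original order: shortest branch wins
my ($sorted)    = 'aaa' =~ $re;            # longest-first order

print "$first_try\n";   # a
print "$sorted\n";      # aaa
```

This only helps when the alternatives are fixed strings; for alternatives that are themselves patterns, their match lengths are not known in advance, so sorting by source length is not reliable.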

Specific solution: Use

my ($longest) = /([0-9]{4}|[0-9]{1,2})/;
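Applied to the string from the question with the /g modifier in list context, this reordered pattern should return the tokens the asker expects. A short sketch (the variable names `$date` and `@parts` are only illustrative):

```perl
use strict;
use warnings;

my $date = '2.10.2014';

# Four-digit branch first, then the one-or-two-digit branch. With /g in
# list context, every non-overlapping match of the capture is returned.
my @parts = $date =~ /([0-9]{4}|[0-9]{1,2})/g;

print join(',', @parts), "\n";   # 2,10,2014
```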

If you can't edit the pattern, you'll have to find every possibility and find the longest of them.

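my $longest = '';
# The code block records each candidate; (*FAIL) then forces backtracking,
# so every alternative at every start position is tried.
/
   ([0-9]|[0-9][0-9]|[0-9][0-9][0-9][0-9])
   (?{ $longest = $1 if length($1) > length($longest) })
   (*FAIL)
/x;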
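For the tokenizing case in the question (the longest match at each position, with a pattern that cannot be edited), one possible approach is to test progressively shorter substrings against the unchanged pattern. This is only a sketch: `longest_at` and `tokenize` are hypothetical helper names, and it assumes the pattern contains no lookaround that depends on text outside the candidate substring.

```perl
use strict;
use warnings;

# Return the longest substring starting at $pos that $re matches in full,
# or undef if no non-empty substring at that position matches.
sub longest_at {
    my ($str, $pos, $re) = @_;
    for my $len (reverse 1 .. length($str) - $pos) {
        my $candidate = substr($str, $pos, $len);
        # \A and \z force the pattern to cover the whole candidate, so the
        # engine backtracks through the alternatives until one fits.
        return $candidate if $candidate =~ /\A(?:$re)\z/;
    }
    return;
}

# Split $str into tokens, taking the longest match at each position and
# skipping characters that no alternative matches.
sub tokenize {
    my ($str, $re) = @_;
    my @found;
    my $pos = 0;
    while ($pos < length($str)) {
        my $tok = longest_at($str, $pos, $re);
        if (defined $tok) {
            push @found, $tok;
            $pos += length($tok);
        }
        else {
            $pos++;
        }
    }
    return @found;
}

# The pattern from the question, left in its original order.
my $re = qr/[0-9]|[0-9][0-9]|[0-9][0-9][0-9][0-9]/;
my @tokens = tokenize('2.10.2014', $re);

print join(',', @tokens), "\n";   # 2,10,2014
```

The substring loop is quadratic in the worst case, which is usually acceptable for short inputs such as dates but may be too slow for long strings.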
ikegami answered Jan 24 '26 16:01


