I'm matching a sequence of a repeating arbitrary character, with a minimum length, using a perl6 regex.
After reading through https://docs.perl6.org/language/regexes#Capture_numbers and tweaking the example given, I've come up with this code using an 'external variable':
#uses an additional variable $c
perl6 -e '$_="bbaaaaawer"; /((.){} :my $c=$0; ($c)**2..*)/ && print $0';
#Output: aaaaa
To aid in illustrating my question only, a similar regex in perl5:
#No additional variable needed
perl -e ' $_="bbaaaaawer"; /((.)\2{2,})/ && print $1';
Could someone enlighten me on the need/benefit of 'saving' $0 into $c and the requirement of the empty {}? Is there an alternative (better/golfed) perl6 regex that will match?
Thanks in advance.
Perl 6 regexes scale up to full grammars, which produce parse trees. Those parse trees are a tree of Match objects. Each capture - named or positional - is either a Match object or, if quantified, an array of Match objects.
This is in general good, but does involve making the trade-off you have observed: once you are on the inside of a nested capturing element, then you are populating a new Match object, with its own set of positional and named captures. For example, if we do:
say "abab" ~~ /((a)(b))+/
Then the result is:
「abab」
 0 => 「ab」
  0 => 「a」
  1 => 「b」
 0 => 「ab」
  0 => 「a」
  1 => 「b」
And we can then index:
say $0; # The array of the top-level capture, which was quantified
say $0[1]; # The second Match
say $0[1][0]; # The first Match within that Match object (the (a))
It is a departure from regex tradition, but also an important part of scaling up to larger parsing challenges.
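To make the nesting concrete, here is a minimal sketch that stores the same match in a variable and stringifies each level. The variable name `$m` is only an illustrative choice, not something from the original example; the regex and input are the ones used above.

```perl6
# Keep the Match object so it can be indexed explicitly
# ($m is a hypothetical name used for illustration only).
my $m = "abab" ~~ /((a)(b))+/;

# The outer capture was quantified with +, so $m[0] is an array of Match objects.
say $m[0].elems;    # 2

# Each element is its own Match, with its own positional captures 0 and 1.
say ~$m[0][1];      # ab  (the second repetition)
say ~$m[0][1][0];   # a   (the (a) inside that repetition)
say ~$m[0][1][1];   # b   (the (b) inside that repetition)
```

The prefix `~` just stringifies a Match. The same indexing works on `$/` directly after a successful match, which is what `$0` is shorthand for (`$0` is `$/[0]`).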