I've been trying to figure out how to write a recursive regular expression in Perl 6. As a toy example, consider a balanced-parentheses matcher, which would match ((())()) inside (((((())()).
PCRE example: /\((?R)?\)/
Onigmo example: (?<paren>\(\g<paren>*\))
I thought this would do it:
my regex paren {
'(' ~ ')' <paren>*
}
or the simpler
my regex paren {
'(' <paren>* ')'
}
but that fails with
No such method 'paren' for invocant of type 'Match'
in regex paren at ...
You can use ~~ in the <...> meta-syntax (that is, <~~>) to make a recursive call back into the current pattern or just a part of it. For example, you can match balanced parentheses with this simple regex:
say "(()())" ~~ /'(' <~~>* ')'/; # 「(()())」
say "(()()" ~~ /'(' <~~>* ')'/; # 「()」
Unfortunately, matching via a captured subrule (like ~~0) is not yet implemented.
You need to make explicit that you're calling a my-scoped regex:
my regex paren {
'(' ~ ')' <&paren>*
}
Notice the & that has been added. With that:
say "(()())" ~~ /^<&paren>$/; # 「(()())」
say "(()()" ~~ /^<&paren>$/; # Nil
It's true that you can sometimes get away without explicitly writing the &, and indeed you could when calling the regex from outside its own definition:
say "(()())" ~~ /^<paren>$/; # 「(()())」
say "(()()" ~~ /^<paren>$/; # Nil
This only works because the compiler spots that a regex named paren is defined in the lexical scope, so it compiles the <paren> syntax into a call to that regex. In the recursive case, the declaration isn't installed until after the regex is parsed, so one needs to be explicit.
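For comparison, here is a minimal sketch of the same matcher written as a grammar, where the bare <paren> form does recurse, because inside a grammar it is resolved as a method call on the grammar rather than as a lexical lookup. The grammar name Parens is made up for this illustration and does not come from the question, and the plain '(' <paren>* ')' form from the question is used instead of the ~ form.

```raku
# Parens is a hypothetical name chosen for this sketch.
grammar Parens {
    token TOP   { <paren> }
    # Inside a grammar, <paren> is looked up as a method on the grammar,
    # so the recursive call needs no & sigil.
    token paren { '(' <paren>* ')' }
}

# .parse only succeeds if the whole string matches.
say so Parens.parse("(()())");   # True
say so Parens.parse("(()()");    # False
```

This is an alternative to the my-scoped regex, not a replacement for it: <&paren> remains the way to recurse into a lexically declared regex.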