As part of my grammar I have:
rule EX1 { <EX2> ( '/' <EX2>)* }
In my actions class I have written:
method EX1($/) {
    my @ex2s = map *.made, $/.<EX2>;
    my $ex1 = @ex2s.join('|');
    #say "EX1 making $ex1";
    $/.make($ex1);
}
So basically I am just trying to join all the EX2's together with a '|' between them instead of a '/'. However, something is not right with my code, as it only picks up the first EX2, not the subsequent ones. How do I find out what the optional ones are?
TL;DR Your action method would work if your rule created the data structure the method is expecting. So we'll fix the rule and leave the method alone.
Let's assume the EX1 rule is slotted into a working grammar; a string has been successfully parsed; the substring ex2/ex2/ex2 matched the EX1 rule; and we've displayed the corresponding part of the parse tree (by just saying the results of .parse using the grammar):
EX1 => 「ex2/ex2/ex2」
 EX2 => 「ex2」
 0 => 「/ex2」
  EX2 => 「ex2」
 0 => 「/ex2」
  EX2 => 「ex2」
Note the extraneous 0 => captures and how the second and third EX2s are indented under them and indented relative to the first EX2. That's the wrong nesting structure relative to your method's assumptions.
As Brad++ points out in their comment responding to the first version of this answer, you can simply switch from the construct that both groups and captures ((...)) to the one that only groups ([...]).
rule EX1 { <EX2> [ '/' <EX2>]* }
Now the corresponding parse tree fragment for the same input string as above is:
EX1 => 「ex2/ex2/ex2」
 EX2 => 「ex2」
 EX2 => 「ex2」
 EX2 => 「ex2」
The 0 captures are gone and the EX2s are now all siblings. For further discussion of when and why P6 nests captures the way it does, see jnthn's answer to Why/how ... capture groups?.
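To see the fixed rule and your unchanged method working together, here is a minimal sketch. The Demo grammar, the DemoActions class, the TOP token and the \w+ definition of EX2 are all assumptions made up for illustration, since your question doesn't show the rest of your grammar:

```raku
grammar Demo {
    token TOP { <EX1> }                  # hypothetical entry point
    rule  EX1 { <EX2> [ '/' <EX2>]* }    # the fixed rule: [...] groups without capturing
    token EX2 { \w+ }                    # hypothetical stand-in for the real EX2
}

class DemoActions {
    method TOP($/) { make $<EX1>.made }  # pass the EX1 result up
    method EX2($/) { make ~$/ }          # an EX2 "makes" its own matched text

    # Your method from the question, unchanged.
    method EX1($/) {
        my @ex2s = map *.made, $/.<EX2>;
        my $ex1 = @ex2s.join('|');
        $/.make($ex1);
    }
}

say Demo.parse('a/b/c', actions => DemoActions).made;
```

With the non-capturing group, $/.<EX2> should hold all three EX2 matches, so I'd expect this to print a|b|c.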
Your action method should now work -- for some inputs...
If Brad's solution works for some of the inputs you'd expect it to work for, but not all, part of the problem is likely how your rule matches between <EX2> and the / character.
As Håkon++ points out in their answer, your rule has spacing that probably doesn't do what you want.
If you don't intend the spacing in your pattern to be significant, then don't use a rule. In a token or regex all spaces in a pattern (other than inside a quoted string, e.g. ' ') are just there to make your pattern more readable and aren't meaningful relative to any input string being matched. If in doubt, use a token (or regex), not a rule:
token EX1 { <EX2> ( '/' <EX2>)* }
         🡅 🡅     🡅 🡅   🡅       🡅
Spacing indicated with 🡅 is NOT significant. You could omit it or extend it and it'll make no difference to how the token matches input. It's only for readability.
In contrast, the entire point of the rule construct is that whitespace after each atom and each quantifier in a pattern is significant. Such spacing implicitly applies a (user overridable) boundary matching rule (by default a rule that allows whitespace and/or a transition between "word" and non-"word" characters) after the corresponding substring in the input.
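A small sketch of that difference, assuming the default ws rule has not been overridden. The Spacing grammar and the names tight, loose and slash are made up purely for illustration:

```raku
grammar Spacing {
    token tight { 'a' 'b' }   # the space between the atoms is ignored
    rule  loose { 'a' 'b' }   # the space after 'a' requires a boundary match
    rule  slash { 'a' '/' }   # same, but 'a' then '/' is a word/non-word transition
}

say Spacing.parse('ab',  :rule<tight>).so;   # True:  a token ignores pattern spacing
say Spacing.parse('a b', :rule<tight>).so;   # False: nothing in the token matches the space
say Spacing.parse('ab',  :rule<loose>).so;   # False: no boundary between two word characters
say Spacing.parse('a b', :rule<loose>).so;   # True:  whitespace satisfies the boundary
say Spacing.parse('a/',  :rule<slash>).so;   # True:  the boundary can be zero-width
```

The third line is the one that tends to surprise people: in a rule, the space in the pattern isn't "optional whitespace", it's a boundary that has to be there.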
In your EX1 rule, which I repeat below with exaggerated spacing to ensure clarity, some of the spacing is not significant, just as it isn't in a token or regex:
rule EX1 {  <EX2>  (  '/'  <EX2>)*  }
        🡅 🡅         🡅
As before, 🡅 indicates spacing that is NOT significant -- you can omit or extend it and it'll make no difference. The thing to remember is that spaces at the start of a pattern (or sub-pattern) are just for readability. (Experience from use showed that it was much better if any spacing there is not treated as significant.)
But spacing or lack of spacing after an atom or quantifier is significant:
This spacing is significant: ⮟     ⮟       ⮟
             rule EX1 { <EX2> ( '/' <EX2>)* }
This LACK of spacing is significant:     ⮝⮝
By writing your rule as you did you're telling P6 to match input with boundary matching (which by default allows whitespace) only:
after the first <EX2> (and thus before the first /);
between / and subsequent <EX2> matches;
after the last <EX2> match.
So your rule tells P6 to allow spaces between a / and <EX2> match when they occur in that order -- /, then <EX2>.
But it also tells P6 to not allow spaces the other way around -- between an <EX2> match and a / match in that order! Except with the very first <EX2> '/' pair!! P6 will let you declare match patterns of arbitrary complexity, including spacing, but I doubt this is what you meant or want.
For a complete listing of what "after an atom" means (i.e. when whitespace in rules is significant) see When is white space really important in Perl6 grammars?.
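Here is a sketch of that asymmetry, again assuming the default ws rule and a made-up \w+ definition for EX2. The grammar names Original and Spaced are hypothetical; Spaced just adds a space after the second <EX2>:

```raku
grammar Original {
    rule  EX1 { <EX2> [ '/' <EX2>]* }    # spacing as in your rule
    token EX2 { \w+ }                    # hypothetical stand-in for the real EX2
}

grammar Spaced {
    rule  EX1 { <EX2> [ '/' <EX2> ]* }   # extra space after the second <EX2>
    token EX2 { \w+ }
}

for 'a/b/c', 'a / b', 'a / b / c' -> $input {
    my $original = Original.parse($input, :rule<EX1>).so;
    my $spaced   = Spaced.parse($input,   :rule<EX1>).so;
    say "{$input.raku}: original={$original}, spaced={$spaced}";
}
```

I'd expect the first two inputs to parse under both grammars, and 'a / b / c' to fail under Original but parse under Spaced, because only Spaced allows whitespace between a later <EX2> and the / that follows it.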
This significant spacing feature is:
Classic Perl DWIMery designed to make life easier;
Idiomatic -- used in most grammars because it does indeed make life easier;
The only reason the rule declarator exists (this significant whitespace aspect is the only difference between a rule and a token);
Completely optional because you can just use a token instead.
If someone reading this thinks they'd rather not take advantage of this significant space feature, then they can just use tokens instead. (This in turn will likely lead them to see why rule exists as an option, and then, or perhaps later, to see why it works the way it does, and to appreciate its DWIMery anew. :) )
Finally, here's the idiomatic way to write the pattern you're trying to match:
rule EX1 { <EX2> + % '/' }
This tells P6 to match one or more <EX2>s separated by / characters. See Modified quantifier: %, %% for an explanation of this nice construct.
This is still a rule so most of the spacing in it remains significant. The precise details for when it is and isn't are at their most apparently fiddly for this construct because it has up to three significant spacers and one that's not:
NOT significant: ⮟         ⮟
       rule EX1 { <EX2> + % '/' }
Significant:           ⮝ ⮝     ⮝
Including spacing both before and after the + is redundant:
rule EX1 { <EX2> + % '/' }
rule EX1 { <EX2> +% '/' } # same match result
rule EX1 { <EX2>+ % '/' } # same match result
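A minimal sketch of the % version combined with your unchanged action method. As before, the Sep grammar, the SepActions class and the \w+ definition of EX2 are assumptions for illustration only:

```raku
grammar Sep {
    rule  EX1 { <EX2>+ % '/' }   # one or more EX2, separated by '/'
    token EX2 { \w+ }            # hypothetical stand-in for the real EX2
}

class SepActions {
    method EX2($/) { make ~$/ }  # an EX2 "makes" its own matched text

    # Your method from the question, unchanged.
    method EX1($/) {
        my @ex2s = map *.made, $/.<EX2>;
        my $ex1 = @ex2s.join('|');
        $/.make($ex1);
    }
}

say Sep.parse('a/b/c', :rule<EX1>, :actions(SepActions)).made;
```

Because <EX2> appears only once in the pattern and is quantified, all its matches land in a single flat list, which is the shape your method expects. I'd expect this to print a|b|c.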
White space is significant in rules. So I think you are missing a whitespace after the last <EX2>:
rule EX1 { <EX2> ( '/' <EX2>)+ }
It should be:
rule EX1 { <EX2> ( '/' <EX2> )+ }
This allows for space to separate the terms in EX1.