I'm not sure whether grammars are meant to do such things: I want tokens to be defined at runtime (in the future, with data from a file). So I wrote some simple test code, and as expected it wouldn't even compile.
grammar Verb {
    token TOP {
        <root>
        <ending>
    }
    token root {
        (\w+) <?{ ~$0 (elem) @root }>
    }
    token ending {
        (\w+) <?{ ~$0 (elem) @ending }>
    }
}
my @root = <go jump play>;
my @ending = <ing es s ed>;
my $string = "going";
my $match = Verb.parse($string);
.Str.say for $match<root>;
What's the best way of doing such things in Perl 6?
To match any of the elements of an array, just write the name of the array variable (starting with a @ sigil) in the regex:
my @root = <go jump play>;
say "jumping" ~~ / @root /; # Matches 「jump」
say "jumping" ~~ / @root 'ing' /; # Matches 「jumping」
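As a small illustration of the same idea, two arrays can be combined in one regex, with captures to pull out the pieces. This is only a sketch: the anchors, the capture groups and the variable $m are choices made for this example, not something the original code requires.

```raku
my @root   = <go jump play>;
my @ending = <ing es s ed>;

# Each array acts as an alternation of its elements, matched literally.
# ^ and $ force the whole word to be root + ending.
my $m = "jumped" ~~ / ^ (@root) (@ending) $ /;

say ~$m[0];   # jump
say ~$m[1];   # ed
```

Because the arrays are ordinary variables, their contents can come from anywhere, as long as they are filled before the match runs.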
So in your use case, the only tricky part is passing the arrays from the code that creates them (e.g. by parsing data files) to the grammar tokens that need them.
The easiest way would probably be to make them dynamic variables (signified by the * twigil):
grammar Verb {
    token TOP {
        <root>
        <ending>
    }
    token root {
        @*root
    }
    token ending {
        @*ending
    }
}
my @*root = <go jump play>;
my @*ending = <ing es s ed>;
my $string = "going";
my $match = Verb.parse($string);
say $match<root>.Str;
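One way to keep the dynamic variables from leaking into the rest of the program is to set them inside a small wrapper routine. The sketch below assumes a helper named parse-verb with named parameters roots and endings; both the helper and its parameter names are invented for this example.

```raku
grammar Verb {
    token TOP    { <root> <ending> }
    token root   { @*root }
    token ending { @*ending }
}

# Hypothetical helper: declares the dynamic variables in its own scope,
# so they are visible to the grammar only while this call is running.
sub parse-verb(Str $word, :@roots, :@endings) {
    my @*root   = @roots;
    my @*ending = @endings;
    Verb.parse($word);
}

my $m = parse-verb("played", roots => <go jump play>, endings => <ing es s ed>);
say ~$m<root>;     # play
say ~$m<ending>;   # ed
```

The grammar itself is unchanged; only the place where @*root and @*ending are declared moves.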
Another way would be to pass a Capture with the arrays to the args adverb of method .parse, which will pass them on to token TOP, from where you can in turn pass them on to the sub-rules using the <foo(...)> or <foo: ...> syntax:
grammar Verb {
    token TOP (@known-roots, @known-endings) {
        <root: @known-roots>
        <ending: @known-endings>
    }
    token root (@known) {
        @known
    }
    token ending (@known) {
        @known
    }
}
my @root = <go jump play>;
my @ending = <ing es s ed>;
my $string = "going";
my $match = Verb.parse($string, args => \(@root, @ending));
say $match<root>.Str; # go
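To connect this with the goal of loading the word lists from data, here is a rough sketch that builds the arrays from whitespace-separated text and parses several words. The strings stand in for file contents, and the file name mentioned in the comment is hypothetical.

```raku
grammar Verb {
    token TOP (@known-roots, @known-endings) {
        <root: @known-roots>
        <ending: @known-endings>
    }
    token root   (@known) { @known }
    token ending (@known) { @known }
}

# Stand-ins for file contents. With real data, something like
# 'roots.txt'.IO.words (hypothetical file name) would give a similar list.
my @root   = "go jump play".words;
my @ending = "ing es s ed".words;

for <going jumped running> -> $word {
    my $m = Verb.parse($word, args => \(@root, @ending));
    say $m ?? "$word = {$m<root>} + {$m<ending>}" !! "$word = no parse";
}
```

Under these assumptions, going and jumped split into a root and an ending, while running is rejected because run is not among the known roots.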
The approach you were taking could have worked but you made three mistakes.
Lexical variable declarations need to appear textually before the compiler encounters their use:
my $foo = 42; say $foo; # works
say $bar; my $bar = 42; # compile time error
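Tokens do not backtrack. The token declarator implies :ratchet, so once \w+ has consumed the whole word it never gives characters back:
say .parse: 'going' for
    grammar using-token              {token TOP {         \w+ ing}}, # Nil
    grammar using-regex-with-ratchet {regex TOP {:ratchet \w+ ing}}, # Nil
    grammar using-regex              {regex TOP {         \w+ ing}}; # 「going」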
The regex declarator has exactly the same effect as the token declarator except that it defaults to doing backtracking.
Your first use of \w+ in the root token matches the entire input 'going', which then fails to match any element of @root. And then, because there's no backtracking, the overall parse immediately fails.
(Don't take this to mean that you should default to using regex. Relying on backtracking can massively slow down parsing and there's typically no need for it.)
See https://stackoverflow.com/a/19640657/1077672
This works:
my @root = <go jump play>;
my @ending = <ing es s ed>;
grammar Verb {
    token TOP {
        <root>
        <ending>
    }
    regex root {
        (\w+) <?{ ~$0 (elem) @root }>
    }
    token ending {
        (\w+) <?{ ~$0 (elem) @ending }>
    }
}
my $string = "going";
my $match = Verb.parse($string);
.Str.say for $match<root>;
outputs:
go