Passing data to form grammar rules in Perl 6

Question

I'm not sure whether grammars are meant to do this: I want tokens to be defined at runtime (eventually with data read from a file). So I wrote some simple test code, and as expected it doesn't even compile.

grammar Verb {
  token TOP {
    <root> 
    <ending>
  }
  token root {
    (\w+) <?{ ~$0 (elem) @root }>
  }
  token ending {
    (\w+) <?{ ~$0 (elem) @ending }>
  }
}

my @root = <go jump play>;
my @ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string);
.Str.say for $match<root>;

What's the best way of doing such things in Perl 6?

smls · Accepted Answer

To match any of the elements of an array, just write the name of the array variable (starting with a @ sigil) in the regex:

my @root = <go jump play>;
say "jumping" ~~ / @root /;        # Matches ｢jump｣
say "jumping" ~~ / @root 'ing' /;  # Matches ｢jumping｣

So in your use-case, the only tricky part is passing the arrays from the code that creates them (e.g. by parsing data files), to the grammar tokens that need them.

The easiest way would probably be to make them dynamic variables (signified by the * twigil):

grammar Verb {
    token TOP {
        <root> 
        <ending>
    }
    token root {
        @*root
    }
    token ending {
        @*ending
    }
}

my @*root = <go jump play>;
my @*ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string);

say $match<root>.Str;
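
Because dynamic variables are looked up through the call stack, they only need to exist somewhere in the caller's scope when .parse runs. Here is a minimal sketch of that idea, assuming a hypothetical helper sub parse-verb (not part of the original code) that sets the lists for each call. The lists are hard-coded here, but they could just as well be read from a file, e.g. with .IO.lines:

```raku
grammar Verb {
    token TOP    { <root> <ending> }
    token root   { @*root }      # resolved in the caller's dynamic scope
    token ending { @*ending }
}

# Hypothetical helper: sets the word lists for the duration of one parse.
sub parse-verb(Str $word, :@roots, :@endings) {
    my @*root   = @roots;
    my @*ending = @endings;
    Verb.parse($word);
}

my @roots   = <go jump play>;
my @endings = <ing es s ed>;

for <going jumped plays walked> -> $word {
    with parse-verb($word, :@roots, :@endings) {
        say "$word: ", ~$_<root>, ' + ', ~$_<ending>;
    }
    else {
        say "$word: no parse";   # e.g. 'walked', whose root is not in the list
    }
}
```

This keeps the grammar itself free of any knowledge about where the lists come from.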

Another way would be to pass a Capture containing the arrays as the args named argument of the .parse method. The arguments are handed to token TOP, which can in turn pass them on to the sub-rules using the <foo(...)> or <foo: ...> syntax:

grammar Verb {
    token TOP (@known-roots, @known-endings) {
        <root: @known-roots>
        <ending: @known-endings>
    }
    token root (@known) {
        @known
    }
    token ending (@known) {
        @known
    }
}

my @root = <go jump play>;
my @ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string, args => \(@root, @ending));

say $match<root>.Str;  # go

raiph · Answer

The approach you were taking could have worked but you made three mistakes.

Scoping

Lexical variable declarations need to appear textually before the compiler encounters their use:

my $foo = 42; say $foo; # works
say $bar; my $bar = 42; # compile time error

Backtracking

say .parse: 'going' for

  grammar using-token              {token TOP {         \w+ ing}}, # Nil
  grammar using-regex-with-ratchet {regex TOP {:ratchet \w+ ing}}, # Nil
  grammar using-regex              {regex TOP {         \w+ ing}}; # ｢going｣

The regex declarator has exactly the same effect as the token declarator except that it defaults to doing backtracking.

Your first use of \w+ in the root token matches the entire input 'going', which then fails to match any element of @root. And then, because there's no backtracking, the overall parse immediately fails.

(Don't take this to mean that you should default to using regex. Relying on backtracking can massively slow down parsing and there's typically no need for it.)
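
The same effect can be reproduced outside a grammar. This is a small sketch (the variable names are only for illustration) that runs the capture-plus-assertion pattern from the root token once with backtracking and once with :ratchet:

```raku
my @root = <go jump play>;

# Plain regexes backtrack by default: \w+ first grabs 'going', the
# assertion fails, and \w+ gives back characters until 'go' passes.
my $backtracking = 'going' ~~ / ^ (\w+) <?{ ~$0 (elem) @root }> /;
say $backtracking ?? ~$backtracking[0] !! 'no match';   # go

# With :ratchet (which token implies) \w+ keeps all of 'going', so the
# assertion fails once and the match is abandoned.
my $ratcheting = 'going' ~~ / :ratchet ^ (\w+) <?{ ~$0 (elem) @root }> /;
say $ratcheting ?? ~$ratcheting[0] !! 'no match';       # no match
```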

Debugging

See https://stackoverflow.com/a/19640657/1077672

This works:

my @root = <go jump play>;
my @ending = <ing es s ed>;

grammar Verb {
  token TOP {
    <root> 
    <ending>
  }
  regex root {
    (\w+) <?{ ~$0 (elem) @root }>
  }
  token ending {
    (\w+) <?{ ~$0 (elem) @ending }>
  }
}

my $string = "going";
my $match = Verb.parse($string);

.Str.say for $match<root>;

outputs:

go

Tags:

raku

Eugene Barsky