
Passing data to form grammar rules in Perl 6

Tags:

raku

Not sure whether grammars are meant to do such things: I want tokens to be defined at runtime (in the future, with data from a file). So I wrote some simple test code, and as expected it wouldn't even compile.

grammar Verb {
  token TOP {
    <root> 
    <ending>
  }
  token root {
    (\w+) <?{ ~$0 (elem) @root }>
  }
  token ending {
    (\w+) <?{ ~$0 (elem) @ending }>
  }
}

my @root = <go jump play>;
my @ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string);
.Str.say for $match<root>;

What's the best way of doing such things in Perl 6?

Eugene Barsky asked Oct 21 '17 08:10


2 Answers

To match any of the elements of an array, just write the name of the array variable (starting with a @ sigil) in the regex:

my @root = <go jump play>;
say "jumping" ~~ / @root /;        # Matches 「jump」
say "jumping" ~~ / @root 'ing' /;  # Matches 「jumping」

So in your use case, the only tricky part is passing the arrays from the code that creates them (e.g. by parsing data files) to the grammar tokens that need them.

The easiest way would probably be to make them dynamic variables (signified by the * twigil):

grammar Verb {
    token TOP {
        <root> 
        <ending>
    }
    token root {
        @*root
    }
    token ending {
        @*ending
    }
}

my @*root = <go jump play>;
my @*ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string);

say $match<root>.Str;
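
Because dynamic variables are looked up through the call stack, they can also be set inside a wrapper routine, so they exist only while the parse runs. A minimal sketch, assuming a hypothetical helper named parse-verb with named parameters :@roots and :@endings (neither is part of the original code):

```raku
grammar Verb {
    token TOP    { <root> <ending> }
    token root   { @*root }
    token ending { @*ending }
}

# Hypothetical helper (not from the original answer): it declares the
# dynamic variables in its own scope, so they are only visible to code
# called from here, including the grammar tokens.
sub parse-verb(Str $word, :@roots, :@endings) {
    my @*root   = @roots;
    my @*ending = @endings;
    Verb.parse($word);
}

my $match = parse-verb('jumped',
    roots   => <go jump play>,
    endings => <ing es s ed>);

say $match<root>.Str;    # jump
say $match<ending>.Str;  # ed
```

This keeps the word lists out of the surrounding scope while leaving the grammar itself unchanged.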

Another way would be to pass a Capture containing the arrays as the args named argument of the .parse method. The arguments are handed to token TOP, which can in turn pass them on to the sub-rules using the <foo(...)> or <foo: ...> syntax:

grammar Verb {
    token TOP (@known-roots, @known-endings) {
        <root: @known-roots>
        <ending: @known-endings>
    }
    token root (@known) {
        @known
    }
    token ending (@known) {
        @known
    }
}

my @root = <go jump play>;
my @ending = <ing es s ed>;

my $string = "going";
my $match = Verb.parse($string, args => \(@root, @ending));

say $match<root>.Str;  # go
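
Since the question mentions reading the data from a file later on, the same args approach can be sketched with word lists loaded at runtime. The file names and the one-word-per-line layout below are assumptions made for illustration; the sketch writes the files to the temporary directory itself so that it is self-contained:

```raku
grammar Verb {
    token TOP (@known-roots, @known-endings) {
        <root: @known-roots>
        <ending: @known-endings>
    }
    token root   (@known) { @known }
    token ending (@known) { @known }
}

# Hypothetical data files, one word per line (assumed layout).
my $root-file   = $*TMPDIR.add('verb-roots.txt');
my $ending-file = $*TMPDIR.add('verb-endings.txt');
$root-file.spurt("go\njump\nplay\n");
$ending-file.spurt("ing\nes\ns\ned\n");

# Read the word lists at runtime; .lines drops the line endings.
my @root   = $root-file.lines;
my @ending = $ending-file.lines;

my $match = Verb.parse('playing', args => \(@root, @ending));
say $match<root>.Str;  # play

# Remove the temporary files again.
.unlink for $root-file, $ending-file;
```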

smls answered Sep 29 '22 15:09


The approach you were taking could have worked but you made three mistakes.

Scoping

Lexical variable declarations need to appear textually before the compiler encounters their use:

my $foo = 42; say $foo; # works
say $bar; my $bar = 42; # compile time error

In your code, @root and @ending are declared after the grammar that uses them, so the declarations need to move above the grammar.

Backtracking

say .parse: 'going' for

  grammar using-token              {token TOP {         \w+ ing}}, # Nil
  grammar using-regex-with-ratchet {regex TOP {:ratchet \w+ ing}}, # Nil
  grammar using-regex              {regex TOP {         \w+ ing}}; # 「going」

The regex declarator has exactly the same effect as the token declarator except that it defaults to doing backtracking.

The \w+ in your root token matches the entire input 'going', which is not an element of @root. Because a token does not backtrack, \w+ never gives characters back to try 'goin', 'goi' and 'go', so the overall parse fails immediately.

(Don't take this to mean that you should default to using regex. Relying on backtracking can massively slow down parsing and there's typically no need for it.)
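
One way to keep every rule a token is to interpolate the arrays directly into the regex, as the other answer does, instead of matching \w+ and checking the result afterwards. A minimal sketch using plain lexical arrays declared before the grammar (the test words are only for illustration):

```raku
# Lexical arrays must be declared before the grammar that uses them.
my @root   = <go jump play>;
my @ending = <ing es s ed>;

grammar Verb {
    token TOP    { <root> <ending> }
    # The array is tried as a set of literal alternatives, so the token
    # only consumes a known root and no backtracking is needed.
    token root   { @root }
    token ending { @ending }
}

my $match = Verb.parse('going');
say $match<root>.Str;    # go
say $match<ending>.Str;  # ing
```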

Debugging

See https://stackoverflow.com/a/19640657/1077672


This works:

my @root = <go jump play>;
my @ending = <ing es s ed>;

grammar Verb {
  token TOP {
    <root> 
    <ending>
  }
  regex root {
    (\w+) <?{ ~$0 (elem) @root }>
  }
  token ending {
    (\w+) <?{ ~$0 (elem) @ending }>
  }
}

my $string = "going";
my $match = Verb.parse($string);

.Str.say for $match<root>;

outputs:

go

raiph answered Sep 29 '22 15:09