
Passing variables in proto regex with Perl 6 grammar

Tags: grammar, raku

Passing variables to a token, regex, or rule is fairly straightforward. For example, the output of

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    token bar($s) { {say ~$s} .+ }
}
Foo.parse("xyz")

is simply x. But things go awry when using proto. For example,[1] let's make a simple proto to discriminate between the rest of the string being alpha or numeric:

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    proto token bar { * }
          token bar:sym<a> ($s) { {say ~$s} <alpha>+ }
          token bar:sym<1> ($s) { {say ~$s} <digit>+ }
}

Foo.parse("xyz")

This bombs, claiming that it expected 1 argument but got 2 for bar. Okay, well in normal methods, we have to specify the args in the proto declaration, so let's just declare that:

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    proto token bar ($s) { * }
          token bar:sym<a> ($s) { {say ~$s} <alpha>+ }
          token bar:sym<1> ($s) { {say ~$s} <digit>+ }
}

Foo.parse("xyz")

Now we get the opposite: expected 2 arguments but got 1. Hm, maybe that means the proto declaration is eating the value and not passing anything along. So I tried just slipping it in:

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    proto token bar (|) { * }
          token bar:sym<a> ($s) { {say ~$s} <alpha>+ }
          token bar:sym<1> ($s) { {say ~$s} <digit>+ }
}

Foo.parse("xyz")

Same error here. It claims it expected 2 arguments, but got 1.[2] Somehow the use of proto is eating up the arguments. Currently, the only solution that I've found uses dynamic variables, which makes me think that there may be some hidden step where the variable isn't being passed from proto to candidate.

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    proto token bar ($*s) { * }
          token bar:sym<a> { {say ~$*s} <alpha>+ }
          token bar:sym<1> { {say ~$*s} <digit>+ }
}

Foo.parse("xyz")

But this does not seem like an entirely intuitive step. How would one pass the variable directly, in a non-dynamic fashion, to a proto such that it is received by the candidate?


[1] Note that all of the above code has been golfed to focus on passing variables. The actual tokens used bear no resemblance to my real code.
[2] I'm starting to wonder too if this is (generally speaking) a LTA error message. While I get that it's based on first arg = invocant, it still feels off. Maybe it should say "Expected invocant and one argument, only received invocant" or somesuch.

asked Jul 28 '19 20:07 by user0721090601


2 Answers

TL;DR

  • It's a bug. See [BUG] Proto regex with params isn't called correctly (possibly NYI) in Rakudo.

  • I have an alternative approach that works for passing non-dynamic arguments. But see the next point.

  • Your follow-up commentary explaining what you'd golfed from suggests your dynamic variable alternative might be better. I'll discuss that too.

An alternative approach that works

Switch proto token... to proto method... and token foo:sym<...>s to multi tokens without the :sym<...> affix:

grammar Foo {
  token TOP { (.) {} <bar($0)> }
  proto method bar ($s) {*}
  multi token bar ($s where /<alpha>+/) { {say 'alpha start ', $s} .. }
  multi token bar ($s where /<digit>+/) { {say 'digit start ', $s} .. }
}

say Foo.parse("xyz")

displays:

alpha start 「x」
「xyz」
 0 => 「x」
 bar => 「yz」
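The choice between the two multi tokens above is made by the where constraints on $s, not by anything grammar-specific. A minimal sketch of the same constraint-based dispatch on ordinary subs, where the sub name kind is hypothetical and the patterns are anchored here only to keep the two cases disjoint:

```raku
# Assumption: `kind` is an illustrative name, not part of the original code.
# Each candidate is selected by a `where` constraint on its argument.
multi sub kind($s where /^ <alpha>+ $/) { 'alpha' }
multi sub kind($s where /^ <digit>+ $/) { 'digit' }

say kind('x');   # alpha
say kind('7');   # digit
```

In the grammar version the argument is the Match captured in $0, which is checked against the same kind of regex constraint.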

Your dynamic variable alternative might be better

"In my actual code, the variable is passed along to block certain matches (mainly to avoid certain types of recursion)"

It sounds like you could have a single dynamic variable (say $*nope), set to whatever value you wish, and systematically use that. Or perhaps a couple. Dynamic variables are intended for exactly this sort of thing. Beyond an ideological discomfort with dynamic variables (to the degree they're carelessly used as unconstrained globals they are bad news), what's not to like?
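As a rough sketch of that idea, assuming the goal is simply to block one kind of candidate during a parse: the dynamic variable is declared with :my inside TOP, so it is scoped to that parse rather than being a global, and each proto candidate checks it with a code assertion. The value 'digit' and the candidate names are illustrative assumptions, not taken from the original code.

```raku
grammar Foo {
    # Assumption: $*nope names the kind of match to block for this parse.
    token TOP {
        :my $*nope = 'digit';
        (.) <bar>
    }
    proto token bar {*}
    # Each candidate refuses to match when it is the blocked kind.
    token bar:sym<alpha> { <?{ $*nope ne 'alpha' }> <alpha>+ }
    token bar:sym<digit> { <?{ $*nope ne 'digit' }> <digit>+ }
}

say so Foo.parse("xyz");   # True:  the alpha candidate is allowed
say so Foo.parse("x12");   # False: the digit candidate is blocked
```

Because the candidates take no parameters, the proto dispatch itself is untouched, which sidesteps the argument-passing problem entirely.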

answered Oct 15 '22 16:10 by raiph


The first thing is that I don't really get what you intend to do here. My impression is that you want the second part of the token to be a function of the first part. I don't get why you use a proto here. You can do that straight away this way:

grammar Foo {
    token TOP     { (.) {} <bar($0)> }
    token bar( $s )  { {say ~$s} $s <alpha>+ }
}

say Foo.parse("xxz")

But I'm not sure you can actually make it work combining syms and arguments. syms already have one argument: the one used in the adverb. It's more than simply a symbol; it's what is going to be matched there (if you use the predefined token <sym>). You can simply use it as sub-matches too:

grammar Foo {
    token TOP     { (.) {} <bar> }
    proto token bar {*}
    token bar:sym<alpha>  { <alpha>+ }
    token bar:sym<digit>  { <digit>+ }
}

say Foo.parse("xxz");
say Foo.parse("x01")

Or simply use the string as a sym match:

grammar Foo {
    token TOP     { (.) {} <bar>+ }
    proto token bar {*}
    token bar:sym<x>  { <sym>+ }
    token bar:sym<z>  { <sym>+ }
    token bar:sym<1>  { <sym>+ }
    token bar:sym<0>  { <sym>+ }
}

say Foo.parse("xxz");
say Foo.parse("x01")

So I would say that raiph's answer is where you want to go; syms do not seem like the right way to achieve that, since they have a built-in variable (the argument to sym), but you have to specify every single case.
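If the proto itself is not essential, one more option is to write the dispatch by hand: an ordinary token takes the argument and forwards it to each alternative, which relies only on the plain argument passing that already works for non-proto tokens. This is a sketch under that assumption; the names bar-alpha and bar-digit are hypothetical.

```raku
grammar Foo {
    token TOP           { (.) {} <bar(~$0)> }
    # Hand-written dispatch: try each alternative in order, forwarding $s.
    token bar($s)       { <bar-alpha($s)> || <bar-digit($s)> }
    # Assumption: these two token names are illustrative only.
    token bar-alpha($s) { <alpha>+ { say "alpha after $s" } }
    token bar-digit($s) { <digit>+ { say "digit after $s" } }
}

Foo.parse("xyz");   # alpha after x
Foo.parse("x12");   # digit after x
```

The cost is that every alternative has to be listed explicitly in bar, which is the bookkeeping a proto would otherwise do.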

answered Oct 15 '22 15:10 by jjmerelo