
Perl6 grammars: match full line

Tags: grammar, raku

I've just started exploring perl6 grammars. How can I make up a token "line" that matches everything between the beginning of a line and its end? I've tried the following without success:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS


grammar sample {
    token TOP {
        <line>
    }

    token line {
        ^^.*$$
    }
}

my $match = sample.parse($txt);

say $match<line>[0];
pistacchio asked Dec 29 '15 07:12


3 Answers

I can see two problems in your grammar. The first is the token line: ^^ and $$ anchor to the start and end of a line, but . also matches newlines, so .* can run across several lines between them. To illustrate, let's use a plain regex first, without a grammar:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

if $txt ~~ m/^^.*$$/ {
    say "match";
    say $/;
}

Running that, the output is:

match
「row 1
row 2
row 3」

You can see that the regex matches more than desired. That is not the first problem you hit, though: because of ratcheting, the same pattern does not match at all as a token:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

my regex r {^^.*$$};
if $txt ~~ &r {
    say "match regex";
    say $/;
} else {
    say "does not match regex";
}
my token t {^^.*$$};
if $txt ~~ &t {
    say "match token";
    say $/;
} else {
    say "does not match token";
}

Running that, the output is:

match regex
「row 1
row 2
row 3」
does not match token
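
A likely explanation for the difference is how $$ behaves at the very end of a string. The sketch below uses a made-up one-line sample ($s and the result variables are illustrative names, not part of the question) and checks where the anchor matches:

```raku
# Hypothetical one-line sample with a trailing newline, as a heredoc produces.
my $s = "row 3\n";

# $$ matches just before the newline ...
my $before-newline = so $s ~~ / 3 $$ /;
say $before-newline;

# ... but not at the very end of a string whose last character is a newline.
my $after-newline = so $s ~~ / \n $$ /;
say $after-newline;

# Without a trailing newline, $$ does match at the end of the string.
my $no-newline = so "row 3" ~~ / 3 $$ /;
say $no-newline;
```

A regex can give back the final newline so that $$ matches just before it; a token keeps everything .* has consumed, so the match fails.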

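Token and the $$ anchor do not work well together here, most likely because a token never backtracks: .* consumes the whole string, including the final newline, and $$ cannot match after it, whereas a regex can back off by one character. What you want instead is to match everything except a newline, which is \N. The following grammar mostly solves your issue: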

grammar sample {
    token TOP {<line>}
    token line {\N+}
}
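
To see what such a grammar does on the sample text, here is a small sketch. The grammar name OneLine and the inline string are stand-ins chosen for this illustration:

```raku
grammar OneLine {
    token TOP  { <line> }
    token line { \N+ }
}

# Same three rows as the heredoc, written inline.
my $txt = "row 1\nrow 2\nrow 3\n";

# .parse must consume the whole string, so a single <line> is not enough.
my $full = OneLine.parse($txt);
say $full.defined;

# .subparse only has to match from the start of the string.
my $partial = OneLine.subparse($txt);
say ~$partial<line>;
```

The first say reports that the full parse failed; the second prints the first row only.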

However, it only matches the first line, since TOP asks for a single <line> (and .parse, which must consume the whole string, fails on the three-line input). What you probably want is a line followed by an optional vertical whitespace character, repeated several times. In your case the string ends with a newline, but I guess you would like to capture the last line even if there is no newline at the end:

my $txt = q:to/EOS/;
    row 1
    row 2
    row 3
    EOS

grammar sample {
    token TOP {[<line>\v?]*}
    token line {\N+}
}

my $match = sample.parse($txt);
for $match<line> -> $l {
    say $l;
}

The output of that script is:

「row 1」
「row 2」
「row 3」

Also, to help you use and debug grammars, there are two really useful modules: Grammar::Tracer and Grammar::Debugger. Just load one of them with use at the beginning of the script. Tracer shows a colorful tree of the matching done by your grammar. Debugger lets you watch it match step by step.
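
A minimal sketch of the tracer in use, assuming Grammar::Tracer is already installed on your system. The grammar name Traced and the two-line input are made up for this example:

```raku
# Assumption: the Grammar::Tracer module is installed.
# It has to be loaded before the grammar is declared.
use Grammar::Tracer;

grammar Traced {
    token TOP  { [ <line> \v? ]* }
    token line { \N+ }
}

# Parsing prints the trace tree as a side effect.
my $m = Traced.parse("row 1\nrow 2\n");
say $m<line>.elems;
```

Swapping the use line for use Grammar::Debugger should give the interactive variant instead.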

Pierre VIGIER answered Nov 11 '22 12:11


Your original approach can be made to work via

grammar sample {
    token TOP { <line>+ %% \n }
    token line { ^^ .*? $$ }
}

Personally, I would not try to anchor line and would use \N instead, as already suggested.
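
Combining the two ideas could look like the sketch below, which keeps the %% separator but drops the anchors. The grammar name Lines and the inline sample string are placeholders for this illustration:

```raku
grammar Lines {
    # %% allows a trailing separator, so the final newline is accepted.
    token TOP  { <line>+ %% \n }
    token line { \N+ }
}

my $m = Lines.parse("row 1\nrow 2\nrow 3\n");

# Turn the captured Match objects into plain strings.
my @rows = $m<line>.map(*.Str);
.say for @rows;
```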

Christoph answered Nov 11 '22 13:11


my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS


grammar sample {
    token TOP {
        <line>+
    }
    token line {
        \N+ \n
    }
}

my $match = sample.parse($txt);

say $match<line>[0];

Or if you can be specific about the line:

grammar sample {
    token TOP {
        <line>+
    }
    rule line {
        \w+ \d
    }
}
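
A short sketch of how the rule version behaves, with the grammar renamed to Rows and an inline sample string, both chosen here for illustration. Because a rule turns the whitespace in its pattern into implicit whitespace matching, each captured line also includes the newline that follows it, so the example trims the captures:

```raku
grammar Rows {
    token TOP  { <line>+ }
    # In a rule, whitespace after an atom also matches whitespace in the input.
    rule  line { \w+ \d }
}

my $m = Rows.parse("row 1\nrow 2\nrow 3\n");

# Strip the trailing newline that the rule's implicit whitespace consumed.
my @rows = $m<line>.map(*.Str.trim);
.say for @rows;
```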
CIAvash answered Nov 11 '22 12:11