
How to cache and use the cached regexes in perl6 grammar?

Tags: regex, raku

My code spends a lot of time on regex interpolation. As the patterns rarely change, I guess caching the generated regexes should speed up the code. But I cannot figure out the right way to cache the regexes and then use the cached versions.

The code parses arithmetic expressions. As users are allowed to define new operators, the parser must be ready to add new operators to the grammar. So the parser uses a table to record these new operators and generates regexes from the table on the fly.

#! /usr/bin/env perl6

use v6.c;

# the parser may add new operators to this table on the fly.
my %operator-table = %(
    1 => $['"+"', '"-"'],
    2 => $['"*"', '"/"'],
    # ...
);

# original code, runnable but slow.
grammar Operator {
    token operator(Int $level) {
        <{%operator-table{$level}.join('|')}>
    }

    # ...
}

# usage:
say Operator.parse(
    '+',
    rule => 'operator',
    args => \(1)
);
# output:
# 「+」

Here are some experiments:

# attempt to cache the generated regexes; this does not work.
grammar CachedOperator {
    my %cache-table = %();

    method operator(Int $level) {
        if (! %cache-table{$level}) {
            %cache-table.append(
                $level => rx { <{%operator-table{$level}.join('|')}> }
            )
        }

        %cache-table{$level}
    }
}

# test:
say CachedOperator.parse(
    '+',
    rule => 'operator',
    args => \(1)
);
# output:
# Nil

# one more try
grammar CachedOperator_ {
    my %cache-table = %();

    token operator(Int $level) {
        <create-operator($level)>
    }

    method create-operator(Int $level) {
        if (! %cache-table{$level}) {
            %cache-table.append(
                $level => rx { <{%operator-table{$level}.join('|')}> }
            )
        }

        %cache-table{$level}    
    }
}

# test:
say CachedOperator_.parse(
    '+',
    rule => 'operator',
    args => \(1)
);
# compile error:
# P6opaque: no such attribute '$!pos' on type Match in a Regex when trying to get a value
Asked by lovetomato on Jan 19 '19 at 12:01


1 Answer

The following doesn't directly answer your question but may be of interest.

User defined operators

The following code declares an operator in P6:

sub prefix:<op> ($operand) { " $operand prefixed by op" }

Now one can use the new operator:

say op 42; # 42 prefixed by op

A wide range of operator positions and arities are covered, including choice of associativity and precedence, parentheses for grouping, etc. So maybe this is an appropriate way to implement what you're implementing.
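As a rough sketch of what that can look like, the snippet below declares a few more operators with explicit precedence and associativity. The operator names (plus, times, pow, !) are hypothetical and chosen only for illustration; the traits used (is equiv, is tighter, is assoc) are the standard ones for routines, but treat this as a minimal example rather than a recommended design for your parser.

```raku
# Hypothetical operators, invented for this illustration only.

# Infix operator with the same precedence as the built-in +.
sub infix:<plus> ($a, $b) is equiv(&infix:<+>) { $a + $b }

# Binds tighter than `plus`, so it is applied first.
sub infix:<times> ($a, $b) is tighter(&infix:<plus>) { $a * $b }

# Right-associative infix operator (default precedence).
sub infix:<pow> ($a, $b) is assoc<right> { $a ** $b }

# Postfix operator taking a single operand.
sub postfix:<!> (Int $n) { [*] 1..$n }

say 1 plus 2 times 3;   # 7, parsed as 1 plus (2 times 3)
say 2 pow 3 pow 2;      # 512, parsed as 2 pow (3 pow 2)
say 5!;                 # 120
```

Because these declarations extend the language's own grammar at compile time, there is no operator table to maintain and no regex to regenerate by hand.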

Although it's slow, it might be fast enough. Additionally, as Larry Wall said in 2017 ...

we know some places in the parser that are slower than they should be, for instance ... various lexers relook at various characters in your Perl 6 program, it averages 5 or 6 times on every character, which is obviously deeply sub-optimal, and we know how to fix it

... and with luck Jonathan will work on the P6 grammar parser this year.

DSLs and Slangs

Even if you aren't interested in using the main language's ability to declare user defined operators, or can't for some reason, the underlying mechanisms that make it work might be of interest/use. Here are some references:

  • Brian Duggan's Informal DSLs presentation (video, slides).

  • Mouq's 2014 gist Slangs.

  • Larry Wall's speculation from way back when in Switching parsers and Slangs.

Answered by raiph on Nov 20 '22 at 12:11