Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is 'memoize' in PEG parsers (e.g. Pegasus) and when should it be used?

Tags:

peg

Here is an example from Pegasus:

additive <double> -memoize
= left:additive "+" right:multiplicative { left + right }
/ left:additive "-" right:multiplicative { left - right }
/ multiplicative

What is memoize in this context and when should I use it?

I understand the general concept (cache output for given inputs) — but what is "input" when we are talking about a PEG parser?

like image 855
Andrey Shchekin Avatar asked Sep 19 '25 20:09

Andrey Shchekin


1 Answers

I'm the author of Pegasus.

Pegasus will use the current location of the cursor along with the current version of the internal state as a key for caching the result of a specific rule, if that rule is set to memoize.

You should do this if the rule is likely to be called in the same state more than once.

For example, if you have this style of parser:

a = b "foo"
  / b "bar"
  / b "baz"

b = /* something expensive */

It would be worthwhile to memoize the b rule, since it is used as a prefix to several expressions.

Of course, this is optional, because in many cases this can be optimized in a better way:

a = b ("foo" / "bar" / "baz")

If you mark every rule with -memoize, this is basically the same thing as Packrat parsing. Pegasus allows you to control this selectively, since there is a sizeable overhead.

like image 95
John Gietzen Avatar answered Sep 23 '25 06:09

John Gietzen