Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I handle negative numbers in a PEG grammar?

I'm trying to write a simple int expression parser using tatsu, a PEG-based Python parser generator. Here is my code:

import tatsu

grammar = r'''
    start = expression $ ;
    expression = add | sub | term ;
    add = expression '+' term ;
    sub = expression '-' term ;
    term = mul | div | number ;
    mul = term '*' number ;
    div = term '/' number ;
    number = [ '-' ] /\d+/ ;
'''
parser = tatsu.compile(grammar)
print(parser.parse('2-1'))

The output of this program is ['-', '1'] instead of the expected ['2', '-', '1'].

I get the correct output if I either:

  • Remove support for unary minus, i.e. change the last rule to number = /\d+/ ;
  • Remove the term, mul and div rules, and support only addition and subtraction
  • Replace the second rule with expresssion = add | sub | mul | div | number ;

The last option actually works without leaving any feature out, but I don't understand why it works. What is going on?

EDIT: If I just flip the add/sub/mul/div rules to get rid of left recursion, it also works. But then evaluating the expressions becomes a problem, since the parse tree is flipped. (3-2-1 becomes 3-(2-1))

like image 262
itsadok Avatar asked Nov 19 '22 11:11

itsadok


1 Answers

There are left recursion cases that TatSu doesn't handle, and work on fixing that is currently on hold.

You can use left/right join/gather operators to control the associativity of parsed expressions in a non-left-recursive grammar.

like image 170
Apalala Avatar answered Dec 22 '22 13:12

Apalala