As is explained in Removing left recursion , there are two ways to remove the left recursion.
What people normally use for removing (not having) the left recursion with ANTLR? I've used flex/bison for parser, but I need to use ANTLR. The only thing I'm concerned about using ANTLR (or LL parser in genearal) is left recursion removal.
I did some experiment with the following grammar.
E -> E + T|T T -> T * F|F F -> INT | ( E )
After left recursion removal, I get the following one
E -> TE' E' -> null | + TE' T -> FT' T' -> null | * FT'
I could come up with the following ANTLR representation. Even though, It's relatively pretty simple and straightforward, it seems the grammar that doesn't have the left recursion should be the better way to go.
grammar T; options { language=Python; } start returns [value] : e {$value = $e.value}; e returns [value] : t ep { $value = $t.value if $ep.value != None: $value += $ep.value } ; ep returns [value] : {$value = None} | '+' t r = ep { $value = $t.value if $r.value != None: $value += $r.value } ; t returns [value] : f tp { $value = $f.value if $tp.value != None: $value *= $tp.value } ; tp returns [value] : {$value = None} | '*' f r = tp { $value = $f.value; if $r.value != None: $value *= $r.value } ; f returns [int value] : INT {$value = int($INT.text)} | '(' e ')' {$value = $e.value} ; INT : '0'..'9'+ ; WS: (' '|'\n'|'\r')+ {$channel=HIDDEN;} ;
This is only orthogonally relevant, but I just published a preprint of a paper on a new parsing method that I call "pika parsing" (c.f. packrat parsing) that directly handles left recursive grammars without the need for rule rewriting.
https://arxiv.org/abs/2005.06444
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With