I'm writing a parser in Parsec. Left-recursive productions like E -> E + E cannot be written easily in an LL parser, so Parsec provides buildExpressionParser
, which supports infix, postfix and prefix operators. But what about subscript operators?
How would E -> E [E] be implemented?
If I could consume the closing bracket without consuming the second expression, then I could emulate it with an Infix table entry for buildExpressionParser
. Thoughts?
Edit: I know there is an algorithm for left-recursion elimination that will most likely work for my grammar. I'm looking for something easy, or well-abstracted (such as buildExpressionParser
). Otherwise I'll just use Happy.
In our project we manually eliminated left recursion from the expressions. This works the following way:
We created a generic form of expressions, that is parameterized by the term parser:
expressionGen :: MParser (Expr LocInfo) -> MParser (Expr LocInfo)
expressionGen term = buildExpressionParser precedenceTable term <?> "expression"
where precedenceTable = -- unary, binary expressions
We have a normal term parser, and a term parser free of recursive rules. With that, we can parse (multiple) subscript operators:
term :: MParser (Expr LocInfo)
term = do indBase <- termNoArray
indexes <- many $ (brackets expression >>= return)
return -- semantics
<?> "term"
termNoArray :: MParser (Expr LocInfo)
termNoArray = -- normal terms
Finally, we have a topmost parser for expressions: expression = expressionGen term
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With