I have this ANTLR 4 grammar:
constantFixedExpresion : term (('+'|'-') term)+; term : factor (('*'|'//'|'REM')factor)+; factor : ('+'|'-')* ( wholeNumberConstant | constantFixedExpresion | 'TOFIXED' (stringConstant | bitCodeConstant) | identifier) ('FIT'constantFixedExpresion)*;
I get the following error:
error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]
I tried so many ways but can't fix it. What is the problem and how can I solve it?
A Grammar G (V, T, P, S) is left recursive if it has a production in the form. A → A α |β. Left Recursion can be eliminated by introducing new non-terminal A such that. This type of recursion is also called Immediate Left Recursion.
Left recursion often poses problems for parsers, either because it leads them into infinite recursion (as in the case of most top-down parsers) or because they expect rules in a normal form that forbids it (as in the case of many bottom-up parsers, including the CYK algorithm).
6 Answers. Left recursion is a problem in top-down parsers because top down parsers use left-most derivation to derive the required string by using the start symbol of the grammar. Due to this reason top-down parsers might go into infinite loop with left recursive grammar.
Left recursion elimination is only necessary for LL parsers.
Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.
This is caused by this construction in your grammar:
constantFixedExpression: term ...; term: factor ...; factor: ('+' | '-')* (constantFixedExpression | ...) ...;
Since the *
operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression
, so it needs to try term
, so it needs to try factor
, so it needs to try constantFixedEXpression
, so it [...]" and you've got yourself an infinite loop.
Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:
A -> Aa | b -- becomes -- A -> bR R -> aR | ε
Or in Antlr notation:
A: Aa | b; // becomes A: bR; R: (aR)?;
More information about this process can be found in automaton/grammars books or in the Wikipedia.
I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:
expr: NUMBER |<assoc=right> expr '^' expr | expr '*' expr | expr '/' expr | expr '+' expr | expr '-' expr;
This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3
will be parsed as 2+(2*3)
rather than (2+2)*3
. The <assoc=right>
construction means the operator in right-associative, so 1^2^3
will be parsed as 1^(2^3)
rather than (1^2)^3
.
As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With