Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ANTLR4 mutually left-recursive error when parsing

Tags:

I have this ANTLR 4 grammar:

constantFixedExpresion : term (('+'|'-') term)+;  term : factor (('*'|'//'|'REM')factor)+;  factor : ('+'|'-')*            ( wholeNumberConstant            | constantFixedExpresion            | 'TOFIXED' (stringConstant | bitCodeConstant)                  | identifier)          ('FIT'constantFixedExpresion)*; 

I get the following error:

error(119): LanguageA.g4::: The following sets of rules are mutually left-recursive [constantFixedExpresion, factor, term]

I tried so many ways but can't fix it. What is the problem and how can I solve it?

like image 415
Alican Avatar asked Oct 20 '14 06:10

Alican


People also ask

How do you fix left recursion?

A Grammar G (V, T, P, S) is left recursive if it has a production in the form. A → A α |β. Left Recursion can be eliminated by introducing new non-terminal A such that. This type of recursion is also called Immediate Left Recursion.

Is left recursion a problem for LL parsers?

Left recursion often poses problems for parsers, either because it leads them into infinite recursion (as in the case of most top-down parsers) or because they expect rules in a normal form that forbids it (as in the case of many bottom-up parsers, including the CYK algorithm).

Which parser has problem of left recursion?

6 Answers. Left recursion is a problem in top-down parsers because top down parsers use left-most derivation to derive the required string by using the start symbol of the grammar. Due to this reason top-down parsers might go into infinite loop with left recursive grammar.

Do we need to remove left recursion for SLR parser?

Left recursion elimination is only necessary for LL parsers.


1 Answers

Antlr is a LL(*) parser, which is in many ways "better" than a LL(k) parser, but still has many of its disavantages. One of these being the fact it can't deal with left-recursion (in fact, the version 4 can deal with left-recursion within the same rule). What the error is saying is that you have a left-recursion of a grammar, a bane for the LL parsers.

This is caused by this construction in your grammar:

constantFixedExpression: term ...; term: factor ...; factor: ('+' | '-')* (constantFixedExpression | ...) ...; 

Since the * operator means 0 or more, I can instantiate it with 0, so the parser will do this: "try constantFixedExpression, so it needs to try term, so it needs to try factor, so it needs to try constantFixedEXpression, so it [...]" and you've got yourself an infinite loop.


Fortunately, context-free formal grammars have an equivalent transformation for removing left-recursion! It can be expressed generically by:

A -> Aa | b -- becomes -- A -> bR R -> aR | ε 

Or in Antlr notation:

A: Aa | b; // becomes A: bR; R: (aR)?; 

More information about this process can be found in automaton/grammars books or in the Wikipedia.


I'll leave correcting your grammar with the refactoration to remove left-recursion as your work. However, I want to touch in another point: Antlr 4 can do left-recursion! As I mentioned, the version 4 can deal with left-recursion within the same rule. There are ways to indicate precedence and associativity of operators other than directly in parsing, as you're doing, in Antlr4. Let's see how it works:

expr: NUMBER       |<assoc=right> expr '^' expr       | expr '*' expr       | expr '/' expr       | expr '+' expr       | expr '-' expr; 

This is an example of a basic calculator grammar. The operators at the top are those with highest precedence, and those at the bottom are of lower precedence. This means 2+2*3 will be parsed as 2+(2*3) rather than (2+2)*3. The <assoc=right> construction means the operator in right-associative, so 1^2^3 will be parsed as 1^(2^3) rather than (1^2)^3.

As you can see, it is much easier to specify operators with left-recursion, so Antlr 4 is of big help in these moments! I recommend re-writing your grammar to make use of this feature.

like image 85
Mephy Avatar answered Oct 05 '22 03:10

Mephy