I'm using Jison to write a parser. This is my grammar:
{
"program": [
["statements EOF", "return $1;"]
],
"statements": [
["statement", "$$ = $1;"],
["statements statement", "$$ = $1 + '\\n' + $2;"]
],
"statement": [
["expression NEWLINE", "$$ = $1 + ';';"]
],
"expression": [
["NUMBER", "$$ = yytext;"],
["expression expression", "$$ = $1 + ', ' + $2;"]
]
}
When I run it however I get the following error message:
Conflict in grammar: multiple actions possible when lookahead token is NUMBER in
state 9
- reduce by rule: expression -> expression expression
- shift token (then go to state 5)
States with conflicts:
State 9
expression -> expression expression . #lookaheads= NEWLINE NUMBER
expression -> expression .expression
expression -> .NUMBER
expression -> .expression expression
What am I supposed to make of this debug message? How would you explain this message in simple English? What does the period in expression -> expression expression .
mean? What are .expression
and .NUMBER
? How are they different from expression
and NUMBER
respectively?
What am I supposed to make of this debug message?
A grammar conflict means the parser can reach a state where it could follow multiple rules, but it doesn't have enough information to determine which one to follow (or worse, the grammar is ambiguous). You'll have to tweak the grammar to eliminate the conflicts. Often, this just takes practice to get right.
What does the period in expression -> expression expression . mean?
The period represents the position of the parser. So, in that rule, the parser would have just parsed two expressions, and is now in state 9. When the period is at the end of the rule, that means the rule can be "reduced", and grouped into a single expression
non-terminal in this case. However, it can only reduce if the next token (the lookahead) is a NEWLINE
or a NUMBER
.
In expression -> .NUMBER
, the parser has just encountered a NUMBER token, which it can "shift", then move to a new state.
The conflict occurs because the parser can reduce or shift when it encounters a NUMBER
token.
Edit: To resolve your conflict, we need to split that expression rule into distinct non-terminals. Having the same non-terminal in sequence is bound to produce conflicts.
e.g.
{
"program": [
["statements EOF", "return $1;"]
],
"statements": [
["statement", "$$ = $1;"],
["statements statement", "$$ = $1 + '\\n' + $2;"]
],
"statement": [
["expression NEWLINE", "$$ = $1 + ';';"]
],
"expression": [
["expression expression_base", "$$ = $1 + ', ' + $2;"],
["expression_base", "$$ = $1;"]
],
"expression_base": [
["NUMBER", "$$ = yytext;"]
]
}
Here's a nice resource for more background on these types of grammars.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With