Consider the following grammar:
S → A | B
A → xy
B → xyz
This is what I think an LR(0) parser would do given the input xyz
:
| xyz → shift
x | yz → shift
xy | z → reduce
A | z → shift
Az | → fail
If my assumption is correct and we changed rule B
to read:
B → Az
now the grammar suddenly becomes acceptable by an LR(0) parser. I presume this new grammar describes the exact same set of strings than the first grammar in this question.
To further clarify:
I want to describe a language to a parser, without the structure of the grammar playing a role. I'd like to obtain the most minimal/fundamental description of a set of strings. For LR(k) grammars, I'd like to minimize the k.
I think your LR(0)
parser is not a standard parser:
An LR(0) parser is a shift/reduce parser that uses zero tokens of lookahead to determine what action to take (hence the 0). This means that in any configuration of the parser, the parser must have an unambiguous action to choose - either it shifts a specific symbol or applies a specific reduction. If there are ever two or more choices to make, the parser fails and we say that the grammar is not LR(0).
So, When you have:
S->A|B
A->xy
B->Az
or
S->A|B
A->xy
B->xyz
LR(0)
will never check B
rule, And for both of them it will fail.
State0 - Clousure(S->°A):
S->°A
A->°xy
Arcs:
0 --> x --> 2
0 --> A --> 1
-------------------------
State1 - Goto(State0,A):
S->A°
Arcs:
1 --> $ --> Accept
-------------------------
State2 - Goto(State0,x):
A->x°y
Arcs:
2 --> y --> 3
-------------------------
State3 - Goto(State2,y):
A->xy°
Arcs:
-------------------------
But if you have
I->S
S->A|B
A->xy
B->xyz or B->Az
Both of them will accept the xyz
, but in difference states:
State0 - Clousure(I->°S):
I->°S
S->°A
S->°B
A->°xy A->°xy, $z
B->°xyz B->°Az, $
Arcs:
0 --> x --> 4
0 --> S --> 1
0 --> A --> 2
0 --> B --> 3
-------------------------
State1 - Goto(State0,S):
I->S°
Arcs:
1 --> $ --> Accept
-------------------------
State2 - Goto(State0,A):
S->A° S->A°, $
B->A°z, $
Arcs: 2 --> z --> 5
-------------------------
State3 - Goto(State0,B):
S->B°
Arcs:
-------------------------
State4 - Goto(State0,x):
A->x°y A->x°y, $z
B->x°yz
Arcs:
4 --> y --> 5 4 --> y --> 6
-------------------------
State5 - Goto(State4,y): - Goto(State2,z):
A->xy° B->Az°, $
Arcs:
5 --> z --> 6 -<None>-
-------------------------
State6 - Goto(State5,z): - Goto(State4,y)
B->xyz° A->xy°, $z
Arcs:
-------------------------
You can see the Goto Table
and Action Table
is different.
[B->xyz] [B->Az]
| Stack | Input | Action | Stack | Input | Action
--+---------+--------+---------- --+---------+--------+----------
1 | 0 | xyz$ | Shift 1 | 0 | xyz$ | Shift
2 | 0 4 | yz$ | Shift 2 | 0 4 | xy$ | Shift
3 | 0 4 5 | z$ | Shift 3 | 0 4 6 | z$ | Reduce A->xy
4 | 0 4 5 6 | $ | Reduce B->xyz 4 | 0 2 | z$ | Shift
5 | 0 3 | $ | Reduce S->B 5 | 0 2 5 | $ | Reduce B->Az
6 | 0 1 | $ | Accept 6 | 0 3 | $ | Reduce S->B
7 | 0 1 | $ | Accept
Simply when you change B->xyz
to B->Az
you add an Action
to your LR Table
to find the differences you can check Action Table
and Goto Table
(Constructing LR(0) parsing tables)
When you have A->xy
and B->xyz
then you have two bottom handles [xy
or xyz
] but when you have B->Az
you have only one bottom handle [xy
] that can accept an additional z
.
I think related to local optimization -c=a+b; d=a+b -> c=a+b; d=c- when you use B->Az
you make B->xyz
optimized.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With