LALR Parser Generator Implementation Problem

Question

I'm currently trying to implement a LALR parser generator as described in "compilers principles techniques and tools" (also called "dragon book").

A lot already works. The parser generator is currently able to generate the full goto-graph.

Example Grammar:
                   S' --> S
                   S  --> C C
                   C  --> c C
                   C  --> d

Nonterminals: S', S, C
Terminals: c, d
Start: S'

The goto-graph:

I[0]---------------+      I[1]-------------+
| S' --> . S   , $ |--S-->| S' --> S . , $ |
| S  --> . C C , $ |      +----------------+
| C  --> . c C , c |
| C  --> . c C , d |      I[2]--------------+
| C  --> . d   , c |      | S --> C . C , $ |      I[3]--------------+
| C  --> . d   , d |--C-->| C --> . c C , $ |--C-->| S --> C C . , $ |
+------------------+      | C --> . d   , $ |      +-----------------+
   |  |                   +-----------------+
   |  |           +--c--+   |      |
   |  |           |     |   c      |
   |  |           |     v   v      |
   |  |     I[4]--------------+    |
   |  c     | C --> c . C , c |    |
   |  |     | C --> c . C , d |    |
   |  |     | C --> c . C , $ |    d
   |  |     | C --> . c C , c |    |
   |  +---->| C --> . c C , d |    |
   |        | C --> . c C , $ |    |
   d        | C --> . d   , c |--+ |
   |  +-----| C --> . d   , d |  | |
   |  |     | C --> . d   , $ |  | |
   |  |     +-----------------+  | |
   |  C                          | |
   |  |     I[6]--------------+  | |
   |  |     | C --> c C . , c |  d |
   |  +---->| C --> c C . , d |  | |
   |        | C --> c C . , $ |  | |
   |        +-----------------+  | |
   |                             | |
   |        I[5]------------+    | |
   |        | C --> d . , c |<---+ |
   +------->| C --> d . , d |      |
            | C --> d . , $ |<-----+
            +---------------+

I have trubbles with implementing the algorithm to generate the actions-table! My algorithm computes the following output:

state |    action      
      |  c  |  d  |  $   
------------------------
    0 |  s4 |  s5 |
------------------------
    1 |     |     | acc
------------------------
    2 |  s4 |  s5 |
------------------------
    3 |     |     |  r?
------------------------
    4 |  s4 |  s5 |
------------------------
    5 |  r? |  r? |  r?
------------------------
    6 |  r? |  r? |  r?

sx... shift to state x
rx... reduce to state x

The r? means that I don't know how to get the state (the ?) to which the parser should reduce. Does anyone know an algorithm to get ? using the goto-graph above?

If anything is describe no clearly enough, please ask and I will try to explain it better! Thanks for your help!

Gunther · Accepted Answer

A shift entry is attributed by the next state, but a reduce entry indicates a production.

When you shift, you push a state reference onto your stack and proceed to the next state.

When you reduce, this is for a specific production. The production was responsible for shifting n states onto your stack, where n is the number of symbols in that production. E.g. one for S', two for S, and two or one for C (i.e. for the first or second alternative for C).

After n entries are popped off the stack, you return to the state where you started processing that production. For that state and the nonterminal resulting from the production, you lookup the goto table to find the next state, which is then pushed.

So the reduce entries identify a production. In fact it may be sufficient to know the resulting nonterminal, and the number of symbols to pop.

Your table thus should read

state |    action       |  goto
      |  c  |  d  |  $  |  C  |  S   
------------------------------------
    0 |  s4 |  s5 |     |  2  |  1
------------------------------------
    1 |     |     | acc |     |
------------------------------------
    2 |  s4 |  s5 |     |  3  |
------------------------------------
    3 |     |     |  r0 |     |
------------------------------------
    4 |  s4 |  s5 |     |     |  6
------------------------------------
    5 |  r3 |  r3 |  r3 |     |
------------------------------------
    6 |  r2 |  r2 |  r2 |     |

where rx indicates reduce by production x.

Ken Bloom · Answer

You need to pop the stack and and find the next state from there.

LALR Parser Generator Implementation Problem

Tags:

parsing

context-free-grammar

lalr

parser-generator

raisyn

2 Answers

Gunther

Ken Bloom

Recent Activity

Donate For Us

LALR Parser Generator Implementation Problem

Tags:

parsing

context-free-grammar

lalr

parser-generator

raisyn

2 Answers

Gunther

Ken Bloom

Related questions

Recent Activity

Donate For Us