Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to find FIRST and FOLLOW sets of a recursive grammar?

Suppose I have the following CFG.

A -> B | Cx | EPSILON
B -> C | yA
C -> B | w | z

Now if I try to find

FIRST(C) = FIRST(B) U FIRST(w) U FIRST(z)
         = FIRST(C) U FIRST(yA) U {w, z}

That is, I'm going in a loop. Thus I assume I have to convert it into a form which has immediate left recursion, which I can do as follows.

A -> B | Cx | EPSILON
B -> C | yA
C -> C | yA | w | z

Now if I try to calculate FIRST sets, I think I can get it done as follows.

FIRST(C) = FIRST(C) U FIRST(yA) U FIRST(w) U FIRST(z)
         = { y, w, z } // I ignore FIRST(C)
FIRST(B) = FIRST(C) U FIRST(yA)
         = { y, w, z }
FIRST(A) = FIRST(B) U FIRST(Cx) U FIRST(EPSILON)
         = { y, w, z, EPSILON }

Am I correct there?

But even if I'm right there, I still run into a problem when I try to calculate FOLLOW sets from this grammar.

FOLLOW(A) = { $ } U FOLLOW(B) U FOLLOW(C)

I get FOLLOW(B) from 2nd rule and FOLLOW(C) from 3rd rule. But now to calculate FOLLOW(B), I need FOLLOW(A) (from 1st grammar rule) so again I'm stuck in a loop.

Any help? Thanks in advance!

like image 304
Sach Avatar asked Mar 22 '15 17:03

Sach


People also ask

How do you calculate first and follow sets?

For any production rule A → αBβ, If ∈ ∉ First(β), then Follow(B) = First(β) If ∈ ∈ First(β), then Follow(B) = { First(β) – ∈ } ∪ Follow(A)

How do you find first and follow in compiler?

A symbol c is in FIRST (α) if and only if α ⇒ cβ for some sequence β of grammar symbols. A terminal symbol a is in FOLLOW (N) if and only if there is a derivation from the start symbol S of the grammar such that S ⇒ αNαβ, where α and β are a (possible empty) sequence of grammar symbols.

How do you find first set?

Rules to compute FIRST set:FIRST(X) = FIRST(Y1) If FIRST(Y1) contains Є then FIRST(X) = { FIRST(Y1) – Є } U { FIRST(Y2) } If FIRST (Yi) contains Є for all i = 1 to n, then add Є to FIRST(X).

Why do we calculate first and follow?

The conclusions is, we need to find FIRST and FOLLOW sets for a given grammar so that the parser can properly apply the needed rule at the correct position.


Video Answer


1 Answers

Since FIRST and FOLLOW are (normally) recursive, it's useful to think of them as systems of equations to be solved; the solution can be achieved using a simple incremental algorithm consisting of repeatedly applying all the right hand sides until no set has changed during a cycle.

So let's take the FOLLOW relation for the given grammar:

A → B | Cx | ε
B → C | yA
C → B | w | z

We can directly derive the equations:

FOLLOW(A) = FOLLOW(B) ∪ {$}
FOLLOW(B) = FOLLOW(A) ∪ FOLLOW(C)
FOLLOW(C) = FOLLOW(B) ∪ {x}

So we initially set all the follow sets to {}, and proceed.

First round:

FOLLOW(A) = {} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {} = {$}
FOLLOW(C) = {$} U {x} = {$,x}

Second round:

FOLLOW(A) = {$} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}

Third round:

FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}

Fourth round:

FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$,x} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}

Here we stop because no changes were made in the last round.

This algorithm must terminate because there are a finite number of symbols, and each round can only add symbols to steps. It is not the most efficient technique, although it is generally good enough in practice.

like image 142
rici Avatar answered Sep 28 '22 16:09

rici