Suppose I have the following CFG.
A -> B | Cx | EPSILON
B -> C | yA
C -> B | w | z
Now if I try to find
FIRST(C) = FIRST(B) U FIRST(w) U FIRST(z)
= FIRST(C) U FIRST(yA) U {w, z}
That is, I'm going in a loop. Thus I assume I have to convert it into a form which has immediate left recursion, which I can do as follows.
A -> B | Cx | EPSILON
B -> C | yA
C -> C | yA | w | z
Now if I try to calculate FIRST sets, I think I can get it done as follows.
FIRST(C) = FIRST(C) U FIRST(yA) U FIRST(w) U FIRST(z)
= { y, w, z } // I ignore FIRST(C)
FIRST(B) = FIRST(C) U FIRST(yA)
= { y, w, z }
FIRST(A) = FIRST(B) U FIRST(Cx) U FIRST(EPSILON)
= { y, w, z, EPSILON }
Am I correct there?
But even if I'm right there, I still run into a problem when I try to calculate FOLLOW sets from this grammar.
FOLLOW(A) = { $ } U FOLLOW(B) U FOLLOW(C)
I get FOLLOW(B) from 2nd rule and FOLLOW(C) from 3rd rule. But now to calculate FOLLOW(B), I need FOLLOW(A) (from 1st grammar rule) so again I'm stuck in a loop.
Any help? Thanks in advance!
For any production rule A → αBβ, If ∈ ∉ First(β), then Follow(B) = First(β) If ∈ ∈ First(β), then Follow(B) = { First(β) – ∈ } ∪ Follow(A)
A symbol c is in FIRST (α) if and only if α ⇒ cβ for some sequence β of grammar symbols. A terminal symbol a is in FOLLOW (N) if and only if there is a derivation from the start symbol S of the grammar such that S ⇒ αNαβ, where α and β are a (possible empty) sequence of grammar symbols.
Rules to compute FIRST set:FIRST(X) = FIRST(Y1) If FIRST(Y1) contains Є then FIRST(X) = { FIRST(Y1) – Є } U { FIRST(Y2) } If FIRST (Yi) contains Є for all i = 1 to n, then add Є to FIRST(X).
The conclusions is, we need to find FIRST and FOLLOW sets for a given grammar so that the parser can properly apply the needed rule at the correct position.
Since FIRST and FOLLOW are (normally) recursive, it's useful to think of them as systems of equations to be solved; the solution can be achieved using a simple incremental algorithm consisting of repeatedly applying all the right hand sides until no set has changed during a cycle.
So let's take the FOLLOW relation for the given grammar:
A → B | Cx | ε
B → C | yA
C → B | w | z
We can directly derive the equations:
FOLLOW(A) = FOLLOW(B) ∪ {$}
FOLLOW(B) = FOLLOW(A) ∪ FOLLOW(C)
FOLLOW(C) = FOLLOW(B) ∪ {x}
So we initially set all the follow sets to {}, and proceed.
First round:
FOLLOW(A) = {} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {} = {$}
FOLLOW(C) = {$} U {x} = {$,x}
Second round:
FOLLOW(A) = {$} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}
Third round:
FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}
Fourth round:
FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$,x} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} U {x} = {$,x}
Here we stop because no changes were made in the last round.
This algorithm must terminate because there are a finite number of symbols, and each round can only add symbols to steps. It is not the most efficient technique, although it is generally good enough in practice.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With