
How to find FIRST and FOLLOW sets of a recursive grammar?

Tags:

compiler-construction

context-free-grammar

Suppose I have the following CFG.

A -> B | Cx | EPSILON
B -> C | yA
C -> B | w | z

Now if I try to find

FIRST(C) = FIRST(B) U FIRST(w) U FIRST(z)
         = FIRST(C) U FIRST(yA) U {w, z}

That is, I'm going in a loop. Thus I assume I have to convert it into a form which has immediate left recursion, which I can do as follows.

A -> B | Cx | EPSILON
B -> C | yA
C -> C | yA | w | z

Now if I try to calculate FIRST sets, I think I can get it done as follows.

FIRST(C) = FIRST(C) U FIRST(yA) U FIRST(w) U FIRST(z)
         = { y, w, z } // I ignore FIRST(C)
FIRST(B) = FIRST(C) U FIRST(yA)
         = { y, w, z }
FIRST(A) = FIRST(B) U FIRST(Cx) U FIRST(EPSILON)
         = { y, w, z, EPSILON }

Am I correct there?

But even if I'm right there, I still run into a problem when I try to calculate FOLLOW sets from this grammar.

FOLLOW(A) = { $ } U FOLLOW(B) U FOLLOW(C)

I get FOLLOW(B) from the 2nd rule and FOLLOW(C) from the 3rd rule. But to calculate FOLLOW(B), I need FOLLOW(A) (from the 1st grammar rule), so again I'm stuck in a loop.

Any help? Thanks in advance!


asked Mar 22 '15 17:03

Sach


1 Answer

Since FIRST and FOLLOW are (normally) recursive, it's useful to think of them as systems of equations to be solved. The solution can be found with a simple incremental algorithm: repeatedly apply all the right-hand sides until a complete round leaves every set unchanged.
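As a minimal sketch of that idea, here is the same repeat-until-stable loop applied to the FIRST sets of the question's grammar. The names `GRAMMAR`, `first_of_sequence` and `compute_first` are made up for illustration, and the dictionary encoding of the grammar (an empty tuple for the ε alternative) is an assumption, not something taken from the question.

```python
EPSILON = "ε"

# Grammar from the question. Each alternative is a tuple of symbols;
# the empty tuple stands for the ε alternative (an assumed encoding).
GRAMMAR = {
    "A": [("B",), ("C", "x"), ()],
    "B": [("C",), ("y", "A")],
    "C": [("B",), ("w",), ("z",)],
}


def first_of_sequence(symbols, first):
    """FIRST of a symbol sequence, using the current FIRST estimates."""
    result = set()
    for sym in symbols:
        if sym not in GRAMMAR:          # terminal: it starts the sequence
            result.add(sym)
            return result
        result |= first[sym] - {EPSILON}
        if EPSILON not in first[sym]:   # sym cannot vanish, so stop here
            return result
    result.add(EPSILON)                 # every symbol (or none) can vanish
    return result


def compute_first():
    """Iterate over all productions until no FIRST set grows any more."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in GRAMMAR.items():
            for alt in alternatives:
                new = first_of_sequence(alt, first)
                if not new <= first[nt]:    # something new to add
                    first[nt] |= new
                    changed = True
    return first


if __name__ == "__main__":
    for nt, symbols in compute_first().items():
        print(f"FIRST({nt}) = {sorted(symbols)}")
```

With this encoding, `compute_first()` stabilises at FIRST(A) = {y, w, z, ε} and FIRST(B) = FIRST(C) = {y, w, z}, which agrees with the sets in the question. The grammar does not need to be rewritten first.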

So let's take the FOLLOW relation for the given grammar:

A → B | Cx | ε
B → C | yA
C → B | w | z

We can directly derive the equations:

FOLLOW(A) = FOLLOW(B) ∪ {$}
FOLLOW(B) = FOLLOW(A) ∪ FOLLOW(C)
FOLLOW(C) = FOLLOW(B) ∪ {x}

So we initially set all the FOLLOW sets to {} and proceed, evaluating the equations in order and using each updated set as soon as it is available.

First round:

FOLLOW(A) = {} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {} = {$}
FOLLOW(C) = {$} ∪ {x} = {$,x}

Second round:

FOLLOW(A) = {$} ∪ {$} = {$}
FOLLOW(B) = {$} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} ∪ {x} = {$,x}

Third round:

FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$,x} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} ∪ {x} = {$,x}

Fourth round:

FOLLOW(A) = {$,x} ∪ {$} = {$,x}
FOLLOW(B) = {$,x} ∪ {$,x} = {$,x}
FOLLOW(C) = {$,x} ∪ {x} = {$,x}

Here we stop because no changes were made in the last round.

This algorithm must terminate because there are a finite number of symbols, and each round can only add symbols to the sets, never remove them. It is not the most efficient technique, although it is generally good enough in practice.
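The rounds above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three equations are hard-coded as lambdas rather than derived from the grammar, and `solve_follow` is a hypothetical name. Like the hand computation, it evaluates the equations in the order A, B, C and uses each updated set immediately.

```python
def solve_follow():
    """Iterate the three FOLLOW equations until a round changes nothing."""
    follow = {"A": set(), "B": set(), "C": set()}

    # Right-hand sides of the equations, written by hand (assumption:
    # they are not derived automatically from the grammar here).
    equations = {
        "A": lambda f: f["B"] | {"$"},
        "B": lambda f: f["A"] | f["C"],
        "C": lambda f: f["B"] | {"x"},
    }

    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        for nt, rhs in equations.items():
            new = rhs(follow)           # uses sets already updated this round
            if new != follow[nt]:
                follow[nt] = new
                changed = True
    return follow, rounds


if __name__ == "__main__":
    follow, rounds = solve_follow()
    for nt, symbols in follow.items():
        print(f"FOLLOW({nt}) = {sorted(symbols)}")
    print("rounds:", rounds)
```

The loop finishes after four rounds, the last of which changes nothing, with every FOLLOW set equal to {$, x}. A different evaluation order could take a different number of rounds but would reach the same sets.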


answered Sep 28 '22 16:09

rici
