While finding follow sets, rules such as
A->aA can lead to infinite recursion. Is there any coding technique to avoid it?
Note that the above example is just an example, in practice such a recursion could happen indirectly as well.
Here is my sample C code for finding follow sets. The grammar is stored as an array of linked lists. Please tell me if the code is unclear at any point.
set findFollowSet(char nonTerminal[], Grammar G, hashTable2 h) //later assume that all first sets are already in the hashtable.
{
LINK temp1 = find2(h, nonTerminal);
set s= createEmptySet();
set temp = createEmptySet();
char lhs[80] = "\0";
int i;
//special case
if(temp1->numRightSideOf==0) //its not on right side of any grammar rule
return insert(s, "$");
for(i=0;i<temp1->numRightSideOf;i++)
{
link l = G.rules[temp1->rightSideOf[i]];
strcpy(lhs, l->symbol); //storing the lhs just in case the nonTerm appears on the rightmost end of the rule.
printf("!!!!! %s\n", lhs);
sleep(1);
//finding nonTerminal in G
while(l!=NULL)
{
if(strcmp(l->symbol, nonTerminal) == 0)
break;
l=l->next;
}
//found the nonTerminal in G
if(l->next!=NULL)
{
temp = findFirstSet(l->next, G, h);
temp = removeElement(temp, "EPSILON");
}
else //its on the rightmost end of the rule
temp = findFollowSet(lhs, G, h);
s = setUnion(s, temp); destroySet(temp);
}
return s;
}
FIRST and FOLLOW sets are defined recursively, so you need to find the recursive closure. What this mean in practice is that you don't find the FOLLOW set for a single non-terminal -- you find all the FOLLOW sets for all the terminals simultaneously, by starting with all sets empty and going over the grammar adding symbols to different sets, until no more symbols can be added to any set. So you end up with something like:
FOLLOW[*] = {}; // all follow sets start empty
done = false;
while (!done)
done = true;
for (R : each rule in the grammar)
A = RHS[R];
tmp = FOLLOW[A];
for (S : each symbol in LHS[R] from right to left)
if (S is terminal)
tmp = {S};
else
if (!(FOLLOW[S] contains tmp))
done = false
FOLLOW[S] |= tmp
if (epsilon in FIRST[S])
tmp |= FIRST[S] - epsilon
else
tmp = FIRST[S]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With