Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Explanation about viable prefix

In Ullman's book of compilers, in shift reduce parsing, following definition of viable prefix is given :

"The set of prefixes of right sentential forms that can appear on the stack of a shift-reduce parser are called viable prefixes. An equivalent definition of a viable prefix is that it is a prefix of right sentential form that does not continue past the right end of the rightmost handle of that sentential form. By this definition, it is always possible to add terminal symbols to the end of a viable prefix to obtain a right sentential form. Therefore, there is apparently no error as long as the portion of the input seen to a given point can be reduced to a viable prefix."

I can't understand this definition. Could someone explain the meaning of viable prefix with an example ?
In particular, please explain the meaning of
"An equivalent definition of a viable prefix is that it is a prefix of right sentential form that does not continue past the right end of the rightmost handle of that sentential form"

like image 984
Happy Mittal Avatar asked Nov 17 '10 07:11

Happy Mittal


People also ask

What is bottom up parsing with example?

Bottom-up parsing can be defined as an attempt to reduce the input string w to the start symbol of grammar by tracing out the rightmost derivations of w in reverse. Eg. A general shift reduce parsing is LR parsing.

What is common prefix problem in compiler design?

Left FactoringIf more than one grammar production rules has a common prefix string, then the top-down parser cannot make a choice as to which of the production it should take to parse the string in hand.

What is a handle in LR parsing?

A handle of a string is a substring that matches the right side of a production, and whose reduction to the nonterminal on the left side of the production represents one step along the reverse of a rightmost derivation.

What is handle with example in compiler design?

A handle is a substring that connects a right-hand side of the production rule in the grammar and whose reduction to the non-terminal on the left-hand side of that grammar rule is a step along with the reverse of a rightmost derivation. It can scan the input string from left to right until first .> is encountered.


2 Answers

EDIT (or actually rewrite): The sentence you asked for a clarification of is a major hairball! I needed a bit of a refresher on language and automata myself to pick that hairball apart. I found these lecture notes very useful in that regard.

It also doesn't make it any easier that the terms are defined in terms of top-down expansion, while rightmost derivation is typically used with bottom-up parsing!

I'll use the following expression grammar to illustrate:

     expr -> expr + term | term     term -> term * factor | factor     factor -> NUMBER | ( expr ) 
  • A right-sentential form is a sentential form which can be reached by rightmost derivation, which is another way to describe repeated expansion of only the rightmost nonterminal symbol when proceeding top-down. This is a rightmost derivation, and all the forms in it are therefore right-sentential forms:
     expr -> expr + term          -> expr + term * factor          -> expr + term * NUMBER          -> expr + factor * NUMBER          -> expr + NUMBER * NUMBER          -> expr + term + NUMBER * NUMBER          -> expr + NUMBER + NUMBER * NUMBER          -> term + NUMBER + NUMBER * NUMBER          -> NUMBER + NUMBER + NUMBER * NUMBER 
  • A prefix of a sentential form (whether right or otherwise) is a sequence of input symbols that reduces to zero or more leading symbols of that sentential form. The empty sequence is trivially a prefix of every sentential form, and the complete sequence of symbols making up a sentential form is also trivially a prefix of it.

  • A simple phrase is the expansion of a single nonterminal symbol that holds a place in a sentential form. For example, term * factor is a simple phrase because it is an expansion of term, and term itself occurs in three productions.

  • The handle of a sentential form is the leftmost simple phrase within that form. (I admit, I find the term 'handle' somewhat confusing here.) In a rightmost derivation, the handle is easy to identify -- it's the sequence of symbols that resulted from the most recently expanded nonterminal. If you're working bottom-up the way a shift-reduce parser does, the handle is the simple phrase that needs to be reduced next. (Read the derivation table above from the bottom up, looking at which symbols were reduced, to see what I mean.)

  • A viable prefix of a right-sentential form is a prefix which does not extend beyond that form's handle -- in other words, that prefix is valid and contains no reducible simple phrases, with the possible exception of the handle itself if said prefix extends to exactly the end of the handle.

From a shift-reduce parser's point of view, as long as you have a viable prefix on the stack, you have not yet been forced to either reduce the (possibly incomplete) simple phrase on top of the stack to a new nonterminal or fail out of parsing if it can't be reduced. If shifting the next symbol would result in something other than a viable prefix, you must at that point either reduce or fail.

If you're parsing a context-free language, there is a rather convenient property that helps with the building of a table-driven shift-reduce parser: the set of all viable prefixes of a context-free language is itself a regular language! You can therefore build a finite automaton that recognizes the regular language of viable prefixes, and use it to determine when to shift and when to reduce. This combination of a stack and a finite state machine is essentially a push-down automaton, which is exactly the class of automaton needed to recognize a context-free language.

like image 190
Jeffrey Hantin Avatar answered Oct 14 '22 13:10

Jeffrey Hantin


Consider the grammar given in book(i'm restating it here)

E -> E+T | T T -> T*F | F  F -> (E) | id 

which is augmented by adding E' -> E in it

Now have a look at this derivation,

E' -> E    -> E+T    -> E+T*F 

Claim E+T* is a viable prefix

Argument: This derivation is a right sentential form & E+T* is a prefix of it. Handle currently is T*F (as reducing T*F to T we can reach the start symbol & hence a successful parse)

And hence, E+T* is a viable prefix as it is a prefix of right sentential form & doesn't extend past the rightmost handle for this sentential form. :)

Other way to define it is:

The prefixes of right sentential forms that can appear on the stack of a shiftreduce parser are called viable prefixes. 
like image 28
ak2 Avatar answered Oct 14 '22 13:10

ak2