Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between left/right recursive, left/right-most derivation, precedence, associativity etc

I am currently learning language processors and a topic that comes up very often is the direction in which elements in a grammar are consumed. Left to right or right to left. I understand the concept but there seems to be so many ways of writing these rules and I am not sure if they are all the same. What I've seen so far is:

Right/Left recursion, Right/Left-most derivation, Right/Left reduction, precedence, associativity etc.

Do these all mean the same thing?

like image 569
user3001845 Avatar asked Aug 20 '15 16:08

user3001845


People also ask

What is left-recursive and right recursive?

With right recursion, no reduction takes place until the entire list of elements has been read; with left recursion, a reduction takes place as each new list element is encountered. Left recursion can therefore save a lot of stack space.

How do you know if a grammar is left or right recursive?

A production for a non-terminal is recursive if it can derive a sequence containing that non-terminal; it is left-recursive if the non-terminal can appear at the start (left edge) of the derived sequence, and right-recursive if it can appear at the end (right edge).

What is leftmost derivation in compiler design?

If the sentential form of an input is scanned and replaced from left to right, it is called left-most derivation. The sentential form derived by the left-most derivation is called the left-sentential form.

What is left recursion with example?

A Grammar G (V, T, P, S) is left recursive if it has a production in the form. A → A α |β. Left Recursion can be eliminated by introducing new non-terminal A such that. This type of recursion is also called Immediate Left Recursion.


Video Answer


2 Answers

No, they all have different meanings.

Right- and left-recursion refer to recursion within production rules. A production for a non-terminal is recursive if it can derive a sequence containing that non-terminal; it is left-recursive if the non-terminal can appear at the start (left edge) of the derived sequence, and right-recursive if it can appear at the end (right edge). A production can be recursive without being either left- or right-recursive, and it can even be both left- and right-recursive.

For example:

term: term '*' factor            { /* left-recursive */ }
assignment: lval '=' assignment  { /* right-recursive */ }

The above examples are both direct recursion; the non-terminal directly derives a sequence containing the non-terminal. Recursion can also be indirect; it is still recursion.

All common parsing algorithm process left-to-right, which is the first L in LL and LR. Top-down (LL) parsing finds a leftmost derivation (the second L), while bottom-up (LR) parsing finds a rightmost derivation (the R).

Effectively, both types of parser start with a single non-terminal (the start symbol) and "guess" a derivation based on some non-terminal in the current sequence until the input text is derived. In a leftmost derivation, it is always the leftmost non-terminal which is expanded. In a rightmost derivation, it is always the rightmost non-terminal.

So a top-down parser always guesses which production to use for the first non-terminal, after which it needs to again work on whatever is now the first non-terminal. ("Guess" here is informal. It can look at the input to be matched -- or at least the next k tokens of the input -- in order to determine which production to use.) This is called top-down processing because it builds the parse tree from the top down.

It's easier (at least for me) to visualize the action of a bottom-up parser in reverse; it builds the parse tree bottom up by repeatedly reading just enough of the input to find some production, which will be the last derivation in the derivation chain. So it does produce a rightmost derivation, but it outputs it back-to-front.

In an LR grammar for an operator language (roughly speaking, a grammar for languages which look like arithmetic expressions), left- and right- associativity are modelled using left- and right-recursive grammar rules, respectively. "Associativity" is an informal description of the grammar, as is "precedence".

Precedence is modelled by using a series of grammar rules, each of which refers to the next rule (and which usually end up with a recursive production for handling parentheses -- '(' expr ')' -- which is neither left- nor right-recursive).

There is an older style of bottom-up parsing, called "operator precedence parsing", in which precedence is explicitly part of the language description. One common operator-precedence algorithm is the so-called Shunting Yard algorithm. But if you have an LALR(1) parser generator, like bison, you might as well use that instead, because it is both more general and more precise.

like image 118
rici Avatar answered Oct 29 '22 03:10

rici


(I am NOT an expert on parser and compiler theory. I happen to be learning something related. And I'd like to share something I have found so far.)

I strongly suggest taking a look at this awesome article.

It explains and illustrats the LL and LR algorithm. You can clearly see why LL is called top-down and LR is called bottom-up.

Some quotation:

The primary difference between how LL and LR parsers operate is that an LL parser outputs a pre-order traversal of the parse tree and an LR parser outputs a post-order traversal.

...

We are converging on a very simple model for how LL and LR parsers operate. Both read a stream of input tokens and output that same token stream, inserting rules in the appropriate places to achieve a pre-order (LL) or post-order (LR) traversal of the parse tree.

...

When you see designations like LL(1), LR(0), etc. the number in parentheses is the number of tokens of lookahead.

And as to the acronyms: (source)

The first L in LR and LL means: that the parser reads input text in one direction without backing up; that direction is typically Left to right within each line, and top to bottom across the lines of the full input file.

The remaining R and L means: right-most and left-most derivation, respectively.

These are 2 different parsing strategies. A parsing strategy determines the next non-terminal to rewrite. (source)

  • For left-most derivation, it is always the leftmost nonterminal.
  • For right-most derivation, it is always the rightmost nonterminal.
like image 37
smwikipedia Avatar answered Oct 29 '22 02:10

smwikipedia