Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Practical consequences of formal grammar power?

Tags:

parsing

theory

Every undergraduate Intro to Compilers course reviews the commonly-implemented subsets of context-free grammars: LL(k), SLR(k), LALR(k), LR(k). We are also taught that for any given k, each of those grammars is a subset of the next.

What I've never seen is an explanation of what sorts of programming language syntactic features might require moving to a different language class. There's an obvious practical motivation for GLR parsers, namely, avoiding an unholy commingling of parser and symbol table when parsing C++. But what about the differences between the two "standard" classes, LL and LR?

Two questions:

  1. What (general) syntactic constructions can be parsed with LR(k) but not LL(k')?
  2. In what ways, if any, do those constructions manifest as desirable language constructs?

There's a plausible argument for reducing language power by making k as small as possible, because a language requiring many, many tokens of lookahead will be harder for humans to parse, as well as "harder" for machines to parse. Question (2) implicitly asks if the same reasoning ends up holding between classes, as well as within a class.


edit: Here's one example to illustrate the sorts of answers I'm looking for, but for regular languages instead of context-free:

When describing a regular language, one usually gets three operators: +, *, and ?. Now, you can remove + without reducing the power of the language; instead of writing x+, you write xx*, and the effect is the same. But if x is some big and hairy expression, the two xs are likely to diverge over time due to human forgetfulness, yielding a syntactically correct regular expression that doesn't match the original author's intent. Thus, even though adding + doesn't strictly add power, it does make the notation less error-prone.

Are there constructs with similar practical (human?) effects that must be "removed" when switching from LR to LL?

like image 752
Ben Karel Avatar asked Dec 16 '09 01:12

Ben Karel


People also ask

What is the effect of grammar on your writing?

Grammar does play a vital role in creative writing. Proper grammar is necessary for credibility, readability, communication, and clarity. Mastering grammar will allow you as a writer to make your work clearer and more readable; you will also have the freedom of making stylistic choices.

Why is it important to study the correct usage of grammar?

Using incorrect grammar can lead to sentences being meaningless and the message unclear, which in turn can lead to misinterpretation by a communication partner. Using correct grammar makes listening and reading easier for others to understand and can make the communication process more enjoyable.

How important is English grammar in real life?

"Grammar is important because it is the language that makes it possible for us to talk about language. Grammar names the types of words and word groups that make up sentences not only in English but in any language. As human beings, we can put sentences together even as children—we can all do grammar.

What do you mean by formal grammar?

A formal grammar is defined as a set of production rules for such strings in a formal language. Formal language theory, the discipline that studies formal grammars and languages, is a branch of applied mathematics.


1 Answers

Parsing (I claim) is a bit like sorting: a problem that was the focus of a lot of thought in the early days of CS, leading to a set of well-understood solutions with some nice theoretical results.

My claim is that the picture that we get (or give, for those of us who teach) in a compilers class is, to some degree, a beautiful answer to the wrong question.

To answer your question more directly, an LL(1) grammar can't parse all kinds of things that you might want to parse; the "natural" formulation of an 'if' with an optional 'else', for instance.

But wait! Can't I reformulate my grammar as an LL(1) grammar and then patch up the source tree by walking over it afterward? Sure you can! To some degree, this is what makes the question of what kind of grammar your parser uses largely moot.

Also, back when I was an undergraduate (1990-94), whitespace-sensitive grammars were clearly the work of the Devil; now, Python and Haskell's designs are bringing whitespace-sensitivity back into the light. Also, Packrat parsing says "to heck with your theoretical purity: I'm just going to define a parser as a set of rules, and I don't care what class my grammar belongs to." (paraphrased)

In summary, I would agree with what I believe to be your implied suggestion: in 2009, a clear understanding of the difference between the classes LL(k) and LR(k) is less important in itself than the ability to formulate and debug a grammar that makes your parser generator happy.

like image 126
John Clements Avatar answered Sep 28 '22 19:09

John Clements