Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deterministic Context-Free Grammar versus Context-Free Grammar?

I'm reading my notes for my comparative languages class and I'm a bit confused...

What is the difference between a context-free grammar and a deterministic context-free grammar? I'm specifically reading about how parsers are O(n^3) for CFGs and compilers are O(n) for DCFGs, and don't really understand how the difference in time complexities could be that great (not to mention I'm still confused about what the characteristics that make a CFG a DCFG).

Thank you so much in advance!

like image 611
Laurence Avatar asked Oct 21 '22 12:10

Laurence


1 Answers

Conceptually they are quite simple to understand. The context free grammars are those which can be expressed in BNF. The DCFGs are the subset for which a workable parser can be written.

In writing compilers we are only interested in DCFGs. The reason is that 'deterministic' means roughly that the next rule to be applied at any point in the parse is determined by the input so far and a finite amount of lookahead. Knuth invented the LR() compiler back in the 1960s and proved it could handle any DCFG. Since then some refinements, especially LALR(1) and LL(1), have defined grammars that can be parsed in limited memory, and techniques by which we can write them.

We also have techniques to derive parsers automatically from the BNF, if we know it's one of these grammars. Yacc, Bison and ANTLR are familiar examples.

I've never seen a parser for a NDCFG, but at any point in the parse it would potentially need to consider the whole of the input string and every possible parse that could be applied. It's not hard to see why that would get rather large and slow.


I should point out that many real languages are imperfect, in that they are not entirely context free, not unambiguous or otherwise depart from the ideal DCFG. C/C++ is a good example, but there are many others. These languages are usually handled by special purpose rules such as semantic or syntactic predicates, special case backtracking or other 'tricks' with no effect on performance.


The comments point out that certain kinds of NDCFG are common and many tools provide a way to handle them. One common problem is ambiguity. It is relatively easy to parse an ambiguous grammar by introducing a simple local semantic rule, but of course this can only ever generate one of the possible parse trees. A generalised parser for NDCFG would potentially produce all parse trees, and could perhaps allow those trees to be filtered on some arbitrary condition. I don't know any of those.

Left recursion is not a feature of NDCFG. It presents a particular challenge to the design of LL() parsers but no problems for LR() parsers.

like image 120
david.pfx Avatar answered Oct 23 '22 04:10

david.pfx