Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Chomsky Hierarchy and LL(*) parsers

I want to parse a programming language. I read a lot about formal languages and the Chomsky hierarchy and ANTLR. But I could not find information on how to relate the languages ANTLR v3 as an LL(*) recursive descent parser accepts to the chomsky hierarchy.

How do the Chomsky types mix with LL(*)? Any information (online, books, papers) are greatly appreciated.

Edit: How do syntactic / semantic predicates and backtracking of ANTLR map into this?

like image 639
meelow Avatar asked Dec 06 '22 07:12

meelow


2 Answers

The Chomsky Hierarchy is basically:

  1. Regular languages
  2. Context-Free Grammars
  3. Context-Sensitive Grammars
  4. Recursively Enumerable (Turing-Complete) Grammars

LL Grammars (and parsers) are a subset of context-free grammars. They are used because regular languages are too weak for programming purposes and because a general context-free parser is O(n^3) which is too slow for parsing a program. Indeed, augmenting a parser with helper functions does make it stronger. The Wikipedia entry on LL parsers explains some of this.The Dragon Book is considered a leading textbook on compilers, and may explain further.

like image 152
Yuval F Avatar answered Jan 18 '23 15:01

Yuval F


LL(*) is a subset of context-free languages. However, a different question is what antlr can parse, given predicates and backtracking, which extend its abilities.

Note that if we talk about LL(*), that means ANTLR v3, not 2.

like image 29
jpalecek Avatar answered Jan 18 '23 14:01

jpalecek