Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LL(*) versus PEG parsers : what is the difference?

Tags:

parsing

antlr

I am wondering if ANTLR v3, which presents its internal parsing algorithm as "LL(*)" is fully representative of PEG (parsing expression grammar) parsers.

Is there a difference ?

like image 581
JCLL Avatar asked Jan 11 '12 09:01

JCLL


3 Answers

The ANTLR article was wrong about PEG.

LL(*) is a subset of DCFG (Deterministic Context Free Grammar), which is a subset of CFG (Context Free Grammar).

PEG can express context sensitive grammars like A{n}B{n}C{n}, in which A and B and C all occur n times. Here's the definition:

s := &(x C) A+ y / ε
x := A x B / A B
y := B y C / B C

But there is no way to define such grammar in CFG (the proof involves pump lemma). So PEG is not subset CFG. Whether PEG is superset of CFG? I don't know.

Two key differences between LL(*) and PEG:

  1. LL(*) can only lookahead a DFA pattern, while PEG can lookahead a recursive pattern. For example, in PEG you can lookahead nested parens while LL(*) can't.

  2. The choice operator / in PEG is priority choice (or "possessive"), which means if you have rule A / AB, it never reaches the right hand side AB. In LL(*) for a rule A | AB, it is possible to match AB.

If you have a PEG grammar without a lookahead, or your lookahead pattern can be reduced into a DFA, then it can be translated into LL(*). If not, it is not possible.

like image 136
luikore Avatar answered Oct 01 '22 13:10

luikore


In ANTLR you can enable global backtracking on all production rules in your grammar, so that for k >= 1 you could parse pretty much the same as PEG's. Of course, because of all the potential backtracking, the run-time of the parser degrades. At the cost of (some) memory you could also enable memoization, making it behave like a Packrat-parser, able to parse the input in linear time.

So no, there's not much difference w.r.t ANTLR and PEG/Packrat (with the right options enabled!).

like image 25
Bart Kiers Avatar answered Oct 01 '22 13:10

Bart Kiers


ANTLR and PEG are not the same. It is pretty theoretical question, and I think it will be the best for you to refer to this paper wrote by Terrence Parr where he exactly points the differences between ANTLR and PEG and some advantages of ANTLR LL(*) parsing strategy. I don't want to give myself freedom to paraphrase what he wrote there, but it is better for you to read the whole paper.

like image 24
vldmrrdjcc Avatar answered Oct 01 '22 12:10

vldmrrdjcc