Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

On complexity of recursive descent parsers

It's known that recursive descent parsers may require exponential time in some cases; could anyone point me to the samples, where this happens? Especially interested in cases for PEG (i.e. with prioritized choices).

like image 742
fithu Avatar asked Jun 01 '13 05:06

fithu


People also ask

What are the limitations of recursive descent parser?

The main limitation of recursive descent parsing (and top-down parsing algorithms in general) is that they only work on grammars with certain properties. For example, if a grammar contains any left recursion, recursive descent parsing doesn't work.

What is meant by recursive descent parser?

In computer science, a recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure implements one of the nonterminals of the grammar.

Which one is example of recursive descent parsing?

Recursive descent parsing is an example of top-down parsing.

What is a recursive descent parser and how does it work?

Recursive descent is a top-down parsing technique that constructs the parse tree from the top and the input is read from left to right. It uses procedures for every terminal and non-terminal entity. This parsing technique recursively parses the input to make a parse tree, which may or may not require back-tracking.


Video Answer


1 Answers

Any top down parser, including recursive descent, can theoretically become exponential if the combination of input and grammar are such that large numbers of backtracks are necessary. This happens if the grammar is such that determinative choices are placed at the end of long sequences. For example, if you have a symbol like & meaning "all previous minuses are actually plusses" and then have data like "((((a - b) - c) - d) - e &)" then the parser has to go backwards and change all the plusses to minuses. If you start making nested expressions along these lines you can create an effectively non-terminating set of input.

You have to realize you are stepping into a political issue here, because the reality is that most normal grammars and data sets are not like this, however, there are a LOT of people who systematically badmouth recursive descent because it is not easy to make RD automatically. All early parsers are LALR because they are MUCH easier to make automatically than RD. So what happened was that everyone just wrote LALR and badmouthed RD, because in the old days the only way to make an RD was to code it by hand. For example, if you read the dragon book you will find that Aho & Ullman write just one paragraph on RD, and it is basically just a ideological takedown saying "RD is bad, don't do it".

Of course, if you start hand coding RDs (as I have) you will find that they are much better than LALRs for a variety of reasons. In the old days you could always tell a compiler that had a hand-coded RD, because it had meaningful error messages with locational accuracy, whereas compilers with LALRs would show the error occurring like 50 lines away from where it really was. Things have changed a lot since the old days, but you should realize that when you start reading the FUD on RD, that it is coming from a long, long tradition of verbally trashing RD in "certain circles".

like image 55
Tyler Durden Avatar answered Oct 01 '22 12:10

Tyler Durden