I recently learned about Pratt parsers from this excellent article and found Pratt parsers way simpler and much more elegant than recursive descent parsers. I tried to find a bit more info about how they compare to other parser types, but found that the Wikipedia article is barely a stub and the amount of bigger projects that use it that I could find equals two.
Why are Pratt parsers so little used? Do they have any severe limitations or disadvantages that I don't know about? How exactly do they compare to other parser types? When should and when shouldn't I use them?
Pratt parsing is a type of parsing introduced by Vaughan Pratt in a 1973 paper (see References). It is also known as “top-down operator-precedence parsing” because it is a top-down, recursive algorithm which can easily handle operator precedences.
Recursive descent is a top-down parsing technique that constructs the parse tree from the top and the input is read from left to right. It uses procedures for every terminal and non-terminal entity. This parsing technique recursively parses the input to make a parse tree, which may or may not require back-tracking.
There's very little difference between a Pratt parser and the so-called "shunting yard" parser (which has a much longer Wikipedia article attached); the main difference is that Pratt uses recursion and hence the stack, while Djikstra (the "shunting yard") keeps an explicit stack. Other than that, they do exactly the same sequence of operations. I suppose that Djikstra's expression of the algorithm is more common because of recursophobia.
There are some advantages to using the program stack; one of them is that it's easier to maintain type safety, since the entire stack doesn't have to be one type. On the other hand, many expression parsers only have one type.
The Dragon Book includes an algorthm which will generate an operator precedence table from a grammar. As it points out, the fact that the algorithm succeeds does not necessarily imply that the operator precedence parser will parse exactly the same language. There's more interesting information there which I've certainly forgotten; if you're interested in the algorithm, that's one of the places you could look. It includes the interesting insight that the < and > precendence relationship operators can be generated by looking at a derivation, if you surround the result of a production with < and > in the obvious way.
On the whole, my experience is that most of the time when you find a blog post saying "My God, I've just stumbled upon X and it's great and why don't more people know about it????", the answer is "Please don't assume that your ignorance is universal." But maybe I'm just in a cynical mood today.
By the way, the Lua
parser is a hand-built recursive descent parser which uses Pratt style parsing to parse expressions; that's a pretty common technique, I think, and you'll probably find it in other places, although you might have to winnow through the code to see the pattern.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With