How do hand-written recursive descent parsers (which are inevitably LL(k)) compare to generated LALR parsers in terms of performance?
I know that LALR parsers are able to handle far more grammars than LL(k); however it's my intention to write my parser by hand, and recursive descent seems the most appropriate choice. Is it possible to write any other kind by hand (reasonably readably) out of interest?
N.B. I am using a functional language with tail-call optimisation (F#), so [well-tailored] recursion won't be as much of an issue as in other languages.
One of major drawback or recursive-descent parsing is that it can be implemented only for those languages which support recursive procedure calls and it suffers from the problem of left-recursion.
A parser generator is a good tool that you should make part of your toolbox. A parser generator takes a grammar as input and automatically generates source code that can parse streams of characters using the grammar.
Many people use parser generators to automatically write a parser and lexer for them. There are many tools made to do this: lex, yacc, ragel. There's even a Go implementation of yacc built into the go toolchain.
I think a lot depends on the language you are trying to parse. Another part of performance which sometimes gets forgotten is the lexical analysis (scanning) part - it is significant for performance as it deals with characters rather than symbols. Recursive descent is a good first iteration at writing a parser, and it makes following the parsed language's logic quite natural. I think that if the parsed language fits (no left recursion) you should start with recursive descent. Choosing LALR for performance at this stage seems to be premature optimization. You can write a chart parser by hand, but I doubt this is what you mean. Writing an LALR parser by hand is possible but tedious.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With