Has anyone seen a good comparison of parser generators' performance?
I'm particularly interested in: 1) recursive ascent parser generators for LALR(1) grammars; 2) parser generators which produce C/C++ based parsers.
Java Compiler Compiler (JavaCC) is the most popular parser generator for use with Java applications. A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar.
A parser generator is a good tool that you should make part of your toolbox. A parser generator takes a grammar as input and automatically generates source code that can parse streams of characters using the grammar.
Parser combinators are generally slower than a hand-written or code-generated parser. That's somewhat innate due to the overhead of “threading” (for lack of a better word) your control flow through many function calls.
As an example of a grammar with Python actions, the piece of the parser generator that parses grammar files is bootstrapped from a meta-grammar file with Python actions that generate the grammar tree as a result of the parsing.
Are you interested in how fast the parser generators run? Depends of the type of technology of the parsing engine it supports, and the care of the guy who implemented the parser generator. See this answer for some numbers about LALR/GLR parser generators for real languages: https://stackoverflow.com/a/14151966/120163 IMHO, this isn't very important; parser generators are mostly a lot faster than the guy using them.
If the question is, how fast are the generated parsers? you get different answers. LALR parsers can be implemented with a few machine instructions per GOTO transition (using directly-indexed GOTO tables), and a few per reduction. That's pretty hard to beat.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With