Performance of parsers: PEG vs LALR(1) or LL(k)

I've seen some claims that optimized PEG parsers in general cannot be faster than optimized LALR(1) or LL(k) parsers. (Of course, performance of parsing would depend on a particular grammar.)

I'd like to know whether PEG parsers have any specific limitations, either in general or for some subsets of PEG grammars, that would make them slower than LALR(1) or LL(k) parsers.

In particular, I'm interested in parser generators, but assume that their output can be tweaked for performance in any particular case. I also assume that parsers are optimized and it is possible to tweak a particular grammar a bit if that's needed to improve performance.

asked Jul 07 '12 08:07

Roman Boiko

2 Answers

I found a good answer comparing packrat and LALR parsing. Some quotes from it:

L(AL)R parsers are linear time parsers, too. So in theory, neither packrat nor L(AL)R parsers are "faster".

What matters, in practice, of course, is implementation. L(AL)R state transitions can be executed in very few machine instructions ("look token code up in vector, get next state and action") so they can be extremely fast in practice.

An observation: most language front-ends don't spend most of their time "parsing"; rather, they spend a lot of time in lexical analysis. Optimize that ..., and the parser speed won't matter much.
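To make the "look token code up in vector, get next state and action" step from the quote concrete, here is a minimal sketch of a table-driven LR parser in Python. The grammar (`S -> ( S ) | x`), the token codes, and the hand-built `ACTION`/`GOTO_S` tables are my own illustrative assumptions, not the output of any particular parser generator. A real generated parser would do the same loop in C with integer-encoded tables.

```python
# Token codes for the toy grammar:  rule 1: S -> ( S )    rule 2: S -> x
LP, RP, X, EOF = range(4)
TOKEN_CODES = {"(": LP, ")": RP, "x": X}

SHIFT, REDUCE, ACCEPT = "shift", "reduce", "accept"

# ACTION[state][token] -> (kind, argument) or None for a syntax error.
# Hand-built SLR(1) table for the toy grammar (an assumption for illustration).
ACTION = [
    #  (            )            x            $
    [(SHIFT, 2), None,        (SHIFT, 3), None],            # state 0
    [None,       None,        None,       (ACCEPT, None)],  # state 1
    [(SHIFT, 2), None,        (SHIFT, 3), None],            # state 2
    [None,       (REDUCE, 2), None,       (REDUCE, 2)],     # state 3
    [None,       (SHIFT, 5),  None,       None],            # state 4
    [None,       (REDUCE, 1), None,       (REDUCE, 1)],     # state 5
]

# GOTO for the single nonterminal S, indexed by state.
GOTO_S = [1, None, 4, None, None, None]

# Number of symbols on the right-hand side of each rule.
RHS_LEN = {1: 3, 2: 1}


def lr_parse(text):
    """Return True if text is derivable from S, using only table lookups."""
    if any(ch not in TOKEN_CODES for ch in text):
        return False
    tokens = [TOKEN_CODES[ch] for ch in text] + [EOF]
    stack = [0]  # stack of parser states
    pos = 0
    while True:
        entry = ACTION[stack[-1]][tokens[pos]]  # the per-token lookup
        if entry is None:
            return False
        kind, arg = entry
        if kind == SHIFT:
            stack.append(arg)
            pos += 1
        elif kind == REDUCE:
            del stack[-RHS_LEN[arg]:]            # pop the rule's right-hand side
            stack.append(GOTO_S[stack[-1]])      # then follow the goto for S
        else:
            return True


print(lr_parse("((x))"))  # True
print(lr_parse("(x"))     # False
```

Each token costs one `ACTION` lookup plus a bounded number of reduce steps, which is why this style of parser tends to have a very small constant factor.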

answered Sep 19 '22 05:09

Roman Boiko

PEG parsers can use unlimited lookahead, and a packrat implementation keeps parse time linear by memoizing the result of every rule at every input position, at the cost of memory proportional to the input. (Default) LL(k) and LR(k) parsers also run in linear time, but they use limited lookahead.
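As a rough illustration of how memoization pays for unlimited lookahead, here is a small packrat-style sketch in Python. The grammar `S <- A 'c' / A 'd'`, `A <- 'a' A / 'a'` is a made-up example: as written it is not LL(k) for any fixed k, because the choice between the two alternatives of `S` depends on a character that can be arbitrarily far away. The function name `packrat_parse` and the evaluation counter are assumptions added for the demonstration.

```python
def packrat_parse(text):
    """Parse text with  S <- A 'c' / A 'd'   and   A <- 'a' A / 'a'.

    Returns (matched, evaluations), where evaluations counts how many times
    the body of rule A actually ran, i.e. the number of memo-table misses.
    """
    memo = {}        # packrat table: position -> result of rule A there
    evaluations = 0

    def rule_a(pos):
        nonlocal evaluations
        if pos in memo:                 # result already known: no re-parsing
            return memo[pos]
        evaluations += 1
        result = None                   # None means "A fails at pos"
        if text[pos:pos + 1] == "a":
            rest = rule_a(pos + 1)      # first alternative: 'a' A
            result = rest if rest is not None else pos + 1  # else just 'a'
        memo[pos] = result
        return result

    def rule_s(pos):
        # Ordered choice: try A 'c' first, then backtrack and try A 'd'.
        for terminator in ("c", "d"):
            end = rule_a(pos)           # second time round this is a memo hit
            if end is not None and text[end:end + 1] == terminator:
                return end + 1
        return None

    end = rule_s(0)
    return end == len(text), evaluations


print(packrat_parse("aaad"))  # (True, 4)
```

For `"aaad"` the second alternative has to backtrack over all the `a`s, yet rule `A` is evaluated only once per position (4 evaluations rather than 8), because the retry is served from the memo table.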

More recently (2014-2015), ANTLR4 added extensions that handle arbitrary lookahead (as in PEG) while keeping parse time linear in practice (this is said to be more efficient than the packrat algorithm). However, it relies on new extensions and variations of the LL parsing algorithm (ALL(*), i.e. adaptive LL(*)), not on the classic LL(k) algorithm.

The packrat parser (and associated parsers for LL, LR) is not necessarily practical, but it provides theoretical bounds on parsing so that comparisons can be made.

But note that unlimited lookahead makes it possible to parse, in linear time (e.g. via packrat or ANTLR), grammars/languages that cannot be parsed via LL(k) or LR(k) at all, even in non-linear time. So it is important to understand what is being compared to what.
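A standard example of such a language is aⁿbⁿcⁿ (n ≥ 1), which is not context-free, so no LL(k) or LR(k) grammar exists for it, yet a PEG with an and-predicate recognizes it. Below is a minimal Python sketch of the well-known PEG `S <- &(A 'c') 'a'+ B !.`, `A <- 'a' A? 'b'`, `B <- 'b' B? 'c'`. The function names are my own, and memoization is left out for brevity, since in this particular grammar each rule is already tried at most once per position.

```python
def matches_anbncn(text):
    """Recognize a^n b^n c^n (n >= 1) with the PEG
         S <- &(A 'c') 'a'+ B !.
         A <- 'a' A? 'b'
         B <- 'b' B? 'c'
    Each rule function returns the end position on success, or None.
    """

    def nested(pos, opener, closer):
        # Shared shape of A and B:  opener Self? closer
        if text[pos:pos + 1] != opener:
            return None
        pos += 1
        inner = nested(pos, opener, closer)   # optional recursive part
        if inner is not None:
            pos = inner
        if text[pos:pos + 1] != closer:
            return None
        return pos + 1

    # &(A 'c'): unlimited lookahead that consumes nothing.
    a_end = nested(0, "a", "b")
    if a_end is None or text[a_end:a_end + 1] != "c":
        return False

    # 'a'+ : the predicate succeeded, so at least one 'a' is present.
    pos = 0
    while text[pos:pos + 1] == "a":
        pos += 1

    # B, followed by !. (end of input).
    b_end = nested(pos, "b", "c")
    return b_end == len(text)


print(matches_anbncn("aabbcc"))  # True
print(matches_anbncn("aabbc"))   # False
```

The predicate checks that the `a`s and `b`s balance, then the input is re-read from the start to check that the `b`s and `c`s balance, which is exactly the kind of arbitrary-distance lookahead a fixed-k parser cannot express.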

answered Sep 20 '22 05:09

Nikos M.

Tags: parsing, ll, lalr, parser-generator, peg
