Are there any parser combinators library that gives performance comparable to Happy/Alex ?
I know about Attoparsec, but sometimes it operates not well, like in an example below:
isToken c = isLetter c || isDigit c
symbol :: Parser Expr
symbol = do
c <- skipSpace >> satisfy isLetter
rest <- takeWhile isToken
let token = C.cons c rest -- oops... O(N)
error $ show token
The workaround is quite ugly:
do { skipSpace; bs <- scan go True; when (null bs) (fail "Not a symbol"); return bs}
where go True c = if isLetter c then Just False else Nothing
go False c = if isToken c then Just Fasle else Nothing
Also, Attoparsec lacks of error handling.
Happy/Alex are quite unfriendly (for me) comparing to ocamlyacc/ocamllex, BNFC is inflexible and in my case requires an additional AST traversing after parsing. Also, error handling is not very good.
There are three of rest options: Parsec2, Parsec3 and uu-parselib. I've found a number of controversial benchmarks assuming that Parsec2 is faster than Parsec3, or UU is faster, or it's slower.
But what to choose? Does anyone have an experience using uu-parselib? I need the parser for some kind of DSL, need the parses fast enough to not to change it in future.
Haskell is an excellent language for all your parsing needs. The functional nature of the language makes it easy to compose different building blocks together without worrying about nasty side effects and unforeseen consequences.
Parser combinators are generally slower than a hand-written or code-generated parser. That's somewhat innate due to the overhead of “threading” (for lack of a better word) your control flow through many function calls.
A Parser combinator, as wikipedia describes it, is a higher-order function that accepts several parsers as input and returns a new parser as its output. They can be very powerful when you want to build modular parsers and leave them open for further extension.
There is another alternative: polyparse.
After last year's GSoC, parsec3 was optimized and no longer noticeably slower than parsec2
Couple of years ago I've done tests on several grammars (mid-size) and found that performance of happy/alex, parsec2/alex, parsec2 and polyparse is very close. Attoparsec was faster on byte streams, but I needed multi-byte.
My advise: take a look at the way alternatives handle internal and user-defined state and report errors and choose by these criteria.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With