Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parser Combinators library of choice (haskell)

Are there any parser combinators library that gives performance comparable to Happy/Alex ?

I know about Attoparsec, but sometimes it operates not well, like in an example below:

isToken c = isLetter c || isDigit c

symbol :: Parser Expr
symbol = do 
    c    <- skipSpace >> satisfy isLetter 
    rest <- takeWhile isToken
    let token = C.cons c rest  -- oops... O(N)
    error $ show token

The workaround is quite ugly:

do { skipSpace; bs <- scan go True; when (null bs) (fail "Not a symbol"); return bs}
    where go True  c = if isLetter c then Just  False else Nothing
          go False c = if isToken c then Just Fasle else Nothing

Also, Attoparsec lacks of error handling.

Happy/Alex are quite unfriendly (for me) comparing to ocamlyacc/ocamllex, BNFC is inflexible and in my case requires an additional AST traversing after parsing. Also, error handling is not very good.

There are three of rest options: Parsec2, Parsec3 and uu-parselib. I've found a number of controversial benchmarks assuming that Parsec2 is faster than Parsec3, or UU is faster, or it's slower.

But what to choose? Does anyone have an experience using uu-parselib? I need the parser for some kind of DSL, need the parses fast enough to not to change it in future.

like image 701
voidlizard Avatar asked Jul 19 '11 04:07

voidlizard


People also ask

Is Haskell good for parsing?

Haskell is an excellent language for all your parsing needs. The functional nature of the language makes it easy to compose different building blocks together without worrying about nasty side effects and unforeseen consequences.

Are parser combinators slow?

Parser combinators are generally slower than a hand-written or code-generated parser. That's somewhat innate due to the overhead of “threading” (for lack of a better word) your control flow through many function calls.

What is a monadic parser?

A Parser combinator, as wikipedia describes it, is a higher-order function that accepts several parsers as input and returns a new parser as its output. They can be very powerful when you want to build modular parsers and leave them open for further extension.


1 Answers

  1. There is another alternative: polyparse.

  2. After last year's GSoC, parsec3 was optimized and no longer noticeably slower than parsec2

  3. Couple of years ago I've done tests on several grammars (mid-size) and found that performance of happy/alex, parsec2/alex, parsec2 and polyparse is very close. Attoparsec was faster on byte streams, but I needed multi-byte.

My advise: take a look at the way alternatives handle internal and user-defined state and report errors and choose by these criteria.

like image 158
ADEpt Avatar answered Sep 23 '22 21:09

ADEpt