
Haskell Parser Combinators

I have been reading a lot about Haskell parser combinators and found many related questions, such as:

  • Parsec vs Yacc/Bison/Antlr: Why and when to use Parsec?
  • Which Haskell parsing technology is most pleasant to use, and why?
  • Parsec or happy (with alex) or uu-parsinglib
  • Choosing a Haskell parser
  • What is the advantage of using a parser generator like happy as opposed to using parser combinators?

But all of these questions compare parser combinators with parser generators.

I want to ask which parser combinator library best suits the following conditions:

  1. I want good control over errors (including error recovery) and over the messages shown to the user.
  2. I want to be able to feed the parser small parts of the text (not the whole file at once).
  3. I want to be able to redesign the grammar easily (I'm currently developing the grammar, so a "nice way of working" is important).
  4. The final parser should be fast (performance is important, but not as much as points 1-3).

I've found that the most popular parser combinator libraries are:

  • Parsec
  • uu-parsinglib
  • attoparsec
Asked by Wojciech Danilo on Aug 03 '13 at 01:08



1 Answer

I would say definitely go with Parsec. Here's why:

Attoparsec is designed to be fast, but it lacks the strong support for error messages that you get in Parsec, so Parsec wins on your first point.
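As a rough illustration of the error-message point, here is a minimal sketch of how Parsec's `<?>` operator attaches a human-readable label to a parser, so that a failure reports what was expected in the grammar's own terms. The grammar fragment (`identifier`, `number`, `assignment`) and the helper `parseAssignment` are hypothetical names invented for this example; they are not taken from the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical grammar fragment: a name, '=', and an integer value.
identifier :: Parser String
identifier = many1 letter <?> "identifier"

number :: Parser Int
number = read <$> many1 digit <?> "number"

assignment :: Parser (String, Int)
assignment = do
  name  <- identifier
  spaces
  _     <- char '=' <?> "'='"
  spaces
  value <- number
  eof
  return (name, value)

-- Run the parser and render any ParseError as text for the user.
-- The rendered error includes the source name, line, and column.
parseAssignment :: String -> Either String (String, Int)
parseAssignment input =
  case parse assignment "<input>" input of
    Left err -> Left (show err)
    Right ok -> Right ok
```

On malformed input such as `"x = "`, `parseAssignment` returns a `Left` whose text gives the position of the failure and lists what the parser expected there. Note that this sketch only covers error reporting; error recovery would need additional work on top of it.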

In my experience with parser combinator libraries, it is really easy to test individual parts of a parser, either in GHCi or in tests, so the second point is satisfied by all of them. As for performance, Attoparsec and Parsec are both pretty fast.
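To make the "test individual parts" workflow concrete, the following sketch defines two small sub-parsers and a helper for running any of them in isolation. The names `integer`, `intList`, and `runP` are assumptions made up for this example, and the grammar is purely illustrative.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical sub-parsers of a larger grammar.
integer :: Parser Int
integer = read <$> many1 digit

intList :: Parser [Int]
intList = between (char '[') (char ']') (integer `sepBy` char ',')

-- Run any sub-parser on a string, requiring that all input is consumed.
runP :: Parser a -> String -> Either ParseError a
runP p = parse (p <* eof) "<test>"

-- In GHCi, Parsec's parseTest prints the result of a single parser:
--   ghci> parseTest integer "42"
--   42
--   ghci> parseTest intList "[1,2,3]"
--   [1,2,3]
```

Because each piece is an ordinary value of type `Parser a`, the same `runP` helper can be reused in a test suite while the grammar is still changing.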

Finally, Parsec has been around the longest and has many useful and advanced features. This means that general maintainability is going to be easier: more examples are written in Parsec and more people are familiar with it. uu-parsinglib is definitely worth the time to explore, but I would suggest that getting familiar with Parsec first is the better course for these reasons. (Alex is also the most recommended lexer to use, with Parsec or otherwise, but I have not used it myself.)

Answered by Vic Smith on Oct 17 '22 at 17:10