Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Underlying Parsec Monad

Many of the Parsec combinators I use are of a type such as:

foo :: CharParser st Foo

CharParser is defined here as:

type CharParser st = GenParser Char st

CharParser is thus a type synonym involving GenParser, itself defined here as:

type GenParser tok st = Parsec [tok] st

GenParser is then another type synonym, assigned using Parsec, defined here as:

type Parsec s u = ParsecT s u Identity

So Parsec is a partial application of ParsecT, itself listed here with type:

data ParsecT s u m a

along with the words:

"ParsecT s u m a is a parser with stream type s, user state type u, underlying monad m and return type a."

What is the underlying monad? In particular, what is it when I use the CharParser parsers? I can't see where it's inserted in the stack. Is there a relationship to the use of the list monad in Monadic Parsing in Haskell to return multiple successful parses from an ambiguous parser?

like image 450
user2023370 Avatar asked Nov 08 '11 16:11

user2023370


2 Answers

In your case the underlying monad is Identity. However ParsecT is different from most monad transformers in that it is an instance of the Monad class even if the type parameter m is not. If you look at the source code you will note the lack of "(Monad m) =>" in the instance declaration.

So then you ask yourself, "If I were to have a non-trivial monad stack, where would it be used?"

There are a three of answers to that question:

  1. It is used to uncons the next token out of the stream:

    class (Monad m) => Stream s m t | s -> t where
        uncons :: s -> m (Maybe (t,s))
    

    Notice that uncons takes an s (the stream of tokens t) and returns its result wrapped in your monad. This allows one to do interesting thing while or even during the process of getting the next token.

  2. It is used in the resulting output of each parser. This means you can create parsers that don't touch the input but take action in the underlying monad and use the combinators to bind them to regular parsers. In other words, lift (x :: m a) :: ParsecT s u m a.

  3. Finally, the end result of RunParsecT and friends (until you build up to the point where m is replaced by Identity) return their results wrapped in this monad.

There is not a relationship between this monad and the one from Monadic Parsing in Haskell. In this case Hutton and Meijer are referring to the monad instance for ParsecT itself. The fact that in Parsec-3.0.0 and beyond ParsecT has become a monad transformer with an underlying monad is not relevant to the paper.

What I think you are looking for however is where the list of possible results went. In Hutton and Meijer the parser returns a list of all possible results while Parsec stubbornly returns only one. I think you are looking at the m in the result and thinking to yourself that the list of results must be hiding in there somewhere. It is not.

Parsec, for reasons of efficiency, made a choice to prefer the first matching result in Hutton and Meijer's list of results. This let's it toss away both the unused results in the tail of Hutton and Meijer's list and also the front of the stream of tokens because we never backtrack. In parsec, given the combined parser a <|> b, if a consumes any input b will never be evaluated. The way around this is try which will reset the state back to where it was if a fails then evaluate b.

You asked in the comments if this was done using Maybe or Either. The answer is "almost but not quite." If you look at the low lever run* functions you see that they return an Algebraic type which tell weather input was consumed then a second which give either the result or an error message. These types work kind of like Either, but even they are not used directly. Rather then stretch this out further, I'll refer you to the post by Antoine Latter that explains how this works and why it is done this way.

like image 93
John F. Miller Avatar answered Oct 21 '22 09:10

John F. Miller


GenParser is defined in terms of Parsec, not ParsecT. Parsec in turn is defined as

type Parsec s u = ParsecT s u Identity

So the answer is that when using CharParser the underlying monad is the Identity monad.

like image 35
phyrex1an Avatar answered Oct 21 '22 11:10

phyrex1an