Many of the Parsec combinators I use are of a type such as:
foo :: CharParser st Foo
CharParser
is defined here as:
type CharParser st = GenParser Char st
CharParser
is thus a type synonym involving GenParser
, itself defined here as:
type GenParser tok st = Parsec [tok] st
GenParser
is then another type synonym, assigned using Parsec
, defined here as:
type Parsec s u = ParsecT s u Identity
So Parsec
is a partial application of ParsecT
, itself listed here with type:
data ParsecT s u m a
along with the words:
"ParsecT s u m a is a parser with stream type s, user state type u, underlying monad m and return type a."
What is the underlying monad? In particular, what is it when I use the CharParser
parsers? I can't see where it's inserted in the stack. Is there a relationship to the use of the list monad in Monadic Parsing in Haskell to return multiple successful parses from an ambiguous parser?
In your case the underlying monad is Identity
. However ParsecT is different from most monad transformers in that it is an instance of the Monad
class even if the type parameter m
is not. If you look at the source code you will note the lack of "(Monad m) =>
" in the instance declaration.
So then you ask yourself, "If I were to have a non-trivial monad stack, where would it be used?"
There are a three of answers to that question:
It is used to uncons
the next token out of the stream:
class (Monad m) => Stream s m t | s -> t where
uncons :: s -> m (Maybe (t,s))
Notice that uncons
takes an s
(the stream of tokens t
) and returns its result wrapped in your monad. This allows one to do interesting thing while or even during the process of getting the next token.
It is used in the resulting output of each parser. This means you can create parsers that don't touch the input but take action in the underlying monad and use the combinators to bind them to regular parsers. In other words, lift (x :: m a) :: ParsecT s u m a
.
Finally, the end result of RunParsecT and friends (until you build up to the point where m
is replaced by Identity
) return their results wrapped in this monad.
There is not a relationship between this monad and the one from Monadic Parsing in Haskell. In this case Hutton and Meijer are referring to the monad instance for ParsecT itself. The fact that in Parsec-3.0.0 and beyond ParsecT has become a monad transformer with an underlying monad is not relevant to the paper.
What I think you are looking for however is where the list of possible results went. In Hutton and Meijer the parser returns a list of all possible results while Parsec stubbornly returns only one. I think you are looking at the m
in the result and thinking to yourself that the list of results must be hiding in there somewhere. It is not.
Parsec, for reasons of efficiency, made a choice to prefer the first matching result in Hutton and Meijer's list of results. This let's it toss away both the unused results in the tail of Hutton and Meijer's list and also the front of the stream of tokens because we never backtrack. In parsec, given the combined parser a <|> b
, if a
consumes any input b
will never be evaluated. The way around this is try
which will reset the state back to where it was if a
fails then evaluate b
.
You asked in the comments if this was done using Maybe
or Either
. The answer is "almost but not quite." If you look at the low lever run*
functions you see that they return an Algebraic type which tell weather input was consumed then a second which give either the result or an error message. These types work kind of like Either
, but even they are not used directly. Rather then stretch this out further, I'll refer you to the post by Antoine Latter that explains how this works and why it is done this way.
GenParser is defined in terms of Parsec, not ParsecT. Parsec in turn is defined as
type Parsec s u = ParsecT s u Identity
So the answer is that when using CharParser the underlying monad is the Identity monad.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With