Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use Control.Monad.State with Parsec?

Tags:

haskell

parsec

I'm surprised that I could not find any info on this. I must be the only person having any trouble with it.

So, let's say I have a dash counter. I want it to count the number of dashes in the string, and return the string. Pretend I gave an example that won't work using parsec's state handling. So this should work:

dashCounter = do
  str <- many1 dash
  count <- get
  return (count,str)


dash = do
  char '-'
  modify (+1)

And indeed, this compiles. Okay, so I try to use it:

:t parse dashCounter "" "----"
parse dashCounter "" "----"
  :: (Control.Monad.State.Class.MonadState
        t Data.Functor.Identity.Identity,
      Num t) =>
     Either ParseError (t, [Char])

Okay, that makes sense. It should return the state and the string. Cool.

>parse dashCounter "" "----"

<interactive>:1:7:
    No instance for (Control.Monad.State.Class.MonadState
                       t0 Data.Functor.Identity.Identity)
      arising from a use of `dashCounter'
    Possible fix:
      add an instance declaration for
      (Control.Monad.State.Class.MonadState
         t0 Data.Functor.Identity.Identity)
    In the first argument of `parse', namely `dashCounter'
    In the expression: parse dashCounter "" "----"
    In an equation for `it': it = parse dashCounter "" "----"

Oops. But then how could it have ever hoped to work in the first place? There's no way to input the initial state.

There is also a function:

>runPT dashCounter (0::Int) "" "----"

But it gives a similar error.

<interactive>:1:7:
    No instance for (Control.Monad.State.Class.MonadState Int m0)
      arising from a use of `dashCounter'
    Possible fix:
      add an instance declaration for
      (Control.Monad.State.Class.MonadState Int m0)
    In the first argument of `runPT', namely `dashCounter'
    In the expression: runPT dashCounter (0 :: Int) "" "----"
    In an equation for `it':
        it = runPT dashCounter (0 :: Int) "" "----"

I feel like I should have to runState on it, or there should be a function that already does it internally, but I can't seem to figure out where to go from here.

Edit: I should have specified more clearly, I did not want to use parsec's state handling. The reason is I have a feeling I don't want its backtracking to affect what it collects with the problem I'm preparing to solve it with.

However, Mr. McCann has figured out how this should fit together and the final code would look like this:

dashCounter = do
  str <- many1 dash
  count <- get
  return (count,str)

dash = do
  c <- char '-'
  modify (+1)
  return c

test = runState (runPT dashCounter () "" "----------") 0

Thanks a lot.

like image 225
David McHealy Avatar asked Jul 29 '11 17:07

David McHealy


3 Answers

You've actually got multiple problems going on here, all of which are relatively non-obvious the first time around.

Starting with the simplest: dash is returning (), which doesn't seem to be what you want given that you're collecting the results. You probably wanted something like dash = char '-' <* modify (+1). (Note that I'm using an operator from Control.Applicative here, because it looks tidier)

Next, clearing up a point of confusion: When you get the reasonable-looking type signature in GHCi, note the context of (Control.Monad.State.Class.MonadState t Data.Functor.Identity.Identity, Num t). That's not saying what things are, it's telling you want they need to be. Nothing guarantees that the instances it's asking for exist and, in fact, they don't. Identity is not a state monad!

On the other hand, you're absolutely correct in thinking that parse doesn't make sense; you can't use it here. Consider its type: Stream s Identity t => Parsec s () a -> SourceName -> s -> Either ParseError a. As is customary with monad transformers, Parsec is an synonym for ParsecT applied to the identity monad. And while ParsecT does provide user state, you apparently don't want to use it, and ParsecT does not give an instance of MonadState anyhow. Here's the only relevant instance: MonadState s m => MonadState s (ParsecT s' u m). In other words, to treat a parser as a state monad you have to apply ParsecT to some other state monad.

This sort of brings us to the next problem: Ambiguity. You're using a lot of type class methods and no type signatures, so you're likely to run into situations where GHC can't know what type you actually want, so you have to tell it.

Now, as a quick solution, let's first define a type synonym to give a name to the monad transformer stack we want:

type StateParse a = ParsecT String () (StateT Int Identity) a

Give dashCounter the relevant type signature:

dashCounter :: StateParse (Int, String)
dashCounter = do str <- many1 dash
                 count <- get
                 return (count,str)

And add a special-purpose "run" function:

runStateParse p sn inp count = runIdentity $ runStateT (runPT p () sn inp) count

Now, in GHCi:

Main> runStateParse dashCounter "" "---" 0
(Right (3,"---"),3)

Also, note that it's pretty common to use a newtype around a transformer stack instead of just a type synonym. This can help with the ambiguity issues in some cases, and obviously avoids ending up with gigantic type signatures.

like image 126
C. A. McCann Avatar answered Nov 17 '22 14:11

C. A. McCann


If you want to use the user state component Parsec offers as a built-in feature, then you can use the getState and modifyState monadic functions.

I tried to stay true to your example program, though using the return of dash doesn't seem useful.

import Text.Parsec

dashCounter :: Parsec String Int (Int, [()])
dashCounter = do
  str <- many1 dash
  count <- getState
  return (count,str)

dash :: Parsec String Int ()
dash = do
  char '-'
  modifyState (+1)

test = runP dashCounter 0 "" "---"

Note that runP is indeed addressing your concern about runState.

like image 40
Anthony Avatar answered Nov 17 '22 14:11

Anthony


Whilst these answers sort out this specific problem, they ignore the more serious underlying issue with an approach like this. I would like to describe it here for anyone else looking at this answer.

There is a difference between the user state and using the StateT transformer. The internal user state is reset on backtracking but StateT is not. Consider the following code. We want to add one to our counter if there is a dash and two if there is a plus. They produce different results.

As can be seen both using the internal state and attaching a StateT transformer provide the correct result. The latter comes at the expense of having to explicitly lift operations and be much more careful with types.

import Text.Parsec hiding (State)
import Control.Monad.State
import Control.Monad.Identity

f :: ParsecT String Int Identity Int
f = do
  try dash <|> plus
  getState

dash = do
  modifyState (+1)
  char '-'
plus = do
  modifyState (+2)
  char '+'

f' :: ParsecT String () (State Int) ()
f' = void (try dash' <|> plus')

dash' = do
  modify (+1)
  char '-'

plus' = do
  modify (+2)
  char '+'

f'' :: StateT Int (Parsec String ()) ()
f'' = void (dash'' <|> plus'')

dash'' :: StateT Int (Parsec String ()) Char
dash'' = do
  modify (+1)
  lift $ char '-'

plus'' :: StateT Int (Parsec String ()) Char
plus'' = do
  modify (+2)
  lift $ char '+'

This is the result of running f, f' and f''.

*Main> runParser f 0 "" "+"
Right 2
*Main> flip runState 0 $ runPT f' () "" "+"
(Right (),3)
*Main> runParser (runStateT f'' 0) () "" "+"
Right ((),2)
like image 6
Matthew Pickering Avatar answered Nov 17 '22 12:11

Matthew Pickering