Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsec: difference between "try" and "lookAhead"?

Tags:

haskell

parsec

What is the difference between "try" and "lookAhead" functions in parsec?

like image 495
r.sendecky Avatar asked Nov 16 '13 15:11

r.sendecky


1 Answers

The combinators try and lookAhead are similar in that they both let Parsec "rewind", but they apply in different circumstances. In particular, try rewinds failure while lookAhead rewinds success.

By the documentation, try "pretends that it hasn't consumed any input when an error occurs" while lookAhead p "parses p without consuming any input", but "if p fails and consumes some input, so does lookAhead".

So if you think of a parser running as walking along some streaming state and either failing or succeeding, which we might write in Haskell terms as

type Parser a = [Tokens] -> (Either Error a, [Tokens])

then try ensures that if (try p) input ---> (Left err, output) then input == output and lookAhead has it such that (lookAhead p) input ---> (Right a, output) then input == output, but if (lookAhead p) input ---> (Left err, output) then they may be allowed to differ.


We can see this in action by looking at the code for Parsec directly, which is somewhat more complex than my notion of Parser above. First we examine ParsecT

newtype ParsecT s u m a
    = ParsecT {unParser :: forall b .
                 State s u
              -> (a -> State s u -> ParseError -> m b) -- consumed ok
              -> (ParseError -> m b)                   -- consumed err
              -> (a -> State s u -> ParseError -> m b) -- empty ok
              -> (ParseError -> m b)                   -- empty err
              -> m b
             }

ParsecT is a continuation-based datatype. If you look at how one of them is constructed

ParsecT $ \s cok cerr eok eerr -> ...

You'll see how we have access to the State s u, s, and four functions which determine how we move forward. For instance, the fail clause of ParsecT's Monad instance uses the eerr option, constructing a ParseError from the current input position and the passed error message.

parserFail :: String -> ParsecT s u m a
parserFail msg
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ newErrorMessage (Message msg) (statePos s)

While the most primitive successful token parse (tokenPrim) uses a complex sequence of events eventually culminating in calling cok with an updated State s u.

With this intuition, the source for try is particularly simple.

try :: ParsecT s u m a -> ParsecT s u m a
try p =
    ParsecT $ \s cok _ eok eerr ->
    unParser p s cok eerr eok eerr

It simply builds a new ParsecT based on the one passed to try, but with the "empty err" continuation in place of the consumed error. Whatever parsing combinator next sees try p will be unable to access its actual "consumed err" continuation and thus try is protected from changing its state on errors.

But lookAhead is more sophisticated

lookAhead :: (Stream s m t) => ParsecT s u m a -> ParsecT s u m a
lookAhead p         = do{ state <- getParserState
                        ; x <- p'
                        ; setParserState state
                        ; return x
                        }
    where
    p' = ParsecT $ \s cok cerr eok eerr ->
         unParser p s eok cerr eok eerr

Examining just the where-clause we see it depends on modifying the passed parser p to use the "empty ok" continuation in place of the "consumed ok" continuation. This is symmetric to what try did. Further, it ensures that the parser state is unaffected by whatever happens when this modified p' is run via its do-block.

like image 182
J. Abrahamson Avatar answered Nov 17 '22 02:11

J. Abrahamson