I need to report a failure message at a specific position in Parsec.
I tried setting the position before raising an unexpected error, but it didn't work:
    runParser (do pos0 <- getPosition
                  id <- many1 alphaNum
                  if (id == reverse id) then return id
                                        else setPosition pos0 >> unexpected id
                  eof)
              () "" "abccbb"
This gives back:
    Left (line 1, column 7):
    unexpected end of input
    expecting letter or digit
While the correct response would be:
    unexpected abccbb
    expecting letter or digit
That message can be produced (with the wrong position) by omitting setPosition pos0 >> from the code.
My workaround is to do the parsing, save the correct and the actual error position in the user state of parsec, and correct the error position, but I would like a better solution.
As AndrewC asked: this is part of giving more informative error messages to our users. For example, in some places we want special identifiers, but if that were encoded in the parser, Parsec would give an error message like "expected a g, got an r, position is in the middle of an identifier". The correct message would be "identifier expected in the special format, but got 'abccbb', position is before the identifier". If there is a better approach for producing error messages like this, it would be a correct answer to our question. But I'm also curious why Parsec behaves like that, and why I cannot raise a custom error message pointing to the position I want.
This is because the parser collects all errors that occurred at the furthest position in the input. When binding two parsers, any errors detected by those parsers are merged by mergeError:
    mergeError :: ParseError -> ParseError -> ParseError
    mergeError e1@(ParseError pos1 msgs1) e2@(ParseError pos2 msgs2)
        -- prefer meaningful errors
        | null msgs2 && not (null msgs1) = e1
        | null msgs1 && not (null msgs2) = e2
        | otherwise
        = case pos1 `compare` pos2 of
            -- select the longest match
            EQ -> ParseError pos1 (msgs1 ++ msgs2)
            GT -> e1
            LT -> e2
In your example, many1 reaches the end of the string and generates an error at column 7. This error does not result in failure, but it is remembered. When you set the column back to 1 and use unexpected, it creates an error at column 1. The bind operator applies mergeError to the two errors, and the one at column 7 wins.
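To see the merge rule in isolation, the following sketch calls mergeError directly on two hand-built errors. The messages and positions are stand-ins chosen to mirror the example (column 7 for the end of "abccbb", column 1 for the reset position), and the names errAtEnd, errAtStart and winningColumn are illustrative only.

```haskell
import Text.Parsec.Error (Message (..), ParseError, errorPos, mergeError, newErrorMessage)
import Text.Parsec.Pos (newPos, sourceColumn)

-- Stand-in for the error many1 alphaNum remembers at the end of "abccbb".
errAtEnd :: ParseError
errAtEnd = newErrorMessage (Expect "letter or digit") (newPos "" 1 7)

-- Stand-in for the error raised by `unexpected` after resetting to column 1.
errAtStart :: ParseError
errAtStart = newErrorMessage (UnExpect "abccbb") (newPos "" 1 1)

-- Column of whichever error survives the merge.
winningColumn :: Int
winningColumn = sourceColumn (errorPos (mergeError errAtEnd errAtStart))
```

Both errors carry messages, so the position comparison decides, and winningColumn evaluates to 7 regardless of the argument order.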
Using lookAhead, we can write a function isolate to run a parser p without appearing to consume any input or register any errors. The isolate parser returns a tuple containing the result of p and the parser state at the end of p, so that we can jump back to that state if we so desire:
    isolate :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, State s u)
    isolate p = try . lookAhead $ do
        x <- p
        s <- getParserState
        return (x, s)
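As a quick check of that behaviour, here is a minimal sketch that runs isolate and compares the position stored in the returned state with the position the parser reports afterwards. The names probe and isolateColumns are made up for this illustration.

```haskell
import Text.Parsec

-- isolate as defined above: run p, rewind, and hand back the result
-- together with the state p ended in.
isolate :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, State s u)
isolate p = try . lookAhead $ do
  x <- p
  s <- getParserState
  return (x, s)

-- Returns the parsed word, the column recorded in the saved state,
-- and the column the parser is at once isolate has returned.
isolateColumns :: Either ParseError (String, Int, Int)
isolateColumns = runParser probe () "" "abccbb"
  where
    probe = do
      (word, s) <- isolate (many1 alphaNum)
      here <- getPosition
      return (word, sourceColumn (statePos s), sourceColumn here)
```

Here isolateColumns is Right ("abccbb", 7, 1): the saved state sits at the end of the word, while the parser itself is still at column 1.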
With that, we can implement a palindrome parser:
    palindrome = ( do
        (id, s) <- isolate $ many1 alphaNum
        if (id == reverse id) then (setParserState s >> return id)
                              else unexpected $ show id
      ) <?> "palindrome"
This runs the many1 alphaNum parser in an isolated context that does not appear to have consumed any input. If the result is a palindrome, we set the parser state back to where it was at the end of the many1 alphaNum and return its result. Otherwise, we report an unexpected id error, which will be registered at the position where the many1 alphaNum started.
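The same pattern can be generalized to other "special format" identifiers. The sketch below is one possible way to do that, not part of Parsec: checked is a hypothetical helper that takes a label, a predicate, and a parser, and fails at the start of the token when the predicate rejects the result. parsePalindrome is likewise just a name chosen for the example.

```haskell
import Text.Parsec

-- isolate as defined above.
isolate :: Stream s m t => ParsecT s u m a -> ParsecT s u m (a, State s u)
isolate p = try . lookAhead $ do
  x <- p
  s <- getParserState
  return (x, s)

-- Hypothetical helper: run p in isolation and keep its result only if
-- `ok` accepts it; otherwise fail at the position where p started.
checked :: (Stream s m t, Show a)
        => String -> (a -> Bool) -> ParsecT s u m a -> ParsecT s u m a
checked name ok p = body <?> name
  where
    body = do
      (x, s) <- isolate p
      if ok x
        then setParserState s >> return x  -- jump to the end of p
        else unexpected (show x)           -- error stays at the start of p

-- The palindrome parser expressed through the helper.
palindrome :: Parsec String () String
palindrome = checked "palindrome" (\w -> w == reverse w) (many1 alphaNum)

parsePalindrome :: String -> Either ParseError String
parsePalindrome = runParser (palindrome <* eof) () ""
```

With this, parsePalindrome "abccba" yields Right "abccba", while parsePalindrome "abccbb" yields a Left whose error position is line 1, column 1.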
So now,
    main :: IO ()
    main = print $ runParser (palindrome <* eof) () "" "Bolton"
prints:
    Left (line 1, column 1):
    unexpected "Bolton"
    expecting palindrome