    import Text.ParserCombinators.Parsec

    delimiter :: Parser ()
    delimiter = do char '|'
                   return ()
            <?> "delimiter"

    eol :: Parser ()
    eol = do oneOf "\n\r"
             return ()
        <?> "end of line"

    item :: Parser String
    item = do entry <- manyTill anyChar (try eol <|> try delimiter <|> eof)
              return entry

    items :: Parser [String]
    items = do result <- many item
               return result

When I run `parseTest items "a|b|c"` with the code above, I get the following error:

    *** Exception: Text.ParserCombinators.Parsec.Prim.many:
    combinator 'many' is applied to a parser that accepts an empty string.

I believe it has something to do with `eof` and `many item`. If I remove `eof`, I can get it to work as long as the line does not end in `eof`, which makes it kind of useless.

I realize I could just use `sepBy`, but what I am interested in is why this code does not work and how to make it work.

A parser like `many` indeed cannot be applied to parsers that accept the empty string, because this makes the grammar ambiguous: how often do you parse the empty string? Choosing different numbers can lead to different parse results.

You are right to assume that `many item` is the problematic combination. An `item` is defined in terms of `manyTill`. (Excursion: by the way, you can simplify `item` to

    item :: Parser String
    item = manyTill anyChar (eol <|> delimiter <|> eof)

No need for the `do` or the `return`, and no need for `try`, because each of the three parsers expects a different first token.) The `manyTill` parser thus parses an arbitrary number of characters, followed by either an `eol`, a `delimiter`, or an `eof`. Now, `eol` and `delimiter` actually consume at least one character when they succeed, but `eof` doesn't. The parser `eof` succeeds at the end of the input, but it can be applied multiple times. For example,

    ghci> parseTest (do { eof; eof }) ""
    ()

Because `eof` doesn't consume any input, `item` can succeed on the empty string (at the end of your input), and that is what causes the ambiguity.

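To make this concrete, here is a small sketch that uses the simplified `item` from above. The names `emptyResult` and `repeatedResult` are only illustrative and do not come from the question; they show that `item` succeeds on empty input and can keep succeeding once the end of the input is reached.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = (char '|' >> return ()) <?> "delimiter"

eol :: Parser ()
eol = (oneOf "\n\r" >> return ()) <?> "end of line"

-- The simplified item parser: eof is still an accepted terminator.
item :: Parser String
item = manyTill anyChar (eol <|> delimiter <|> eof)

-- Illustrative name: item applied to empty input succeeds with an empty
-- string, because eof matches immediately and consumes nothing.
emptyResult :: Either ParseError String
emptyResult = parse item "" ""

-- Illustrative name: after the first item has read "a", every further
-- item succeeds at the end of input without consuming anything.
repeatedResult :: Either ParseError [String]
repeatedResult = parse (count 3 item) "" "a"
```

In GHCi, `parseTest (count 3 item) "a"` prints `["a","",""]`. With `many` instead of `count 3` there would be no point at which to stop, which is why Parsec rejects the combination.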
To fix this, you can indeed rewrite your grammar and move to something like `sepBy`, or you can try to distinguish normal `item`s (where `eof` isn't allowed as an end marker) from the final `item` (where `eof` is allowed).
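
One possible way to write the second option is sketched below. The names `normalItem` and `lastItem` are hypothetical (they are not part of the question's code), and this is only one of several ways to split the grammar, not necessarily the best one.

```haskell
import Text.ParserCombinators.Parsec

delimiter :: Parser ()
delimiter = (char '|' >> return ()) <?> "delimiter"

eol :: Parser ()
eol = (oneOf "\n\r" >> return ()) <?> "end of line"

-- Hypothetical helper: an item that must be closed by a line break or a
-- delimiter. On success it always consumes at least that terminator, so it
-- is safe to use under 'many'.
normalItem :: Parser String
normalItem = manyTill anyChar (eol <|> delimiter)

-- Hypothetical helper: the final item, which runs up to the end of input.
-- It may be empty, but it is applied exactly once, never under 'many'.
lastItem :: Parser String
lastItem = manyTill anyChar eof

items :: Parser [String]
items = do
  -- 'try' is needed here: on the final item, normalItem consumes characters
  -- before failing at the end of input, and that input must be given back.
  xs <- many (try normalItem)
  x  <- lastItem
  return (xs ++ [x])
```

With these definitions, `parseTest items "a|b|c"` prints `["a","b","c"]`. Note that under this sketch an input ending in a delimiter, such as `"a|b|"`, yields a trailing empty item, which may or may not be what you want.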