I am converting some working Haskell code that uses Parsec over to Attoparsec, in the hope of getting better performance. I have made the changes and everything compiles, but my parser does not work correctly.
I am parsing a file that consists of various record types, one per line. Each of my individual functions for parsing a record or comment works correctly, but when I try to write a function to parse a sequence of records, the parser always returns a partial result because it is expecting more input.
These are the two main variations that I've tried. Both have the same problem.
items :: Parser [Item]
items = sepBy (comment <|> recordType1 <|> recordType2) endOfLine
For this second variation, I changed the record/comment parsers to consume the end-of-line characters.
items :: Parser [Item]
items = manyTill (comment <|> recordType1 <|> recordType2) endOfInput
Is there anything wrong with my approach? Is there some other way to achieve what I am attempting?
If you write an attoparsec parser that consumes as much input as possible before failing, you must tell the partial result continuation when you've reached the end of your input.
I've run into this problem before, and my understanding is that it's caused by the way that `<|>` works in the definition of `sepBy`:
sepBy1 :: Alternative f => f a -> f s -> f [a]
sepBy1 p s = scan
where scan = liftA2 (:) p ((s *> scan) <|> pure [])
This will only move to `pure []` once `(s *> scan)` has failed, which won't happen just because you're at the end of the input.
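You can see the same behaviour with a toy parser (a hypothetical GHCi session; ``decimal `sepBy` char ','`` is just a stand-in for the record parsers in the question):

ghci> :set -XOverloadedStrings
ghci> import Data.Attoparsec.ByteString.Char8
ghci> parse (decimal `sepBy` char ',') "1,2,3"
Partial _

Even though the input string is exhausted, attoparsec has no way of knowing that more input isn't coming, so the result stays `Partial` rather than `Done`.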
My solution has been just to call `feed` with an empty `ByteString` on the `Result` returned by `parse`. This might be kind of a hack, but it also seems to be how attoparsec-iteratee deals with the issue:
f k (EOF Nothing) = finalChunk $ feed (k S.empty) S.empty
As far as I can tell, this is the only reason that attoparsec-iteratee works here and plain old `parse` doesn't.
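Concretely, here is a minimal sketch of this workaround (`runToEnd` is a hypothetical helper name; `items` would be the parser from the question):

import Data.Attoparsec.ByteString.Char8 (Parser, parse, feed, IResult(..))
import qualified Data.ByteString as S

-- Run a parser, then feed an empty chunk to signal end of input,
-- which lets the Partial continuation resolve to Done or Fail.
runToEnd :: Parser a -> S.ByteString -> Either String a
runToEnd p input =
  case feed (parse p input) S.empty of
    Done _ r     -> Right r
    Fail _ _ err -> Left err
    Partial _    -> Left "parser still expected input after end of input"

With this, `runToEnd items fileContents` should give you the full list of items (or a parse error) instead of a `Partial` result.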