I have a file with line endings "\r\r\n", and use the parser eol = string "\r\r\n" :: Parser String to handle them. To get a list of the lines between these separators, I would like to use sepBy along with a parser that returns any text that would not be captured by eol. Looking through the documentation, I did not see a combinator that negates a parser (an 'anything but the pattern "\r\r\n"' parser).
I have tried using sepBy (many anyToken) eol, but many anyToken appears to be greedy, not stopping for eol matches. I cannot use many (noneOf "\n\r"), because there are several places in my text with the single '\n' character.
Is there a combinator that can get me the inverse of string "\r\r\n"?
I'm afraid you're going about it backwards. Parsec parsers don't chop up the input, they build the output. The more you try to parse by thinking about what you don't want, the harder it'll be. You need to think bottom-up about what's permissible, not top-down about where to chop.
You should start with the smallest, most basic thing you do want. For example, don't think of an identifier as everything before a space; think of it as a letter followed by alphanumeric data. You can then combine that, separated by whitespace, with the other things you expect on a line.
    line = do
        i <- identifier
        whiteSpace
        string "="
        e <- expr
        return $ Line i e
Only when you've completed a parser that successfully parses what you want from a line and rejects invalid lines should you parse multiple lines:
    lines = sepBy line eol
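To make the bottom-up approach concrete, here is a minimal self-contained sketch of the same idea. Several pieces are assumptions for illustration only and are not from the answer above: the fields of Line, the definition of expr as a plain integer literal, the hand-written whiteSpace, and the name assignments (used instead of lines to avoid clashing with the Prelude function).

```haskell
-- Parsec exports its own Line (a source-position synonym); hide it so ours can use the name.
import Text.Parsec hiding (Line)
import Text.Parsec.String (Parser)

-- Hypothetical result type: an identifier bound to an integer.
data Line = Line String Integer deriving (Show, Eq)

-- The three-character line terminator from the question.
eol :: Parser String
eol = string "\r\r\n"

-- An identifier is a letter followed by alphanumeric characters.
identifier :: Parser String
identifier = (:) <$> letter <*> many alphaNum

-- Assumption: an expression is just a non-negative integer literal.
expr :: Parser Integer
expr = read <$> many1 digit

-- Horizontal whitespace only, so the '\r' characters of eol are never eaten here.
whiteSpace :: Parser ()
whiteSpace = skipMany (oneOf " \t")

-- One valid line: identifier, '=', expression.
line :: Parser Line
line = do
  i <- identifier
  whiteSpace
  _ <- char '='
  whiteSpace
  e <- expr
  return (Line i e)

-- Only now combine whole lines, separated by eol.
assignments :: Parser [Line]
assignments = sepBy line eol

main :: IO ()
main = print (parse assignments "" "x = 1\r\r\ny2 = 42")
-- prints: Right [Line "x" 1,Line "y2" 42]
```

Because every piece only accepts what it is meant to accept, the line parser never needs to know where the separator is; sepBy takes care of that.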
As a tentative answer, it looks like manyTill anyChar (try eol) does what I want. As part of my original question, though, I'm still interested in knowing whether there is a general way to negate a parser, or whether there's another recommended way of doing what I want.
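For reference, here is a small runnable sketch of the manyTill approach, next to one possible way to approximate negation using Parsec's notFollowedBy combinator. The helper names lineTill, notEol and linesSepBy are made up for this example, and this is only one way to do it, not necessarily the recommended one.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- The three-character line terminator from the question.
eol :: Parser String
eol = string "\r\r\n"

-- One line: every character up to the next eol, which is consumed.
-- try lets eol backtrack after a partial match such as a lone '\r'.
lineTill :: Parser String
lineTill = manyTill anyChar (try eol)

-- "Any character that does not start an eol".
-- notFollowedBy eol succeeds, consuming nothing, only when eol fails here.
notEol :: Parser Char
notEol = notFollowedBy eol >> anyChar

-- Lines separated by eol; a lone '\n' or '\r' stays inside a line.
linesSepBy :: Parser [String]
linesSepBy = sepBy (many notEol) eol

main :: IO ()
main = do
  -- Every line here is terminated by eol, as lineTill requires.
  print (parse (many lineTill) "" "ab\ncd\r\r\nef\r\r\n")
  -- prints: Right ["ab\ncd","ef"]
  -- Here eol only separates lines, so the last one has no terminator.
  print (parse linesSepBy "" "ab\ncd\r\r\nef")
  -- prints: Right ["ab\ncd","ef"]
```

Note the difference between the two: lineTill expects each line to end with eol, while linesSepBy treats eol purely as a separator.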