I have this test program:
open FParsec

let test p str =
    match run p str with
    | Success(result, _, _) -> printfn "Success: %A" result
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg

let str s = pstring s
let sepPart = skipNewline >>. pstring "-"
let part = manyChars (notFollowedBy sepPart >>. anyChar)

[<EntryPoint>]
let main argv =
    let s = "AA 12345\nBB 6789\n-----\nCC 9876\nDD 54321\n-----"
    test part s
    test (many part) s
    0 // return an integer exit code
The line `test part s` works as expected, but the next line, `test (many part) s`, fails and I don't understand what I am doing wrong.
EDIT:
To clarify, what I am trying to do is have `test (many part) s` return ["AA 12345\nBB 6789"; "CC 9876\nDD 54321"]. In words, what I have is an input string composed of "parts" or "chunks" separated by lines consisting entirely of dashes. For output I want a list where each element is one of the parts and the lines with dashes are simply discarded.
When you execute your example, FParsec throws an exception with the following message:
Additional information: (Ln: 2, Col: 8): The combinator 'many' was applied to a parser that succeeds without consuming input and without changing the parser state in any other way. (If no exception had been raised, the combinator likely would have entered an infinite loop.)
The problem is that your `part` parser always succeeds, even if it can only parse an empty string. You can solve that problem by replacing `manyChars` in the definition of `part` with `many1Chars`.
If you search for e.g. "applied to a parser that succeeds without consuming input" you'll find several discussions of similar errors on the internet, including one in FParsec's user guide: http://www.quanttec.com/fparsec/users-guide/parsing-sequences.html#the-many-parser
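The empty-success behaviour can be checked directly. The following is a minimal sketch, not part of the original program: `part1`, `succeeds`, and `atSeparator` are hypothetical names introduced here, and the input is deliberately positioned at a separator so that no character can be consumed.

```fsharp
open FParsec

// The question's parsers, unchanged.
let sepPart = skipNewline >>. pstring "-"
let part = manyChars (notFollowedBy sepPart >>. anyChar)

// Hypothetical variant with many1Chars in place of manyChars.
let part1 = many1Chars (notFollowedBy sepPart >>. anyChar)

// Hypothetical helper: true if the parser succeeds on the input.
let succeeds p input =
    match run p input with
    | Success _ -> true
    | Failure _ -> false

// Input that starts at a separator, so notFollowedBy sepPart fails immediately.
let atSeparator = "\n-----"

printfn "manyChars succeeds:  %b" (succeeds part atSeparator)
printfn "many1Chars succeeds: %b" (succeeds part1 atSeparator)
```

Here `part` should succeed with an empty string while `part1` should fail. A parser that succeeds without consuming anything is what makes `many` raise the exception; a parser that fails without consuming anything simply makes `many` stop.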
Update: Here's a straightforward parser definition that works:
let sepPart = skipNewline
              >>? (skipMany1SatisfyL ((=) '-') "'-'"
                   >>. (skipNewline <|> eof))
let part = many1CharsTill anyChar sepPart
let parser = many part
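As a usage sketch, the definitions above can be run against the input string from the question. The name `input` and the printing code are assumptions added for illustration; the parser definitions are the ones given above.

```fsharp
open FParsec

let sepPart =
    skipNewline
    >>? (skipMany1SatisfyL ((=) '-') "'-'"
         >>. (skipNewline <|> eof))
let part = many1CharsTill anyChar sepPart
let parser = many part

// Input string taken from the question.
let input = "AA 12345\nBB 6789\n-----\nCC 9876\nDD 54321\n-----"

match run parser input with
| Success (chunks, _, _) -> printfn "Success: %A" chunks
| Failure (errorMsg, _, _) -> printfn "Failure: %s" errorMsg
```

On this input the parser should yield the two chunks asked for in the question, with the dash lines discarded.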
Note that I'm using `>>?` in the definition of `sepPart` to allow this parser to backtrack to the beginning if a newline is not followed by a dash. Alternatively, you could also use `attempt (skipNewline >>. ...)`, which would also backtrack for errors after the initial dash. The documentation for `many[Chars]Till p endp` states an equivalence with `many (notFollowedBy endp >>. p) .>> endp` that is not strictly true, because `many[Chars]Till` does not backtrack like `notFollowedBy`. I will clarify the documentation.
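The difference between `>>?` and `>>.` shows up as soon as an alternative has to be tried after the newline has been consumed. The sketch below uses hypothetical parsers (`withBacktrack`, `withoutBacktrack`) and a hypothetical helper (`tryParse`) that are not part of the answer's parser; they only illustrate the backtracking behaviour on the input "\nB".

```fsharp
open FParsec

// Hypothetical parsers: a newline followed by "-" or, failing that, by "B".
let withBacktrack : Parser<string, unit> =
    (skipNewline >>? pstring "-") <|> (skipNewline >>. pstring "B")

let withoutBacktrack : Parser<string, unit> =
    (skipNewline >>. pstring "-") <|> (skipNewline >>. pstring "B")

// Hypothetical helper: Some result on success, None on failure.
let tryParse p input =
    match run p input with
    | Success (result, _, _) -> Some result
    | Failure _ -> None

printfn ">>? : %A" (tryParse withBacktrack "\nB")
printfn ">>. : %A" (tryParse withoutBacktrack "\nB")
```

With `>>?` the first branch backtracks to before the newline, so `<|>` can try the second branch. With `>>.` the newline has already been consumed when `pstring "-"` fails, so `<|>` does not try the alternative and the parse should fail.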
It's better for performance if you avoid backtracking with `many[Chars]Till` or `notFollowedBy` where possible. For example, you could also parse your chunks of lines as follows:
let id = manyMinMaxSatisfyL 2 2 isUpper "id (two capital letters)"
let line = id .>>. (pchar ' ' >>. restOfLine true)
let separator = many1SatisfyL ((=) '-') "dash separator"
                >>. (skipNewline <|> eof)
let chunk = many1 line
let parser = sepEndBy1 chunk separator
Note that this implementation doesn't require the last chunk to be ended by a separator. If you want that, you could instead use:
let chunk = many line .>> separator
let parser = many chunk
If you want to allow empty chunks with the `sepEndBy` definition, you could use:
let chunk = many1 line <|> (notFollowedByEof >>% [])
let parser = sepEndBy1 chunk separator
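For reference, here is a sketch of running the first line-based variant (`many1 line` with `sepEndBy1`) on the input from the question. The name `input` and the printing code are assumptions added for illustration. Note that this variant returns each line as an (id, rest of line) pair rather than the raw chunk text.

```fsharp
open FParsec

let id = manyMinMaxSatisfyL 2 2 isUpper "id (two capital letters)"
let line = id .>>. (pchar ' ' >>. restOfLine true)
let separator =
    many1SatisfyL ((=) '-') "dash separator"
    >>. (skipNewline <|> eof)
let chunk = many1 line
let parser = sepEndBy1 chunk separator

// Input string taken from the question.
let input = "AA 12345\nBB 6789\n-----\nCC 9876\nDD 54321\n-----"

match run parser input with
| Success (chunks, _, _) -> printfn "Success: %A" chunks
| Failure (errorMsg, _, _) -> printfn "Failure: %s" errorMsg
```

Each chunk should come back as a list of pairs such as ("AA", "12345"), so if you need the original text of a chunk you would have to reassemble it from those pairs.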