Parser identifiers and free format text. Can this be done with FParsec?

Tags: f#, fparsec

As a follow-on to: How do I test for exactly 2 characters with fparsec?

I need to parse a string that consists of a sequence of identifiers, each followed by freeform text. I can easily construct a parser for the identifiers, which have the form: a newline, then exactly two uppercase characters, then a space. The freeform text associated with an identifier is everything after that identifier, up to but not including the next identifier.

So for example:

AB Now is the
time for all good
men.
CD Four score and seven years ago EF our.

contains two identifiers, AB and CD, and two pieces of freeform text:

Now is the\ntime for all good\nmen.
Four score and seven years ago EF our.

(EF is not an identifier here because it does not start a line.)

My problem is I don't know how to construct a parser that would match the freeform text but not match the identifiers. Is this a case where I need to do backtracking?

Can this be done and if so how?

JonnyBoats asked May 14 '13 03:05


2 Answers

Tarmil posted the straightforward solution.

Here's another variant, which doesn't need a newline at the beginning and which checks for a following identifier only at the end of lines:

let id = manyMinMaxSatisfyL 2 2 isUpper "ID" .>> pchar ' '

let text = 
    stringsSepBy (restOfLine true) 
                 ((notFollowedBy ((id >>% ()) <|> skipNewline <|> eof)) >>% "\n")

let parser = many (id .>>. text)
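To see how this variant behaves on the question's sample, here is a minimal sketch that wraps the parsers above in a script. The `#r "nuget: FParsec"` line assumes the script is run with `dotnet fsi`, the user-state type is pinned to `unit` as an assumption, and `parsePairs` and `sample` are hypothetical names introduced only for illustration.

```fsharp
// Assumption: run as a script with `dotnet fsi`, which resolves the NuGet reference.
#r "nuget: FParsec"
open FParsec

// Parsers from the answer, with the user-state type pinned to unit.
let id : Parser<string, unit> =
    manyMinMaxSatisfyL 2 2 isUpper "ID" .>> pchar ' '

let text : Parser<string, unit> =
    stringsSepBy (restOfLine true)
                 ((notFollowedBy ((id >>% ()) <|> skipNewline <|> eof)) >>% "\n")

let parser = many (id .>>. text)

// Hypothetical helper: returns the (identifier, text) pairs,
// or raises an exception carrying FParsec's error message.
let parsePairs (input: string) : (string * string) list =
    match run parser input with
    | Success (pairs, _, _) -> pairs
    | Failure (message, _, _) -> failwith message

// Sample input from the question (no leading newline is needed for this variant).
let sample =
    "AB Now is the\ntime for all good\nmen.\nCD Four score and seven years ago EF our.\n"

for (key, body) in parsePairs sample do
    printfn "%s: %s" key body
```

In this sketch the lines of each text are rejoined with "\n" by stringsSepBy, and the "EF" in the last line stays part of the text because the separator is only tried at the start of a line.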

If you wanted to optimize the second parser used with the stringsSepBy combinator, you could replace it with the following version:

let notFollowedByIdOrEmptyLineOrEof : Parser<string,_> =
    fun stream ->
        let cs = stream.Peek2()
        let c0, c1 = cs.Char0, cs.Char1
        if c0 = '\r' || c0 = '\n' || c0 = EOS
           || (isUpper c0 && isUpper c1 && stream.Peek(2) = ' ')
        then Reply(Error, NoErrorMessages)
        else Reply("\n")

let text2 = stringsSepBy (restOfLine true) 
                         notFollowedByIdOrEmptyLineOrEof
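One way to gain some confidence that the hand-written separator matches the combinator-based one is to run both text parsers over a few inputs and compare the results. The following is only a sketch under assumptions: it is run as a script with `dotnet fsi`, the user-state type is pinned to `unit`, and `parseWith`, `samples`, and `disagreements` are hypothetical names that do not appear in the answer.

```fsharp
// Assumption: run as a script with `dotnet fsi`, which resolves the NuGet reference.
#r "nuget: FParsec"
open FParsec

let id : Parser<string, unit> =
    manyMinMaxSatisfyL 2 2 isUpper "ID" .>> pchar ' '

// Combinator-based version from the first variant.
let text : Parser<string, unit> =
    stringsSepBy (restOfLine true)
                 ((notFollowedBy ((id >>% ()) <|> skipNewline <|> eof)) >>% "\n")

// Hand-written separator from the optimized variant.
let notFollowedByIdOrEmptyLineOrEof : Parser<string, unit> =
    fun stream ->
        let cs = stream.Peek2()
        let c0, c1 = cs.Char0, cs.Char1
        if c0 = '\r' || c0 = '\n' || c0 = EOS
           || (isUpper c0 && isUpper c1 && stream.Peek(2) = ' ')
        then Reply(Error, NoErrorMessages)
        else Reply("\n")

let text2 : Parser<string, unit> =
    stringsSepBy (restOfLine true) notFollowedByIdOrEmptyLineOrEof

// Hypothetical helper: Some result on success, None on failure.
let parseWith (p: Parser<string, unit>) (input: string) : string option =
    match run p input with
    | Success (result, _, _) -> Some result
    | Failure _ -> None

// Hypothetical inputs: a following identifier, a blank line, end of input,
// and a continuation line that starts with uppercase letters but is not an identifier.
let samples =
    [ "Now is the\ntime for all good\nmen.\nCD Four score"
      "line one\n\nline two"
      "single line"
      "ends with newline\n"
      "first\nABC continues\nAB next"
      "" ]

let disagreements =
    samples |> List.filter (fun s -> parseWith text s <> parseWith text2 s)

printfn "inputs where the two parsers disagree: %A" disagreements
```

A comparison like this covers only the listed inputs, so it is a sanity check rather than a proof that the two separators are equivalent.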
Stephan Tolksdorf answered Nov 17 '22 19:11


I think notFollowedBy is what you're looking for. This should do the trick:

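// adapted from the other question: a newline, exactly two uppercase letters, then a space
let identifier = skipNewline >>. manyMinMaxSatisfy 2 2 CharParsers.isUpper .>> pchar ' '

// take characters one at a time for as long as no identifier starts at the current position
let freeform = manyChars (notFollowedBy identifier >>. anyChar)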
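To show how the two parsers might be combined into (identifier, text) pairs, here is a minimal sketch. It assumes the script is run with `dotnet fsi`, pins the user-state type to `unit`, and has the identifier parser also require and consume the single space after the two letters, as the question describes. `entries`, `parseEntries`, and `sample` are hypothetical names introduced for illustration.

```fsharp
// Assumption: run as a script with `dotnet fsi`, which resolves the NuGet reference.
#r "nuget: FParsec"
open FParsec

// A newline, exactly two uppercase letters, then a space (the space is consumed).
let identifier : Parser<string, unit> =
    skipNewline >>. manyMinMaxSatisfy 2 2 isUpper .>> pchar ' '

// Any characters, stopping just before the next identifier.
// notFollowedBy never consumes input, even when identifier fails partway through.
let freeform : Parser<string, unit> =
    manyChars (notFollowedBy identifier >>. anyChar)

// Hypothetical combination: zero or more identifier/text pairs.
let entries = many (identifier .>>. freeform)

// Hypothetical helper: returns the pairs or raises with FParsec's error message.
let parseEntries (input: string) : (string * string) list =
    match run entries input with
    | Success (pairs, _, _) -> pairs
    | Failure (message, _, _) -> failwith message

// This variant needs a newline before every identifier, including the first one.
let sample =
    "\nAB Now is the\ntime for all good\nmen.\nCD Four score and seven years ago EF our."

for (key, body) in parseEntries sample do
    printfn "%s: %s" key body
```

Because notFollowedBy is tried before every single character, this approach is simple but does more lookahead work than a version that checks only at line starts.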
Tarmil answered Nov 17 '22 20:11