Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing separated lists with FParsec

Tags:

f#

fparsec

I am trying to parse something that may be a list of items, or which may be just one item. I want to put the results into a DU (Thing below).

The way I'm approaching this is as below, but it gives me a list of things even when there is only one thing in the list.

let test p str =
    match run p str with
    | Success(result, _, _)   -> printfn "Success: %A" result
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg

type Thing =
    | OneThing of int
    | LotsOfThings of Thing list

let str s = pstringCI s .>> spaces 

let one = str "one" |>> fun x -> OneThing 1
let two = str "two" |>> fun x -> OneThing 2
let three = str "three" |>> fun x -> OneThing 3

let oneThing = (one <|> two <|> three)
let lotsOfThings = sepBy1 oneThing (str "or") |>> LotsOfThings

let lotsFirst = (lotsOfThings <|> oneThing)
test lotsFirst "one or two" // Success: LotsOfThings [OneThing 1; OneThing 2]
test lotsFirst "one" // Success: LotsOfThings [OneThing 1]

What is the correct way to return OneThing when there is only one item in the list?

I can do that if I test the list before returning, like the below. But that doesn't really "feel" right.

let lotsOfThings = sepBy1 oneThing (str "or") |>> fun l -> if l.Length = 1 then l.[0] else l |> LotsOfThings

LinqPad of the above is here: http://share.linqpad.net/sd8tpj.linq

like image 689
Sean Kearon Avatar asked May 21 '18 19:05

Sean Kearon


1 Answers

If you don't like testing the list length after parsing, then you might try switching your <|> expression to test the single-item case first, and use notFollowedBy to ensure that the single-item case won't match a list:

let oneThing = (one <|> two <|> three)
let separator = str "or"
let lotsOfThings = sepBy1 oneThing separator |>> LotsOfThings

let oneThingOnly = oneThing .>> (notFollowedBy separator)
let lotsSecond = (attempt oneThingOnly) <|> lotsOfThings
test lotsSecond "one or two" // Success: LotsOfThings [OneThing 1; OneThing 2]
test lotsSecond "one" // Success: OneThing 1

Note the use of the attempt parser with oneThingOnly. That's because the documentation for the <|> parser states (emphasis in original):

The parser p1 <|> p2 first applies the parser p1. If p1 succeeds, the result of p1 is returned. If p1 fails with a non‐fatal error and without changing the parser state, the parser p2 is applied.

Without the attempt in there, "one or two" would first try to parse with oneThingOnly, which would consume the "one" and then fail on the "or", but the parser state would have been changed. The attempt combinator basically makes a "bookmark" of the parser state before trying a parser, and if that parser fails, it goes back to the "bookmark". So <|> would see an unchanged parser state after attempt oneThingOnly, and would then try lotsOfThings.

like image 64
rmunn Avatar answered Sep 23 '22 12:09

rmunn