
FParsec: how to omit `many` parser failures from error messages

Consider this parser that converts digit strings to ints:

let toInt (s:string) = 
    match Int32.TryParse(s) with
    | (true, n) -> preturn n
    | _         -> fail "Number must be below 2147483648"

let naturalNum = many1Chars digit >>= toInt <?> "natural number"

When I run it on non-numeric strings like "abc" it shows the correct error message:

Error in Ln: 1 Col: 1
abc
^
Expecting: natural number

But when I give it a numeric string exceeding the int range it gives the following counter-productive message:

Error in Ln: 1 Col: 17
9999999999999999
                ^
Note: The error occurred at the end of the input stream.
Expecting: decimal digit
Other error messages:
  Number must be below 2147483648

The primary message "Expecting: decimal digit" makes no sense, because we have too many digits already.

Is there a way to get rid of it and only show "Number must be below 2147483648"?


Full example:

open System
open FParsec

[<EntryPoint>]
let main argv =
    let toInt (s:string) = 
        match Int32.TryParse(s) with
        | (true, n) -> preturn n
        | _         -> fail "Number must be below 2147483648"

    let naturalNum = many1Chars digit >>= toInt <?> "natural number"

    match run naturalNum "9999999999999999" with
    | Failure (msg, _, _) -> printfn "%s" msg
    | Success (a, _, _)   -> printfn "%A" a

    0
Good Night Nerd Pride asked May 24 '19 22:05


2 Answers

I think the root of the problem here is that this is a non-syntactic concern, which doesn't fit well with the model of a lookahead parser. If you could express "too many digits" in a syntactic way, it would make sense for the parser too, but as it is it will instead go back and try to consume more input. I think the cleanest solution therefore would be to do the int conversion in a separate pass after the parsing.
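The separate-pass idea can be sketched roughly as follows. This is a minimal, untested sketch: `digits` and `parseNaturalNum` are hypothetical names introduced here, and it assumes the conversion can happen after `run` returns rather than inside the parser.

```fsharp
open System
open FParsec

// Hypothetical parser: accepts the digit string only, no conversion yet.
let digits : Parser<string, unit> = many1Chars digit <?> "natural number"

// Hypothetical helper: parse first, then convert in a second step.
// Result.Ok / Result.Error are qualified because FParsec also defines Ok and Error.
let parseNaturalNum (input: string) : Result<int, string> =
    match run digits input with
    | Failure (msg, _, _) -> Result.Error msg
    | Success (s, _, _) ->
        match Int32.TryParse(s) with
        | (true, n) -> Result.Ok n
        | _         -> Result.Error "Number must be below 2147483648"
```

Because the range check runs outside the parser, its message is never merged with "Expecting: decimal digit". The trade-off is that the range error no longer carries FParsec's position information.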

That said, FParsec seems flexible enough that you should still be able to hack it together. This does what you ask I think:

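let naturalNum: Parser<int, _> =
    fun stream ->
        // Run the digit parser directly so we control which errors are reported.
        let reply = many1Chars digit stream
        match reply.Status with
        | Ok ->
            match Int32.TryParse(reply.Result) with
            | (true, n) -> Reply(n)
            // Report only the range error, dropping "Expecting: decimal digit".
            | _         -> Reply(Error, messageError "Number must be below 2147483648")
        | _ ->
            // No digits were parsed: pass the original error through.
            Reply(Error, reply.Error)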

Or if you want the "natural number" error message instead of "decimal digit", replace the last line with:

Reply(Error, messageError "Expecting: natural number")
glennsl answered Sep 27 '22 18:09


The effect you see is that the first parser in your sequence succeeds but also generates an error message (because it could have consumed even more digits). Your second parser consumes no further input, so if it fails, FParsec merges the error messages of the two sequenced parsers (see the FParsec manual on merging of error messages).

A solution would be to create a small wrapper for a parser that removes the error messages from a reply in the Ok case. Then, when it is sequenced with a second parser, only the message of the second parser remains.

Untested code off the top of my head:

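// Drops any error messages attached to a successful reply, so they are
// not merged with the errors of the parser that follows.
// The type annotation is needed so that res.Status can be resolved.
let purify (p: Parser<'a, 'u>) : Parser<'a, 'u> =
    fun stream ->
        let res = p stream
        match res.Status with
        | Ok -> Reply(res.Result)
        | _ -> res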


let naturalNum = purify (many1Chars digit) >>= toInt <?> "natural number"
mschmidt answered Sep 27 '22 17:09