Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Parsing int or float with FParsec

Tags:

f#

fparsec

I'm trying to parse a file, using FParsec, which consists of either float or int values. I'm facing two problems that I can't find a good solution for.

1

Both pint32 and pfloat will successfully parse the same string, but give different answers, e.g pint32 will return 3 when parsing the string "3.0" and pfloat will return 3.0 when parsing the same string. Is it possible to try parsing a floating point value using pint32 and have it fail if the string is "3.0"?

In other words, is there a way to make the following code work:

let parseFloatOrInt lines =
    let rec loop intvalues floatvalues lines =
        match lines with
        | [] -> floatvalues, intvalues
        | line::rest ->
            match run floatWs line with
            | Success (r, _, _) -> loop intvalues (r::floatvalues) rest
            | Failure _ -> 
                match run intWs line with
                | Success (r, _, _) -> loop (r::intvalues) floatvalues rest
                | Failure _ -> loop intvalues floatvalues rest

    loop [] [] lines

This piece of code will correctly place all floating point values in the floatvalues list, but because pfloat returns "3.0" when parsing the string "3", all integer values will also be placed in the floatvalues list.

2

The above code example seems a bit clumsy to me, so I'm guessing there must be a better way to do it. I considered combining them using choice, however both parsers must return the same type for that to work. I guess I could make a discriminated union with one option for float and one for int and convert the output from pint32 and pfloat using the |>> operator. However, I'm wondering if there is a better solution?

like image 962
larsjr Avatar asked Feb 03 '16 10:02

larsjr


2 Answers

You're on the right path thinking about defining domain data and separating definition of parsers and their usage on source data. This seems to be a good approach, because as your real-life project grows further, you would probably need more data types.

Here's how I would write it:

/// The resulting type, or DSL
type MyData =
    | IntValue of int
    | FloatValue of float
    | Error  // special case for all parse failures

// Then, let's define individual parsers:
let pMyInt =
    pint32
    |>> IntValue

// this is an alternative version of float parser.
// it ensures that the value has non-zero fractional part.
// caveat: the naive approach would treat values like 42.0 as integer
let pMyFloat =
    pfloat
    >>= (fun x -> if x % 1 = 0 then fail "Not a float" else preturn (FloatValue x))
let pError =
    // this parser must consume some input,
    // otherwise combined with `many` it would hang in a dead loop
    skipAnyChar
    >>. preturn Error

 // Now, the combined parser:
let pCombined =
    [ pMyFloat; pMyInt; pError ]    // note, future parsers will be added here;
                                    // mind the order as float supersedes the int,
                                    // and Error must be the last
    |> List.map (fun p -> p .>> ws) // I'm too lazy to add whitespase skipping
                                    // into each individual parser
    |> List.map attempt             // each parser is optional
    |> choice                       // on each iteration, one of the parsers must succeed
    |> many                         // a loop

Note, the code above is capable working with any sources: strings, streams, or whatever. Your real app may need to work with files, but unit testing can be simplified by using just string list.

// Now, applying the parser somewhere in the code:
let maybeParseResult =
    match run pCombined myStringData with
    | Success(result, _, _) -> Some result
    | Failure(_, _, _)      -> None // or anything that indicates general parse failure

UPD. I have edited the code according to comments. pMyFloat was updated to ensure that the parsed value has non-zero fractional part.

like image 178
bytebuster Avatar answered Oct 19 '22 08:10

bytebuster


FParsec has the numberLiteral parser that can be used to solve the problem.

As a start you can use the example available at the link above:

open FParsec
open FParsec.Primitives
open FParsec.CharParsers

type Number = Int   of int64
            | Float of float

                    // -?[0-9]+(\.[0-9]*)?([eE][+-]?[0-9]+)?
let numberFormat =     NumberLiteralOptions.AllowMinusSign
                   ||| NumberLiteralOptions.AllowFraction
                   ||| NumberLiteralOptions.AllowExponent

let pnumber : Parser<Number, unit> =
    numberLiteral numberFormat "number"
    |>> fun nl ->
            if nl.IsInteger then Int (int64 nl.String)
            else Float (float nl.String)```
like image 45
Mark Shevchenko Avatar answered Oct 19 '22 06:10

Mark Shevchenko