
Parsing an integer with nom always results in Incomplete

Tags: rust, nom

Everything I try gives me Incomplete(Size(1)). My best guess right now is:

named!(my_u64(&str) -> u64,
    map_res!(recognize!(nom::digit), u64::from_str)
);

Test:

#[cfg(test)]
mod test {
    #[test]
    fn my_u64() {
        assert_eq!(Ok(("", 0)), super::my_u64("0"));
    }
}

Sometimes in my variations (e.g. adding complete!) I've been able to get it to parse if I add a character onto the end.

I'd like to get this parse working (ultimately, I hope it will let me build a parser for a u64 wrapper type), but more broadly I'd like to understand how to build a parser properly myself.

spease asked Jul 10 '18 03:07


2 Answers

As of nom 5 (5.1.1 at the time of writing), the approach to combining parsers changed from macro-based to function-based; this is discussed in more detail on the nom author's blog.

Another change came with it: streaming and complete parsers now live in separate modules, and you have to choose explicitly which kind of parsing you need. In most cases the module name makes the distinction clear (for example, character::complete versus character::streaming).

Old macros are preserved, but they work strictly in streaming mode. Types like CompleteStr or CompleteByteSlice are gone.

One thing that took me a while to grasp: combinators such as map_res return an impl Fn(I) -> IResult<I, O2, E>, which is why there is an extra pair of parentheses at the end, to call that closure.

With the new approach, the code you asked for could look like this (note the explicit character::complete in the imports):

use std::str;
use nom::{
    IResult,
    character::complete::{
        digit1
    },
    combinator::{
        recognize,
        map_res
    }
};

fn my_u64(input : &str) -> IResult<&str, u64> {
    map_res(recognize(digit1), str::parse)(input)
}

#[cfg(test)]
mod test {
    use super::*;
    #[test]
    fn test_my_u64() {
        let input = "42";
        let num = my_u64(input);
        assert_eq!(Ok(("", 42u64)), num);
    }
}
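
To make the "extra pair of parentheses" point concrete without depending on nom, here is a minimal std-only sketch of a combinator that returns a closure. The names ParseResult, digits, map_res_sketch and my_u64_sketch are made up for illustration and are not part of nom's API; the error type is simplified to a String.

```rust
// Hypothetical stand-in for nom's IResult: Ok((remaining input, output)).
type ParseResult<'a, O> = Result<(&'a str, O), String>;

// Takes one or more leading ASCII digits; fails if there are none.
fn digits(input: &str) -> ParseResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err(format!("expected a digit at {:?}", input))
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

// Builds a new parser from `parser` and a fallible conversion `convert`.
// Nothing is parsed here; the returned closure does the work when called.
fn map_res_sketch<'a, O1, O2, E>(
    parser: impl Fn(&'a str) -> ParseResult<'a, O1>,
    convert: impl Fn(O1) -> Result<O2, E>,
) -> impl Fn(&'a str) -> ParseResult<'a, O2>
where
    E: std::fmt::Debug,
{
    move |input| {
        let (rest, out) = parser(input)?;
        convert(out)
            .map(|value| (rest, value))
            .map_err(|e| format!("conversion failed: {:?}", e))
    }
}

fn my_u64_sketch(input: &str) -> ParseResult<'_, u64> {
    // The first pair of parentheses builds the parser,
    // the second pair runs it on the input.
    map_res_sketch(digits, |s: &str| s.parse::<u64>())(input)
}
```

Calling my_u64_sketch("42abc") consumes the digits and leaves "abc" as the remaining input, in the same shape as the nom version above.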

GrayCat answered Nov 03 '22 04:11


Nom 4 made the handling of partial data much stricter than in previous versions, to better support streaming parsers and custom input types.

Effectively, if the parser runs out of input and it can't tell that it's meant to have run out of input, it'll always return Err::Incomplete. This may also contain information on exactly how much more input the parser was expecting (in your case, at least 1 more byte).

It determines whether there's potentially any more input using the AtEof trait. This always returns false for &str and &[u8], as they don't provide any information about whether they're complete or not!
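
The behaviour described above can be modelled in a few lines of plain Rust. This is only a simplified sketch, not nom's actual code: ParseError, digits_u64 and the at_eof flag are hypothetical names, with at_eof standing in for what the AtEof trait reports.

```rust
// Hypothetical, simplified model; not nom's real types or API.
#[derive(Debug, PartialEq)]
enum ParseError {
    // Input ran out before the parser could be sure the number had ended.
    Incomplete { needed: usize },
    // The input does not start with a digit.
    NoDigits,
    // The digits do not fit in a u64.
    Overflow,
}

// Parses leading ASCII digits as a u64. `at_eof == true` means
// "no more input will ever arrive after this slice".
fn digits_u64(input: &str, at_eof: bool) -> Result<(&str, u64), ParseError> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == input.len() && !at_eof {
        // Everything seen so far is a digit (or the input is empty), so the
        // number might continue in the next chunk: ask for at least 1 more byte.
        return Err(ParseError::Incomplete { needed: 1 });
    }
    if end == 0 {
        return Err(ParseError::NoDigits);
    }
    let value: u64 = input[..end].parse().map_err(|_| ParseError::Overflow)?;
    Ok((&input[end..], value))
}
```

Under this model, digits_u64("0", false) reports Incomplete, while digits_u64("0 ", false) succeeds because the trailing space proves the number has ended. That matches the observation in the question that appending a character makes the parse succeed. With at_eof set to true, "0" parses on its own.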

The trick is to change the input type of your parsers to make it explicit that the input will always be complete - Nom provides the CompleteStr and CompleteByteSlice wrappers for this purpose, or you can implement your own input type.

So in order for your parser to work as expected, it'd need to look something like this:

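use std::str::FromStr;
use nom::types::CompleteStr;
named!(my_u64(CompleteStr) -> u64,
    // CompleteStr wraps a &str in its public field `.0`; from_str needs the bare &str.
    map_res!(recognize!(nom::digit), |s: CompleteStr| u64::from_str(s.0))
);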

And your test would look something like this:

#[cfg(test)]
mod test {
    // Bring the wrapper type into scope for the test module.
    use nom::types::CompleteStr;

    #[test]
    fn my_u64() {
        assert_eq!(Ok((CompleteStr(""), 0)), super::my_u64(CompleteStr("0")));
    }
}

See the announcement post for Nom 4 for more details.

Joe Clay answered Nov 03 '22 05:11