Optional field with strict format

Tags:

rust

nom

I am trying to build a nom parser for URLs whose ID is a UUID:

rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912

I created the following:

#[macro_use]
extern crate nom;
extern crate uuid;

use std::str::FromStr;
use uuid::Uuid;

named!(room_uuid<&str, Option<Uuid>>,
    do_parse!(
        tag_s!("rooms") >>
        id: opt!(complete!(preceded!(
            tag_s!("/"),
            map_res!(take_s!(36), FromStr::from_str)
        ))) >>

        (id)
    )
);

It handles almost all cases well:

assert_eq!(room_uuid("rooms"), Done("", None));
assert_eq!(room_uuid("rooms/"), Done("/", None));
assert_eq!(room_uuid("rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"), Done("", Some(Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap())));

Except when the ID is not a valid UUID:

assert!(room_uuid("rooms/123").is_err()); // this assertion fails
// room_uuid("rooms/123").to_result() => Ok(None)

As far as I understand, this happens because opt! converts an inner error into None.

I would like the ID to be an optional section, but if it is present it must be a valid UUID.
Unfortunately, I don't understand how to combine the two: optionality and strict format.
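The desired behavior can be stated without nom. The following is a minimal std-only sketch, not a nom solution: the names `RoomError`, `looks_like_uuid` and `room_id` are hypothetical, the UUID check only tests the hyphenated 8-4-4-4-12 hex shape, and it assumes that "rooms/" with nothing after the slash should count as "no ID".

```rust
/// Errors reported by this sketch (hypothetical type, not part of nom).
#[derive(Debug, PartialEq)]
enum RoomError {
    /// The input does not start with "rooms".
    MissingPrefix,
    /// Something follows "rooms" that is not "/" plus a UUID-shaped id.
    InvalidId,
}

/// Returns true if `s` has the hyphenated UUID shape: 8-4-4-4-12 hex digits.
fn looks_like_uuid(s: &str) -> bool {
    s.len() == 36
        && s.bytes().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => b == b'-',
            _ => b.is_ascii_hexdigit(),
        })
}

/// Accepts "rooms", "rooms/" and "rooms/<uuid>".
/// The id is optional, but if anything is present it must be valid.
fn room_id(input: &str) -> Result<Option<&str>, RoomError> {
    let rest = input
        .strip_prefix("rooms")
        .ok_or(RoomError::MissingPrefix)?;
    match rest {
        // Nothing (or only a slash) after the prefix: the id is absent.
        "" | "/" => Ok(None),
        // Anything else must be a slash followed by a well-formed id.
        _ => {
            let id = rest.strip_prefix('/').ok_or(RoomError::InvalidId)?;
            if looks_like_uuid(id) {
                Ok(Some(id))
            } else {
                Err(RoomError::InvalidId)
            }
        }
    }
}
```

The point of the sketch is that "absent" and "present but malformed" are decided in separate steps, so a malformed ID is never silently turned into `None`.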


asked Feb 13 '18 20:02

2 Answers

I've only started working with nom in the last couple of weeks, but I found one way of solving this. It isn't a single macro invocation, but it gives the correct behavior with one modification: I swallow the / rather than leave it dangling when a UUID is not given.

#[macro_use]
extern crate nom;
extern crate uuid;

use std::str::FromStr;
use nom::IResult;
use uuid::Uuid;

fn room_uuid(input: &str) -> IResult<&str, Option<Uuid>> {
    // Check that it starts with "rooms"
    let res = tag_s!(input, "rooms");
    let remaining = match res {
        IResult::Incomplete(i) => return IResult::Incomplete(i),
        IResult::Error(e) => return IResult::Error(e),
        IResult::Done(i, _) => i
    };

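    // If a slash is not present, return early.
    // tag_s! is used without opt! here: opt! would turn a missing slash into
    // Done(.., None), so the early return would not trigger and input such as
    // "rooms<uuid>" (no slash) would be accepted.
    let slash = tag_s!(remaining, "/");
    let remaining = match slash {
        IResult::Error(_) |
        IResult::Incomplete(_) => return IResult::Done(remaining, None),
        IResult::Done(i, _) => i
    };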

    // If something follows a slash, make sure
    // it's a valid UUID
    if remaining.len() > 0 {
        let res = complete!(remaining, map_res!(take_s!(36), FromStr::from_str));
        match res {
            IResult::Done(i, o) => IResult::Done(i, Some(o)),
            IResult::Error(e) => IResult::Error(e),
            IResult::Incomplete(n) => IResult::Incomplete(n)
        }
    } else {
        // This branch allows for "rooms/"
        IResult::Done(remaining, None)
    }
}

#[test]
fn match_room_plus_uuid() {
    use nom::IResult::*;

    assert_eq!(room_uuid("rooms"), Done("", None));
    assert_eq!(room_uuid("rooms/"), Done("", None));
    assert_eq!(room_uuid("rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"), Done("", Some(Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap())));
    assert!(room_uuid("rooms/123").is_err());
}


answered Nov 15 '22 08:11

Mike Cluck

Ok, so I got it working with nom and the extended URL format api/v1/rooms/UUID/tracks/UUID.

The basics are the same as before: you want to check for eof, ignore trailing "/" and never wait for incomplete results (alt_complete! is doing a good job here).

Regarding your ErrorKind::Verify wish: I don't think the error kind is actually important, just ignore it, or map it to whatever you want manually.

Be careful with the alt_complete! branches: in case of overlaps the preferred option (usually the "longer one") should come first.
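The ordering issue is a property of ordered choice in general, so it can be shown without nom. This is a small std-only sketch; `first_match` is a hypothetical helper that tries literal prefixes in order and returns the first one that matches together with the unparsed remainder.

```rust
/// Ordered choice over literal prefixes: returns the first branch that
/// prefixes `input`, plus whatever input is left after it.
fn first_match<'a, 'b>(
    input: &'a str,
    branches: &[&'b str],
) -> Option<(&'b str, &'a str)> {
    branches
        .iter()
        .find_map(|&branch| input.strip_prefix(branch).map(|rest| (branch, rest)))
}
```

With the shorter branch first, `first_match("/rooms/tracks", &["/rooms", "/rooms/tracks"])` stops at "/rooms" and leaves "/tracks" unparsed; with the longer branch first, the whole input is consumed.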

I like my with! helper, but you could also inline it.

Playground doesn't support nom, so no link this time.

#[macro_use]
extern crate nom;

extern crate uuid;
use uuid::Uuid;

named!(uuid<&str, Uuid>, preceded!(
    tag_s!("/"),
    map_res!(take_s!(36), str::parse)
));

#[derive(Clone, PartialEq, Eq, Debug)]
enum ApiRequest {
    Rooms,
    Room { room: Uuid },
    Tracks { room: Uuid },
    Track { room: Uuid, track: Uuid },
}

/// shortcut for: `do_parse!(name: expr >> r: otherexpr >> (r))`
///
/// `otherexpr` should use `name`, otherwise you could just use `preceded!`.
macro_rules! with {
    ($i:expr, $var:ident: $submac:ident!( $($args:tt)* ) >> $($rest:tt)*) => {
        do_parse!($i, $var: $submac!($($args)*) >> r: $($rest)* >> (r));
    };
    ($i:expr, $var:ident: $submac:ident >> $($rest:tt)*) => {
        do_parse!($i, $var: $submac >> r: $($rest)* >> (r));
    };
}

// /api/v1/rooms/UUID/tracks/UUID
named!(apiv1<&str, ApiRequest>, preceded!(tag_s!("/api/v1"),
    alt_complete!(
        preceded!(tag_s!("/rooms"), alt_complete!(
            with!(room: uuid >> alt_complete!(
                preceded!(tag_s!("/tracks"), alt_complete!(
                    with!(track: uuid >> alt_complete!(
                        // ... sub track requests?
                        value!(ApiRequest::Track{room, track})
                    ))
                    |
                    value!(ApiRequest::Tracks{room})
                ))
                // other room requests
                |
                value!(ApiRequest::Room{room})
            ))
            |
            value!(ApiRequest::Rooms)
        ))
        // | ... other requests
    )
));

named!(api<&str, ApiRequest>, terminated!(
    alt_complete!(
        apiv1
        // | ... other versions
        // also could wrap in new enum like:
        //     apiv1 => { ApiRequest::V1 }
        //     |
        //     apiv2 => { ApiRequest::V2 }
    ),
    tuple!(
        alt_complete!(tag_s!("/") | value!("")), // ignore trailing "/"
        eof!() // make sure full URL was parsed
    )
));

fn main() {
    use nom::IResult::*;
    use nom::ErrorKind;

    let room = Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap();
    let track = Uuid::parse_str("83d235e8-03cd-420d-a8c6-6e42440a5573").unwrap();

    assert_eq!(api("/api/v1/rooms"), Done("", ApiRequest::Rooms));
    assert_eq!(api("/api/v1/rooms/"), Done("", ApiRequest::Rooms));
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"),
        Done("", ApiRequest::Room { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/"),
        Done("", ApiRequest::Room { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks"),
        Done("", ApiRequest::Tracks { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/"),
        Done("", ApiRequest::Tracks { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573"),
        Done("", ApiRequest::Track{room, track})
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573/"),
        Done("", ApiRequest::Track{room, track})
    );
    assert_eq!(api("/api/v1"), Error(ErrorKind::Alt));
    assert_eq!(api("/api/v1/foo"), Error(ErrorKind::Alt));
    assert_eq!(api("/api/v1/rooms/123"), Error(ErrorKind::Eof));
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/bar"),
        Error(ErrorKind::Eof)
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573/123"),
        Error(ErrorKind::Eof)
    );
    assert_eq!(api("/api/v2"), Error(ErrorKind::Alt));
}

You could also use a more strict alt_full_opt_slash! branch method, which would ensure a branch only matches if it fully parsed the input.

You could then parse the alternatives in a flatter way (nested branches should still work). The trade-off is that some UUIDs may be parsed more than once, and all errors are now of kind Alt:

/// Similar to alt_complete, but also requires the branch parses until
/// the end of the input (but ignores a trailing "/").
macro_rules! alt_full_opt_slash {
    (__impl_push2 ($i:expr,) ($($new:tt)*), $($rest:tt)*) => {
        alt_full_opt_slash!(__impl ($i, $($new)*), $($rest)*)
    };
    (__impl_push2 ($i:expr, $($result:tt)+) ($($new:tt)*), $($rest:tt)*) => {
        alt_full_opt_slash!(__impl ($i, $($result)+ | $($new)*), $($rest)*)
    };
    (__impl_push ($($result:tt)*) ($($new:tt)*), $($rest:tt)*) => {
        // modify branch:
        alt_full_opt_slash!(__impl_push2 ($($result)*) (
            terminated!(
                $($new)*,
                tuple!(
                    alt_complete!(tag_s!("/") | value!("")), // ignore trailing "/"
                    eof!() // make sure full URL was parsed
                )
            )
        ), $($rest)*)
    };
    (__impl ($($result:tt)*), $e:ident | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $e ), $($rest)*)
    };
    (__impl ($($result:tt)*), $subrule:ident!( $($args:tt)*) | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $subrule!($($args)*) ), $($rest)*)
    };
    (__impl ($($result:tt)*), $subrule:ident!( $($args:tt)* ) => { $gen:expr } | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $subrule!($($args)*) => { $gen } ), $($rest)*)
    };
    (__impl ($($result:tt)*), $e:ident => { $gen:expr } | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $e => { $gen } ), $($rest)*)
    };
    (__impl ($i:expr, $($result:tt)*), __end) => {
        alt_complete!($i, $($result)*)
    };
    ($i:expr, $($rest:tt)*) => {{
        alt_full_opt_slash!(__impl ($i, ), $($rest)* | __end)
    }};
}

// /api/v1/rooms/UUID/tracks/UUID
named!(apiv1<&str, ApiRequest>, preceded!(tag_s!("/api/v1"),
    alt_full_opt_slash!(
        do_parse!(
            tag_s!("/rooms") >>
            (ApiRequest::Rooms)
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            (ApiRequest::Room{room})
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            tag_s!("/tracks") >>
            (ApiRequest::Tracks{room})
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            tag_s!("/tracks") >>
            track: uuid >>
            (ApiRequest::Track{room, track})
        )
    )
));

named!(api<&str, ApiRequest>, alt_complete!(
    apiv1
    // | ... other versions
    // also could wrap in new enum like:
    //     apiv1 => { ApiRequest::V1 }
    //     |
    //     apiv2 => { ApiRequest::V2 }
));
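The idea behind alt_full_opt_slash! (a branch only counts if it consumes the whole input, apart from one trailing "/") can be sketched without macros. Everything below is an illustration under assumptions: `Branch`, `alt_full`, `Route`, `rooms`, `room` and `route` are hypothetical names, and a run of ASCII digits stands in for the UUID.

```rust
/// A branch parser: on success, the parsed value and the unparsed remainder.
type Branch<T> = fn(&str) -> Option<(T, &str)>;

/// Returns the value of the first branch that consumes the whole input,
/// ignoring a single trailing "/".
fn alt_full<T>(input: &str, branches: &[Branch<T>]) -> Option<T> {
    branches.iter().find_map(|branch| match branch(input) {
        Some((value, "")) | Some((value, "/")) => Some(value),
        _ => None,
    })
}

#[derive(Debug, PartialEq)]
enum Route {
    Rooms,
    Room(String),
}

/// Matches "/rooms" and returns whatever follows as the remainder.
fn rooms(input: &str) -> Option<(Route, &str)> {
    input.strip_prefix("/rooms").map(|rest| (Route::Rooms, rest))
}

/// Matches "/rooms/<id>", where <id> is one or more ASCII digits
/// (a stand-in for the UUID in the real parser).
fn room(input: &str) -> Option<(Route, &str)> {
    let rest = input.strip_prefix("/rooms/")?;
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    if end == 0 {
        return None;
    }
    Some((Route::Room(rest[..end].to_string()), &rest[end..]))
}

/// Flat list of alternatives; the order no longer matters because a
/// branch that leaves input behind is rejected.
fn route(input: &str) -> Option<Route> {
    let branches: [Branch<Route>; 2] = [rooms, room];
    alt_full(input, &branches)
}
```

Here `rooms` is listed before the longer `room` branch, yet "/rooms/42" still resolves to `Route::Room`, because `rooms` leaves "/42" unparsed and is therefore skipped. The cost is the one mentioned above: the shared "/rooms" prefix is parsed again by every branch.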

answered Nov 15 '22 09:11

Stefan


Aleksey
