I am trying to build a nom parser to examine URLs whose ID is a UUID, for example:
rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912
I created the following:
#[macro_use]
extern crate nom;
extern crate uuid;
use std::str::FromStr;
use uuid::Uuid;
named!(room_uuid<&str, Option<Uuid>>,
    do_parse!(
        tag_s!("rooms") >>
        id: opt!(complete!(preceded!(
            tag_s!("/"),
            map_res!(take_s!(36), FromStr::from_str)
        ))) >>
        (id)
    )
);
It handles almost all cases well:
assert_eq!(room_uuid("rooms"), Done("", None));
assert_eq!(room_uuid("rooms/"), Done("/", None));
assert_eq!(room_uuid("rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"), Done("", Some(Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap())));
The exception is the case where the ID is not a valid UUID:
assert!(room_uuid("rooms/123").is_err()); // this assertion fails
// room_uuid("rooms/123").to_result() => Ok(None)
As far as I understand, this happens because opt! converts the inner Err into None.
I would like the ID to be an optional section, but if it is present it must be a valid UUID.
Unfortunately, I don't understand how to combine the two: optionality and a strict format.
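To make the behaviour described above concrete without pulling in nom, here is a minimal std-only sketch. Everything in it is hypothetical: `opt` is a plain function that only imitates how opt! treats an inner failure, and `strict_id` uses a numeric ID as a stand-in for the UUID so that no external crate is needed.

```rust
use std::num::ParseIntError;

/// Stand-in for the strict inner parser. A numeric ID replaces the UUID
/// here (an assumption made only to avoid external crates).
fn strict_id(segment: &str) -> Result<u32, ParseIntError> {
    segment.parse()
}

/// Hypothetical analogue of `opt!`: every inner failure becomes `None`,
/// so "absent" and "present but malformed" look the same to the caller.
fn opt<T, E>(inner: Result<T, E>) -> Option<T> {
    inner.ok()
}

/// Same shape as the parser in the question: "rooms", then an optional
/// "/" + ID. The outer `None` means the "rooms" prefix did not match.
fn room_id_lenient(path: &str) -> Option<Option<u32>> {
    let rest = path.strip_prefix("rooms")?;
    let id = rest
        .strip_prefix('/')
        .and_then(|segment| opt(strict_id(segment)));
    Some(id)
}
```

With these definitions, `room_id_lenient("rooms/abc")` yields `Some(None)`, exactly like `room_id_lenient("rooms")`: the malformed ID is silently reported as "no ID" instead of as an error.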
I've only started working with nom myself in the last couple of weeks, but I found one way of solving this. It doesn't fit exclusively within a macro, but it does give the correct behavior with one modification: I swallow the / rather than leave it dangling when a UUID is not given.
#[macro_use]
extern crate nom;
extern crate uuid;
use std::str::FromStr;
use nom::IResult;
use uuid::Uuid;
fn room_uuid(input: &str) -> IResult<&str, Option<Uuid>> {
    // Check that it starts with "rooms"
    let res = tag_s!(input, "rooms");
    let remaining = match res {
        IResult::Incomplete(i) => return IResult::Incomplete(i),
        IResult::Error(e) => return IResult::Error(e),
        IResult::Done(i, _) => i
    };
    // If a slash is not present, return early.
    // opt! reports a missing slash as Done(_, None); without that arm,
    // "rooms" followed directly by a UUID (no slash) would be accepted.
    let optional_slash = opt!(remaining, tag_s!("/"));
    let remaining = match optional_slash {
        IResult::Error(_) |
        IResult::Incomplete(_) |
        IResult::Done(_, None) => return IResult::Done(remaining, None),
        IResult::Done(i, Some(_)) => i
    };
    // If something follows a slash, make sure
    // it's a valid UUID
    if !remaining.is_empty() {
        let res = complete!(remaining, map_res!(take_s!(36), FromStr::from_str));
        match res {
            IResult::Done(i, o) => IResult::Done(i, Some(o)),
            IResult::Error(e) => IResult::Error(e),
            IResult::Incomplete(n) => IResult::Incomplete(n)
        }
    } else {
        // This branch allows for "rooms/"
        IResult::Done(remaining, None)
    }
}
#[test]
fn match_room_plus_uuid() {
    use nom::IResult::*;
    assert_eq!(room_uuid("rooms"), Done("", None));
    assert_eq!(room_uuid("rooms/"), Done("", None));
    assert_eq!(room_uuid("rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"), Done("", Some(Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap())));
    assert!(room_uuid("rooms/123").is_err());
}
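The idea behind the function above, separating the "is an ID present?" check from the "is it well-formed?" check, can also be sketched without nom. This is only an illustration under stated assumptions: `is_uuid_shaped` is a hypothetical shape check standing in for `Uuid::from_str`, the ID is returned as a string slice, and, unlike the nom version, the sketch requires the ID to be the last thing in the input.

```rust
/// Hypothetical shape check standing in for `Uuid::from_str`: exactly 36
/// ASCII characters, hyphens at offsets 8, 13, 18 and 23, hex digits elsewhere.
fn is_uuid_shaped(s: &str) -> bool {
    s.len() == 36
        && s.bytes().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => b == b'-',
            _ => b.is_ascii_hexdigit(),
        })
}

/// `Ok(None)` for "rooms" and "rooms/", `Ok(Some(id))` for a well-formed
/// ID, and `Err` for everything else.
fn room_id(path: &str) -> Result<Option<&str>, String> {
    let rest = path
        .strip_prefix("rooms")
        .ok_or_else(|| format!("expected \"rooms\" at the start of {:?}", path))?;
    match rest.strip_prefix('/') {
        // Slash present but nothing after it: the slash is swallowed.
        Some("") => Ok(None),
        // Slash present: whatever follows must be a well-formed ID.
        Some(id) if is_uuid_shaped(id) => Ok(Some(id)),
        Some(other) => Err(format!("malformed ID: {:?}", other)),
        // No slash: only the bare "rooms" is acceptable.
        None if rest.is_empty() => Ok(None),
        None => Err(format!("unexpected trailing input: {:?}", rest)),
    }
}
```

Here optionality is decided by the presence of the slash alone, and a failed validation is propagated as an error rather than being folded into `None`.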
Ok, so I got it working with nom and the extended URL format api/v1/rooms/UUID/tracks/UUID.
The basics are the same as before: you want to check for eof, ignore a trailing "/", and never wait for incomplete results (alt_complete! does a good job here).
Regarding your ErrorKind::Verify wish: I don't think the error kind is actually important; just ignore it, or map it manually to whatever you want.
Be careful with the alt_complete! branches: when branches overlap, the preferred option (usually the "longer" one) should come first.
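The ordering rule can be illustrated with a small std-only sketch of ordered choice. It does not use nom; `Branch`, `rooms`, `rooms_tracks`, `first_match`, `shorter_first` and `longer_first` are all hypothetical names, and the fixed "/rooms/tracks" path is a simplification of the real URL format.

```rust
/// A branch tries to match a prefix of the input and returns a label
/// plus the unparsed remainder.
type Branch = fn(&str) -> Option<(&'static str, &str)>;

fn rooms(input: &str) -> Option<(&'static str, &str)> {
    input.strip_prefix("/rooms").map(|rest| ("rooms", rest))
}

fn rooms_tracks(input: &str) -> Option<(&'static str, &str)> {
    input.strip_prefix("/rooms/tracks").map(|rest| ("tracks", rest))
}

/// Ordered choice in the spirit of `alt_complete!`: the first branch
/// that matches wins and later branches are never tried.
fn first_match<'a>(branches: &[Branch], input: &'a str) -> Option<(&'static str, &'a str)> {
    branches.iter().find_map(|branch| branch(input))
}

/// Shorter branch listed first: it shadows the longer one.
fn shorter_first(input: &str) -> Option<(&'static str, &str)> {
    let branches: [Branch; 2] = [rooms, rooms_tracks];
    first_match(&branches, input)
}

/// Longer branch listed first: both routes stay reachable.
fn longer_first(input: &str) -> Option<(&'static str, &str)> {
    let branches: [Branch; 2] = [rooms_tracks, rooms];
    first_match(&branches, input)
}
```

For the input "/rooms/tracks", `shorter_first` stops at the "rooms" branch and leaves "/tracks" unparsed, while `longer_first` matches the "tracks" branch and consumes everything.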
I like my with! helper, but you could also inline it.
Playground doesn't support nom, so no link this time.
#[macro_use]
extern crate nom;
extern crate uuid;
use uuid::Uuid;
named!(uuid<&str, Uuid>, preceded!(
    tag_s!("/"),
    map_res!(take_s!(36), str::parse)
));
#[derive(Clone, PartialEq, Eq, Debug)]
enum ApiRequest {
    Rooms,
    Room { room: Uuid },
    Tracks { room: Uuid },
    Track { room: Uuid, track: Uuid },
}
/// shortcut for: `do_parse!(name: expr >> r: otherexpr >> (r))`
///
/// `otherexpr` should use `name`, otherwise you could just use `preceded!`.
macro_rules! with {
    ($i:expr, $var:ident: $submac:ident!( $($args:tt)* ) >> $($rest:tt)*) => {
        do_parse!($i, $var: $submac!($($args)*) >> r: $($rest)* >> (r));
    };
    ($i:expr, $var:ident: $submac:ident >> $($rest:tt)*) => {
        do_parse!($i, $var: $submac >> r: $($rest)* >> (r));
    };
}
// /api/v1/rooms/UUID/tracks/UUID
named!(apiv1<&str, ApiRequest>, preceded!(tag_s!("/api/v1"),
    alt_complete!(
        preceded!(tag_s!("/rooms"), alt_complete!(
            with!(room: uuid >> alt_complete!(
                preceded!(tag_s!("/tracks"), alt_complete!(
                    with!(track: uuid >> alt_complete!(
                        // ... sub track requests?
                        value!(ApiRequest::Track{room, track})
                    ))
                    |
                    value!(ApiRequest::Tracks{room})
                ))
                // other room requests
                |
                value!(ApiRequest::Room{room})
            ))
            |
            value!(ApiRequest::Rooms)
        ))
        // | ... other requests
    )
));
named!(api<&str, ApiRequest>, terminated!(
    alt_complete!(
        apiv1
        // | ... other versions
        // also could wrap in new enum like:
        // apiv1 => { ApiRequest::V1 }
        // |
        // apiv2 => { ApiRequest::V2 }
    ),
    tuple!(
        alt_complete!(tag_s!("/") | value!("")), // ignore trailing "/"
        eof!() // make sure full URL was parsed
    )
));
fn main() {
    use nom::IResult::*;
    use nom::ErrorKind;
    let room = Uuid::parse_str("e19c94cf-53eb-4048-9c94-7ae74ff6d912").unwrap();
    let track = Uuid::parse_str("83d235e8-03cd-420d-a8c6-6e42440a5573").unwrap();
    assert_eq!(api("/api/v1/rooms"), Done("", ApiRequest::Rooms));
    assert_eq!(api("/api/v1/rooms/"), Done("", ApiRequest::Rooms));
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912"),
        Done("", ApiRequest::Room { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/"),
        Done("", ApiRequest::Room { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks"),
        Done("", ApiRequest::Tracks { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/"),
        Done("", ApiRequest::Tracks { room })
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573"),
        Done("", ApiRequest::Track{room, track})
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573/"),
        Done("", ApiRequest::Track{room, track})
    );
    assert_eq!(api("/api/v1"), Error(ErrorKind::Alt));
    assert_eq!(api("/api/v1/foo"), Error(ErrorKind::Alt));
    assert_eq!(api("/api/v1/rooms/123"), Error(ErrorKind::Eof));
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/bar"),
        Error(ErrorKind::Eof)
    );
    assert_eq!(
        api("/api/v1/rooms/e19c94cf-53eb-4048-9c94-7ae74ff6d912/tracks/83d235e8-03cd-420d-a8c6-6e42440a5573/123"),
        Error(ErrorKind::Eof)
    );
    assert_eq!(api("/api/v2"), Error(ErrorKind::Alt));
}
You could also use a stricter alt_full_opt_slash! branch method, which ensures that a branch only matches if it fully parsed the input.
You could then parse the alternatives in a "flatter" way (nested branches should still work), although this means you might end up parsing some UUIDs more than once; also, all errors are now of kind Alt:
/// Similar to alt_complete, but also requires the branch parses until
/// the end of the input (but ignores a trailing "/").
macro_rules! alt_full_opt_slash {
    (__impl_push2 ($i:expr,) ($($new:tt)*), $($rest:tt)*) => {
        alt_full_opt_slash!(__impl ($i, $($new)*), $($rest)*)
    };
    (__impl_push2 ($i:expr, $($result:tt)+) ($($new:tt)*), $($rest:tt)*) => {
        alt_full_opt_slash!(__impl ($i, $($result)+ | $($new)*), $($rest)*)
    };
    (__impl_push ($($result:tt)*) ($($new:tt)*), $($rest:tt)*) => {
        // modify branch:
        alt_full_opt_slash!(__impl_push2 ($($result)*) (
            terminated!(
                $($new)*,
                tuple!(
                    alt_complete!(tag_s!("/") | value!("")), // ignore trailing "/"
                    eof!() // make sure full URL was parsed
                )
            )
        ), $($rest)*)
    };
    (__impl ($($result:tt)*), $e:ident | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $e ), $($rest)*)
    };
    (__impl ($($result:tt)*), $subrule:ident!( $($args:tt)*) | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $subrule!($($args)*) ), $($rest)*)
    };
    (__impl ($($result:tt)*), $subrule:ident!( $($args:tt)* ) => { $gen:expr } | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $subrule!($($args)*) => { $gen } ), $($rest)*)
    };
    (__impl ($($result:tt)*), $e:ident => { $gen:expr } | $($rest:tt)*) => {
        alt_full_opt_slash!(__impl_push ($($result)*) ( $e => { $gen } ), $($rest)*)
    };
    (__impl ($i:expr, $($result:tt)*), __end) => {
        alt_complete!($i, $($result)*)
    };
    ($i:expr, $($rest:tt)*) => {{
        alt_full_opt_slash!(__impl ($i, ), $($rest)* | __end)
    }};
}
// /api/v1/rooms/UUID/tracks/UUID
named!(apiv1<&str, ApiRequest>, preceded!(tag_s!("/api/v1"),
    alt_full_opt_slash!(
        do_parse!(
            tag_s!("/rooms") >>
            (ApiRequest::Rooms)
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            (ApiRequest::Room{room})
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            tag_s!("/tracks") >>
            (ApiRequest::Tracks{room})
        )
        |
        do_parse!(
            tag_s!("/rooms") >>
            room: uuid >>
            tag_s!("/tracks") >>
            track: uuid >>
            (ApiRequest::Track{room, track})
        )
    )
));
named!(api<&str, ApiRequest>, alt_complete!(
    apiv1
    // | ... other versions
    // also could wrap in new enum like:
    // apiv1 => { ApiRequest::V1 }
    // |
    // apiv2 => { ApiRequest::V2 }
));
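For comparison, the "flat" matching strategy (every alternative must consume the whole input, ignoring one trailing "/") can be sketched without nom or macros. This is not the code above rewritten: `parse_api` and `is_uuid_shaped` are hypothetical names, the enum borrows its IDs as string slices instead of holding `Uuid` values, and the UUID check only validates the shape.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ApiRequest<'a> {
    Rooms,
    Room { room: &'a str },
    Tracks { room: &'a str },
    Track { room: &'a str, track: &'a str },
}

/// Hypothetical shape check standing in for `Uuid::from_str`: 36 ASCII
/// characters, hyphens at offsets 8, 13, 18 and 23, hex digits elsewhere.
fn is_uuid_shaped(s: &str) -> bool {
    s.len() == 36
        && s.bytes().enumerate().all(|(i, b)| match i {
            8 | 13 | 18 | 23 => b == b'-',
            _ => b.is_ascii_hexdigit(),
        })
}

/// Flat matching: each arm describes one complete URL, so a branch can
/// only succeed if it accounts for every path segment.
fn parse_api(path: &str) -> Option<ApiRequest<'_>> {
    let rest = path.strip_prefix("/api/v1")?;
    // Ignore a single trailing "/".
    let rest = rest.strip_suffix('/').unwrap_or(rest);
    let segments: Vec<&str> = rest.split('/').collect();
    match segments.as_slice() {
        ["", "rooms"] => Some(ApiRequest::Rooms),
        ["", "rooms", room] if is_uuid_shaped(room) => {
            Some(ApiRequest::Room { room: *room })
        }
        ["", "rooms", room, "tracks"] if is_uuid_shaped(room) => {
            Some(ApiRequest::Tracks { room: *room })
        }
        ["", "rooms", room, "tracks", track]
            if is_uuid_shaped(room) && is_uuid_shaped(track) =>
        {
            Some(ApiRequest::Track { room: *room, track: *track })
        }
        _ => None,
    }
}
```

As with the flat nom version, the room ID is checked again in each arm that mentions it, and every failure collapses into a single outcome (`None` here, an Alt error there), so the caller cannot tell which part of the URL was wrong.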