
How do I create a streaming parser in nom?

Tags: rust, nom

I've created a few non-trivial parsers in nom, so I'm pretty familiar with it at this point. All the parsers I've created until now always provide the entire input slice to the parser.

I'd like to create a streaming parser, which I assume means that I can continue to feed bytes into the parser until it is complete. I've had a hard time finding any documentation or examples that illustrate this, and I also question my assumption of what a "streaming parser" is.

My questions are:

  • Is my understanding of what a streaming parser is correct?
  • If so, are there any good examples of a parser using this technique?
w.brian asked Oct 22 '17 17:10





1 Answer

nom parsers neither maintain a buffer to feed more data into, nor do they keep any "state" recording where they stopped when they previously needed more bytes.

But if you take a look at the IResult structure, you'll see that a parser can return a partial result or indicate that it needs more data.
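To make that idea concrete, here is a minimal sketch that does not use nom at all. The `Parsed` enum is a hand-rolled stand-in for the "done with remainder / need more data / error" idea, not nom's actual type, and the frame format (one length byte followed by the payload), the rule that empty frames are invalid, and the names `frame` and `feed` are all assumptions made up for illustration. The point is that the parser stays stateless while the caller owns the buffer and simply re-runs the parser when more bytes arrive.

```rust
/// Hand-rolled stand-in for the idea behind an incremental parse result.
/// This is NOT nom's real type, only an illustration.
#[derive(Debug, PartialEq)]
enum Parsed<'a, T> {
    /// Parsed a value; the slice is the unconsumed remainder.
    Done(&'a [u8], T),
    /// At least this many more bytes are needed before parsing can succeed.
    Incomplete(usize),
    /// The input can never parse, no matter how much data follows.
    Error,
}

/// Stateless parser for a hypothetical frame: one length byte, then that
/// many payload bytes.
fn frame(input: &[u8]) -> Parsed<'_, Vec<u8>> {
    let (len, rest) = match input.split_first() {
        Some((&len, rest)) => (len as usize, rest),
        None => return Parsed::Incomplete(1),
    };
    if len == 0 {
        // Hypothetical rule for this sketch: empty frames are invalid.
        return Parsed::Error;
    }
    if rest.len() < len {
        return Parsed::Incomplete(len - rest.len());
    }
    Parsed::Done(&rest[len..], rest[..len].to_vec())
}

/// Caller-owned buffering: append the new chunk, then re-run the parser
/// from the start of the buffer until it reports that it needs more data.
fn feed(buffer: &mut Vec<u8>, chunk: &[u8]) -> Result<Vec<Vec<u8>>, ()> {
    buffer.extend_from_slice(chunk);
    let mut frames = Vec::new();
    loop {
        let consumed = match frame(buffer.as_slice()) {
            Parsed::Done(rest, payload) => {
                frames.push(payload);
                buffer.len() - rest.len()
            }
            // Keep the partial frame in the buffer for the next call.
            Parsed::Incomplete(_) => return Ok(frames),
            Parsed::Error => return Err(()),
        };
        // Discard the bytes the parser has consumed.
        buffer.drain(..consumed);
    }
}
```

Calling `feed` once per received chunk yields whatever complete frames are available so far; anything left over stays in the caller's buffer until the next call.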

There seem to be some structures provided to handle streaming. I think you are supposed to create a Consumer from a parser using the consumer_from_parser! macro, implement a Producer for your data source, and call run until it returns None (then start again when you have more data). Examples and docs seem to be mostly missing so far; see the bottom of https://github.com/Geal/nom :)

Also it looks like most functions and macros in nom are not documented well (or at all) regarding their behavior when hitting the end of the input. For example take_until! returns Incomplete if the input isn't long enough to contain the substr to look for, but returns an error if the input is long enough but doesn't contain substr.
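The take_until! behavior described above can be mimicked in plain Rust to show why it is awkward for streaming. This is only a sketch of the semantics as described in this answer, not nom's implementation; `Outcome` and `take_until_sketch` are hypothetical names, and leaving the needle at the start of `rest` is a choice made for this sketch.

```rust
/// Result of the sketch below (hypothetical type, not part of nom).
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    /// `before` is everything ahead of the needle; `rest` starts at the needle.
    Done { before: &'a [u8], rest: &'a [u8] },
    /// The input is shorter than the needle itself.
    Incomplete,
    /// The input is long enough but does not contain the needle.
    Error,
}

/// Mimics the end-of-input behavior described in the text.
fn take_until_sketch<'a>(input: &'a [u8], needle: &[u8]) -> Outcome<'a> {
    if needle.is_empty() {
        // An empty needle matches immediately (and `windows(0)` would panic).
        return Outcome::Done { before: &input[..0], rest: input };
    }
    if input.len() < needle.len() {
        return Outcome::Incomplete;
    }
    match input.windows(needle.len()).position(|w| w == needle) {
        Some(i) => Outcome::Done { before: &input[..i], rest: &input[i..] },
        None => Outcome::Error,
    }
}
```

With these rules, `take_until_sketch(b"hello wor", b"\r\n")` yields `Outcome::Error` even though the terminator might simply not have arrived yet, so a streaming caller cannot tell "malformed input" from "keep reading".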

Also nom mostly uses either &[u8] or &str for input; you can't signal an actual "end of stream" through these types. You could implement your own input type (related traits: nom::{AsBytes,Compare,FindSubstring,FindToken,InputIter,InputLength,InputTake,Offset,ParseTo,Slice}) to add a "reached end of stream" flag, but the nom provided macros and functions won't be able to interpret it.

All in all I'd recommend splitting streamed input through some other means into chunks you can handle with simple non-streaming parsers (maybe even use synom instead of nom).
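As a rough illustration of that recommendation, the sketch below splits a byte stream into newline-terminated chunks and only then hands each complete chunk to a simple non-streaming parser. Everything here is assumed for the example: the newline framing, the `key=value` line format, and the names `LineSplitter` and `parse_pair`. In a real program, `parse_pair` would be replaced by an ordinary nom (or synom) parser that sees the whole line at once.

```rust
/// Accumulates raw bytes and hands back only complete, newline-terminated
/// lines (hypothetical helper; newline framing is an assumption).
#[derive(Default)]
struct LineSplitter {
    pending: Vec<u8>,
}

impl LineSplitter {
    /// Feed one chunk from the stream; returns every line completed by it.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        let mut lines = Vec::new();
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            // Remove the line, including its terminator, from the buffer.
            let mut line: Vec<u8> = self.pending.drain(..=pos).collect();
            line.pop(); // drop the trailing '\n'
            lines.push(line);
        }
        lines
    }
}

/// Simple non-streaming parser for a hypothetical `key=value` line.
/// It always receives a complete line, so it never needs "more data".
fn parse_pair(line: &[u8]) -> Option<(String, String)> {
    let text = std::str::from_utf8(line).ok()?;
    let (key, value) = text.split_once('=')?;
    Some((key.to_string(), value.to_string()))
}
```

The trade-off is that the framing rule has to be simple enough to detect without a full parse; if it is, the per-chunk parser never has to deal with incomplete input.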

Stefan answered Sep 17 '22 22:09
