How can I find a subsequence in a &[u8] slice?



I have a &[u8] slice over a binary buffer. I need to parse it, but a lot of the methods that I would like to use (such as str::find) don't seem to be available on slices.

I've seen that I can convert both my buffer slice and my pattern to str by using from_utf8_unchecked(), but that seems a little dangerous (and also really hacky).

How can I find a subsequence in this slice? I actually need the index of the pattern, not just a slice view of the parts, so I don't think split will work.

2 Answers

Here's a simple implementation based on the windows iterator.

fn find_subsequence(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|window| window == needle)
}

fn main() {
    assert_eq!(find_subsequence(b"qwertyuiop", b"tyu"), Some(4));
    assert_eq!(find_subsequence(b"qwertyuiop", b"asd"), None);
}
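One caveat with the windows-based version: slice::windows panics when the window size is 0, so an empty needle would panic. A minimal sketch of a guarded variant follows. The name find_subsequence_checked is hypothetical, and returning Some(0) for an empty needle is an assumption, chosen to mirror what str::find does for an empty pattern.

```rust
/// Hypothetical guarded variant of `find_subsequence`.
/// Assumption: an empty needle matches at index 0, like `str::find("")`.
fn find_subsequence_checked(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // `windows(0)` panics, so handle the empty needle before calling it.
    if needle.is_empty() {
        return Some(0);
    }
    // A needle longer than the haystack yields no windows, so this returns None.
    haystack
        .windows(needle.len())
        .position(|window| window == needle)
}
```

If you would rather treat an empty needle as "not found", return None from the guard instead.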

The find_subsequence function can also be made generic:

fn find_subsequence<T>(haystack: &[T], needle: &[T]) -> Option<usize>
    where for<'a> &'a [T]: PartialEq
{
    haystack.windows(needle.len()).position(|window| window == needle)
}
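To illustrate the generic form on slices that are not bytes, here is a rough sketch. It uses the simpler bound T: PartialEq, which should be enough for comparing two &[T] values; that bound, the name find_subsequence_generic, and the empty-needle guard are my assumptions rather than part of the code above.

```rust
/// Hypothetical generic variant; works for any element type that
/// supports `==`.
fn find_subsequence_generic<T: PartialEq>(haystack: &[T], needle: &[T]) -> Option<usize> {
    // `windows(0)` panics, so handle the empty needle first.
    // Assumption: an empty needle matches at index 0.
    if needle.is_empty() {
        return Some(0);
    }
    haystack
        .windows(needle.len())
        .position(|window| window == needle)
}
```

With this, find_subsequence_generic(&[1, 2, 3, 4, 5], &[3, 4]) gives Some(2), and the same function can be called on a slice of &str or any other PartialEq type.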

I don't think the standard library contains a function for this. Some libcs have memmem, but at the moment the libc crate does not wrap this. You can use the twoway crate, however. rust-bio implements some pattern matching algorithms, too. All of those should be faster than using haystack.windows(..).position(..).
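To give a feel for what a dedicated search algorithm looks like, here is a sketch of Boyer-Moore-Horspool over bytes using only the standard library. This is not what twoway or rust-bio implement internally (I have not checked their code), and the name find_horspool is hypothetical. It skips ahead by more than one byte on many mismatches, but whether it actually beats the windows approach depends on the input, so measure before relying on it.

```rust
/// Hypothetical Boyer-Moore-Horspool search over bytes.
/// Assumption: an empty needle matches at index 0.
fn find_horspool(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    let m = needle.len();
    if m == 0 {
        return Some(0);
    }
    if m > haystack.len() {
        return None;
    }

    // shift[b] is how far the needle may slide when the haystack byte
    // aligned with the needle's last position is `b`. Bytes that do not
    // occur in the first m - 1 needle positions allow a full shift of m.
    let mut shift = [m; 256];
    for (i, &b) in needle[..m - 1].iter().enumerate() {
        shift[b as usize] = m - 1 - i;
    }

    let mut pos = 0;
    while pos + m <= haystack.len() {
        if &haystack[pos..pos + m] == needle {
            return Some(pos);
        }
        // Slide according to the byte under the needle's last position.
        pos += shift[haystack[pos + m - 1] as usize];
    }
    None
}
```

The worst case is still proportional to haystack length times needle length, so this is a teaching sketch rather than a replacement for a tuned library.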

