Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I find a subsequence in a &[u8] slice?

Tags:

rust

I have a &[u8] slice over a binary buffer. I need to parse it, but a lot of the methods that I would like to use (such as str::find) don't seem to be available on slices.

I've seen that I can covert both by buffer slice and my pattern to str by using from_utf8_unchecked() but that seems a little dangerous (and also really hacky).

How can I find a subsequence in this slice? I actually need the index of the pattern, not just a slice view of the parts, so I don't think split will work.

like image 674
JasonN Avatar asked Mar 09 '16 20:03

JasonN


People also ask

How do you find the subsequence of a sequence?

A subsequence is just some of the terms of the original sequence, kept in order. If your original sequence is 1,2,3,4,5,6…, one subsequence is 1,3,5,7,9… Another is 1,2343,23565848,8685845855858,… Finding a subsequence is easy. Finding one that does what you want depends on what you want.

How do you find the subsequence of an array?

You can solve this question by iterating through the main input array once. Iterate through the main array, and look for the first integer in the potential subsequence. If you find that integer, keep on iterating through the main array, but now look for the second integer in the potential subsequence.

How do you find the subsequences of length k?

Given an array arr[]and an integer K, the task is to find the sum of all K length subsequences from the given array. Explanation: There are 6 subsequences of length 2 which are {7, 8}, {7, 9}, {7, 2}, {8, 9}, {8, 2} and {9, 2}.


2 Answers

Here's a simple implementation based on the windows iterator.

fn find_subsequence(haystack: &[u8], needle: &[u8]) -> Option<usize> {     haystack.windows(needle.len()).position(|window| window == needle) }  fn main() {     assert_eq!(find_subsequence(b"qwertyuiop", b"tyu"), Some(4));     assert_eq!(find_subsequence(b"qwertyuiop", b"asd"), None); } 

The find_subsequence function can also be made generic:

fn find_subsequence<T>(haystack: &[T], needle: &[T]) -> Option<usize>     where for<'a> &'a [T]: PartialEq {     haystack.windows(needle.len()).position(|window| window == needle) } 
like image 116
Francis Gagné Avatar answered Oct 15 '22 01:10

Francis Gagné


I don't think the standard library contains a function for this. Some libcs have memmem, but at the moment the libc crate does not wrap this. You can use the twoway crate however. rust-bio implements some pattern matching algorithms, too. All of those should be faster than using haystack.windows(..).position(..)

like image 24
aseyboldt Avatar answered Oct 14 '22 23:10

aseyboldt