Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get subslices with neither panicking nor unsafe?

Tags:

rust

Note: How to get subslices is close, but without either constraint.

I am looking for the best implementation possible of a function with this signature:

fn sub(slice: &[u8], range: Range<usize>) -> Option<&[u8]>;

which would have the same functionality as the impl Index<Range<usize>> but replacing the failure case with None instead of panicking.

The twists are the seemingly unusual constraints that this function should not:

  • invoke functions that may panic
  • use unsafe code

A careful investigation of the slice interface has brought to my attention:

  • split_first and split_last
  • split_at
  • impl Index<Range<usize>> (and variants)
  • as_ptr and from_raw_parts

However most are not suitable:

  • split_at and Index<Range<usize>>::index may panic
  • from_raw_parts is unsafe

Leaving me to implement sub as:

fn sub(slice: &[u8], range: Range<usize>) -> Option<&[u8]> {
    if range.start > range.end { return None; }

    let length = range.end - range.start;

    if length > slice.len() { return None; }

    if length == 0 { return Some(b""); }

    let mut slice = slice;
    for _ in 0..range.start {
        if let Some((_, s)) = slice.split_first() {
            slice = s;
        }
    }

    for _ in 0..(slice.len() - length) {
        if let Some((_, s)) = slice.split_last() {
            slice = s;
        }
    }

    Some(slice)
}

which while seemingly working has O(N) complexity; rather unsatisfying.

Is there any way to do better?

like image 567
Matthieu M. Avatar asked Oct 23 '16 14:10

Matthieu M.


2 Answers

Now you can just use the get method on the slice.

I copied the code below from The document of that method.

let v = [10, 40, 30];
assert_eq!(Some(&40), v.get(1));
assert_eq!(Some(&[10, 40][..]), v.get(0..2));
assert_eq!(None, v.get(3));
assert_eq!(None, v.get(0..4));
like image 145
Benpigchu Avatar answered Nov 14 '22 20:11

Benpigchu


I'll start with the version that I prefer. It is get_slice, it uses bounds checked slicing, and you can look at the compiler's optimized output to verify that the compiler knows it will never panic. (1. I agree, “no panic” would be a terrific thing to be able to work with as an assertion; 2. get_slice would be a nice addition to libstd and indeed is planned.)

/// Return the subslice corresponding to `range`, if it is in bounds
/// and well formed (start <= end).
pub fn get_slice(slice: &[u8], range: Range<usize>) -> Option<&[u8]> {
    if range.start > range.end || range.end > slice.len() {
        None
    } else {
        Some(&slice[range])
    }
}

Next is an attempted solution, that is still coded with what appears to be an O(N) algorithm, but is strength reduced to O(1) by the optimizer.

We use the slice iterator and the fact that we can convert it back to a slice. The slice iterator's .nth() is open coded to jump forward in constant time, but the reverse version is unfortunately not. However its loop is optimized out.

pub fn sub(slice: &[u8], range: Range<usize>) -> Option<&[u8]> {
    if range.start > range.end || range.end > slice.len() {
        return None;
    }

    let mut iter = slice.iter();
    let take_front = range.start; 
    let take_back = slice.len() - range.end; 
    if take_front > 0 {
        iter.nth(take_front - 1);
    }
    if take_back > 0 {
        (&mut iter).rev().nth(take_back - 1);
    }
    Some(iter.as_slice())
}

playground link

Note: I unfortunately think we are making a bit of arbitrary rules here. We could use .chunks() after the take front operation, and it would give you a direct O(1) solution. However, chunks may panic if you ask for 0-size chunks. That puts it in the same bracket at just using bounds checked slicing for me (“it may panic”).

like image 41
bluss Avatar answered Nov 14 '22 21:11

bluss