Some example code (playpen):
fn main() {
    let data = [0, 1, 2, 3, 4];
    let mut iter = data.iter();
    println!("{}", iter.next().unwrap());
    println!("{}", iter.skip(3).next().unwrap());
}
This prints 0 and 4, as expected.
I'm curious whether the skip operation is constant-time for the slice iterator. Grepping through the source, I only found this implementation of skip, which leads down to the Skip struct's Iterator implementation.
That appears to be a generic O(n) skip, and I can't see any specialization for pointer-based iterators that could just do pointer arithmetic.
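For context, a fully generic skip can only advance element by element; conceptually something like this simplified sketch (not the actual std source):

fn generic_skip<I: Iterator>(mut iter: I, n: usize) -> I {
    // With no knowledge of the concrete iterator, the only option is to
    // call next() n times, i.e. O(n) work.
    for _ in 0..n {
        if iter.next().is_none() {
            break;
        }
    }
    iter
}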
Am I missing something about the implementation of skip? Or is there some other way to do this?
There's no fast-forward built into std yet.
As Chris demonstrates, it can be implemented with skip, and this sometimes optimises to O(1). Unfortunately, the optimisation doesn't always happen: in my experimentation, a function like
pub fn foo(xs: &[u32], x: usize) -> u32 {
    *xs.iter().skip(x).next().unwrap()
}
optimises (opt-level 3) to
.LBB6_2:
pushq %rax
.Ltmp10:
.cfi_def_cfa_offset 16
movq (%rdi), %rdx
movq 8(%rdi), %rdi
xorl %eax, %eax
testq %rdi, %rdi
movq %rdx, %rcx
je .LBB6_4
leaq 4(%rdx), %rcx
movq %rdx, %rax
.LBB6_4:
testq %rsi, %rsi
je .LBB6_9
leaq (%rdx,%rdi,4), %rdx
.align 16, 0x90
.LBB6_6:
testq %rax, %rax
je .LBB6_12
decq %rsi
cmpq %rdx, %rcx
movq %rdx, %rdi
movl $0, %eax
je .LBB6_8
leaq 4(%rcx), %rdi
movq %rcx, %rax
.LBB6_8:
testq %rsi, %rsi
movq %rdi, %rcx
jne .LBB6_6
.LBB6_9:
testq %rax, %rax
je .LBB6_12
movl (%rax), %eax
popq %rdx
retq
.LBB6_12:
movq _ZN6option15Option$LT$T$GT$6unwrap14_MSG_FILE_LINE20ha41302a4e895de223qFE@GOTPCREL(%rip), %rdi
callq _ZN9panicking5panic20h90c2ad20c9dac62bKRCE@PLT
Of particular note is the .LBB6_6: ... jne .LBB6_6 sequence: it's a loop.
Fortunately there's at least one way out of this, and it has an additional useful property: it doesn't require changing the type, and so can be used directly in place.
The slice iterator can be converted back into the slice it represents with as_slice: Iter<T> and &[T] are actually isomorphic; they differ mainly for optimisation reasons. Once we have the slice, we can slice and dice it to obtain a shorter region of memory, and then create an iterator over only those elements. The lifetimes all work out, and we're left with exactly the same type, just without a few elements.
use std::slice::Iter;
use std::cmp;

pub fn skip(iter: &mut Iter<u32>, x: usize) {
    let s = iter.as_slice();
    *iter = s[cmp::min(x, s.len())..].iter();
}
Used like skip(&mut some_iter, 10).
The min call is to replicate the behaviour of Iterator::skip and to avoid panics (skipping more elements than the iterator contains will just result in the 'return' value being empty).
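For example, skipping past the end just leaves the iterator empty (assuming the skip function above is in scope):

let data = [0u32, 1, 2];
let mut iter = data.iter();
skip(&mut iter, 10); // asking for more than the length doesn't panic
assert_eq!(iter.next(), None); // the iterator is simply empty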
To see it in practice, consider foo converted to use the new skip:
pub fn foo(xs: &[u32], x: usize) -> u32 {
    let mut iter = xs.iter();
    skip(&mut iter, x);
    *iter.next().unwrap()
}
It optimises to:
.LBB7_2:
pushq %rax
.Ltmp12:
.cfi_def_cfa_offset 16
movq 8(%rdi), %rax
cmpq %rsi, %rax
cmovbq %rax, %rsi
cmpq %rax, %rsi
je .LBB7_4
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
popq %rdx
retq
.LBB7_4:
movq _ZN6option15Option$LT$T$GT$6unwrap14_MSG_FILE_LINE20ha41302a4e895de223qFE@GOTPCREL(%rip), %rdi
callq _ZN9panicking5panic20h90c2ad20c9dac62bKRCE@PLT
Notably, no loops. It's not quite as short as the xs[x] implementation (asm below, for reference), but it's pretty close (2 extra instructions).
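For reference, the direct-indexing version is essentially the following (a sketch of the obvious xs[x] one-liner):

pub fn foo_index(xs: &[u32], x: usize) -> u32 {
    // the index operator does the bounds check and the load
    xs[x]
}

and its assembly is: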
.LBB5_2:
pushq %rax
.Ltmp8:
.cfi_def_cfa_offset 16
movq 8(%rdi), %rdx
cmpq %rsi, %rdx
jbe .LBB5_4
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
popq %rdx
retq
.LBB5_4:
leaq panic_bounds_check_loc1464(%rip), %rdi
callq _ZN9panicking18panic_bounds_check20h5ef74c98f9f69401jSCE@PLT
(In fact, I'd almost consider the difference an LLVM bug: it seems like it could do a much better job with the two cmps and a cmovbq.)
It is nice that it optimises pretty well, but, as the problem with the Iterator::skip method demonstrates, this can't be relied on. However, the as_slice approach is O(1) regardless of optimisation level.
I suspect slice::Iter could override the skip method to do a fast-forward and then return Skip { iter: self, n: 0 }, thus guaranteeing that skip on Iter is actually efficient. But this (like the above) feels like a bit of a hack, and it still causes the type to change, so it can't be used in place.
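From outside std that exact override isn't possible (Skip's fields are private, and user code can't specialise Iterator::skip for Iter), but the idea can be approximated with a helper along these lines (a sketch; eager_skip is a hypothetical name, not a library function):

use std::cmp;
use std::iter::Skip;
use std::slice::Iter;

pub fn eager_skip<'a, T>(iter: Iter<'a, T>, n: usize) -> Skip<Iter<'a, T>> {
    // do the fast-forward up front with slicing (O(1))...
    let s = iter.as_slice();
    // ...then wrap the result in a Skip that has nothing left to skip
    s[cmp::min(n, s.len())..].iter().skip(0)
}

Used like let mut it = eager_skip(xs.iter(), 10);, although, as noted, the result is a Skip<Iter<_>> rather than an Iter<_>, so it still can't be used in place.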
There is nothing in the library that makes this possible, but the unstable RandomAccessIterator::idx trait method may sometimes do what you want.
Examining the assembly code produced shows that the compiler can optimise at least some skips from O(n) down to O(1). As a trivial example, given x: &[u32], *x.iter().skip(5).next().unwrap() and x[5] produce the same assembly code. I'm not sure how thorough it will be in optimising skips, but it will definitely not be shabby. That's one of the nice things about optimising compilers: specialisations like the one you seek can be implemented in the compiler rather than in the code, which normally helps you avoid missing an optimisation where it can be done, but (because compilers aren't perfect) can occasionally lead to a missed optimisation that you expected to happen.
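For instance, comparing two functions like these (a sketch of that comparison; the names are just for illustration):

pub fn via_skip(x: &[u32]) -> u32 {
    *x.iter().skip(5).next().unwrap()
}

pub fn via_index(x: &[u32]) -> u32 {
    x[5]
}

with optimisations on, both come out as the same assembly.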