Some example code (playpen):
fn main() {
    let data = [0, 1, 2, 3, 4];
    let mut iter = data.iter();
    println!("{}", iter.next().unwrap());
    println!("{}", iter.skip(3).next().unwrap());
}
This prints 0 and 4, as expected.
I'm curious whether the skip operation is constant-time for the slice iterator. Grepping through the source, I only found this implementation of skip, which leads down to the Skip struct's Iterator implementation.
That appears to be a generic O(n) skip, and I can't see any specialization for pointer-based iterators that could just do pointer arithmetic.
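For context, a fully generic skip can only advance element by element; conceptually something like this simplified sketch (not the actual std source):

fn generic_skip<I: Iterator>(mut iter: I, n: usize) -> I {
    // With no knowledge of the concrete iterator, the only option is to
    // call next() n times, i.e. O(n) work.
    for _ in 0..n {
        if iter.next().is_none() {
            break;
        }
    }
    iter
}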
Am I missing something about the implementation of skip? Or is there some other way to do this?
There's no fast-forward built into std yet.
As Chris demonstrates, it can be implemented with skip, and this sometimes optimises to O(1). Unfortunately, the optimisation doesn't always happen: in my experimentation, a function like
pub fn foo(xs: &[u32], x: usize) -> u32 {
    *xs.iter().skip(x).next().unwrap()
}
optimises (opt-level 3) to
.LBB6_2:
pushq %rax
.Ltmp10:
.cfi_def_cfa_offset 16
movq (%rdi), %rdx
movq 8(%rdi), %rdi
xorl %eax, %eax
testq %rdi, %rdi
movq %rdx, %rcx
je .LBB6_4
leaq 4(%rdx), %rcx
movq %rdx, %rax
.LBB6_4:
testq %rsi, %rsi
je .LBB6_9
leaq (%rdx,%rdi,4), %rdx
.align 16, 0x90
.LBB6_6:
testq %rax, %rax
je .LBB6_12
decq %rsi
cmpq %rdx, %rcx
movq %rdx, %rdi
movl $0, %eax
je .LBB6_8
leaq 4(%rcx), %rdi
movq %rcx, %rax
.LBB6_8:
testq %rsi, %rsi
movq %rdi, %rcx
jne .LBB6_6
.LBB6_9:
testq %rax, %rax
je .LBB6_12
movl (%rax), %eax
popq %rdx
retq
.LBB6_12:
movq _ZN6option15Option$LT$T$GT$6unwrap14_MSG_FILE_LINE20ha41302a4e895de223qFE@GOTPCREL(%rip), %rdi
callq _ZN9panicking5panic20h90c2ad20c9dac62bKRCE@PLT
Of particular note is the .LBB6_6: ... jne .LBB6_6 sequence: it's a loop.
Fortunately there's at least one way out of this, and it has an additional useful property: it doesn't require changing the type, and so can be used directly in place.
The slice iterator can be converted back into the slice it represents with as_slice: Iter<T> and &[T] are actually isomorphic; they differ mainly for optimisation reasons. Once we have the slice, we can slice and dice it to obtain a shorter region of memory, and then create an iterator over only those elements. The lifetimes all work out, and we're left with exactly the same type, just without a few elements.
use std::slice::Iter;
use std::cmp;

pub fn skip(iter: &mut Iter<u32>, x: usize) {
    let s = iter.as_slice();
    *iter = s[cmp::min(x, s.len())..].iter();
}
Used like skip(&mut some_iter, 10).
The min call is to replicate the behaviour of Iterator::skip and to avoid panics (skipping more elements than the iterator contains will just result in the 'return' value being empty).
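For example, skipping past the end just leaves the iterator empty (assuming the skip function above is in scope):

let data = [0u32, 1, 2];
let mut iter = data.iter();
skip(&mut iter, 10); // asking for more than the length doesn't panic
assert_eq!(iter.next(), None); // the iterator is simply empty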
To see it in practice, consider foo converted to use the new skip:
pub fn foo(xs: &[u32], x: usize) -> u32 {
    let mut iter = xs.iter();
    skip(&mut iter, x);
    *iter.next().unwrap()
}
It optimises to:
.LBB7_2:
pushq %rax
.Ltmp12:
.cfi_def_cfa_offset 16
movq 8(%rdi), %rax
cmpq %rsi, %rax
cmovbq %rax, %rsi
cmpq %rax, %rsi
je .LBB7_4
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
popq %rdx
retq
.LBB7_4:
movq _ZN6option15Option$LT$T$GT$6unwrap14_MSG_FILE_LINE20ha41302a4e895de223qFE@GOTPCREL(%rip), %rdi
callq _ZN9panicking5panic20h90c2ad20c9dac62bKRCE@PLT
Notably, no loops. It's not quite as short as the xs[x] implementation (asm below, for reference), but it's pretty close (2 extra instructions).
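For reference, the direct-indexing version is essentially the following (a sketch of the obvious xs[x] one-liner):

pub fn foo_index(xs: &[u32], x: usize) -> u32 {
    // the index operator does the bounds check and the load
    xs[x]
}

and its assembly is: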
.LBB5_2:
pushq %rax
.Ltmp8:
.cfi_def_cfa_offset 16
movq 8(%rdi), %rdx
cmpq %rsi, %rdx
jbe .LBB5_4
movq (%rdi), %rax
movl (%rax,%rsi,4), %eax
popq %rdx
retq
.LBB5_4:
leaq panic_bounds_check_loc1464(%rip), %rdi
callq _ZN9panicking18panic_bounds_check20h5ef74c98f9f69401jSCE@PLT
(In fact, I'd almost consider the difference an LLVM bug: it seems like it could do a much better job with the two cmps and a cmovbq.)
It is nice that it optimises pretty well, but, as the problem with the Iterator::skip method demonstrates, this can't be relied on. However, the as_slice approach is O(1) regardless of optimisation level.
I suspect slice::Iter could override the skip method to do a fast-forward and then return Skip { iter: self, n: 0 }, thus guaranteeing that skip on Iter is actually efficient. But this (like the above) feels like a bit of a hack, and it still causes the type to change, so it can't be used in place.
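From outside std that exact override isn't possible (Skip's fields are private, and user code can't specialise Iterator::skip for Iter), but the idea can be approximated with a helper along these lines (a sketch; eager_skip is a hypothetical name, not a library function):

use std::cmp;
use std::iter::Skip;
use std::slice::Iter;

pub fn eager_skip<'a, T>(iter: Iter<'a, T>, n: usize) -> Skip<Iter<'a, T>> {
    // do the fast-forward up front with slicing (O(1))...
    let s = iter.as_slice();
    // ...then wrap the result in a Skip that has nothing left to skip
    s[cmp::min(n, s.len())..].iter().skip(0)
}

Used like let mut it = eager_skip(xs.iter(), 10);, although, as noted, the result is a Skip<Iter<_>> rather than an Iter<_>, so it still can't be used in place.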
There is nothing in the library that makes this possible, but the unstable RandomAccessIterator::idx trait method may sometimes do what you want.
Examining the assembly code produced shows that the compiler can optimise at least some skips from O(n) down to O(1). As a trivial example, given x: &[u32], *x.iter().skip(5).next().unwrap() and x[5] produce the same assembly code. I'm not sure how thorough it will be in optimising skips, but it will definitely not be shabby. That's one of the nice things about optimising compilers: specialisations like the one you seek can be implemented in the compiler rather than in the code, which normally helps you avoid missing an optimisation where it can be done, but (because compilers aren't perfect) can occasionally lead to a missed optimisation that you expected to happen.
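For instance, comparing two functions like these (a sketch of that comparison; the names are just for illustration):

pub fn via_skip(x: &[u32]) -> u32 {
    *x.iter().skip(5).next().unwrap()
}

pub fn via_index(x: &[u32]) -> u32 {
    x[5]
}

with optimisations on, both come out as the same assembly.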