It can be useful to iterate over multiple variables at once, overlapping (slice::windows
), or not (slice::chunks
).
This only works for slices; is it possible to do this for iterators, using tuples for convenience?
Something like the following could be written:
for (prev, next) in some_iter.windows(2) {
...
}
If not, could it be implemented as a trait on existing iterators?
It's possible to take chunks of an iterator using Itertools::tuples
, up to a 4-tuple:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuples() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
3--4
5--6
If you don't know that your iterator exactly fits into the chunks, you can use Tuples::into_buffer
to access any leftovers:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5].into_iter();
let mut t = some_iter.tuples();
for (prev, next) in t.by_ref() {
println!("{}--{}", prev, next);
}
for leftover in t.into_buffer() {
println!("{}", leftover);
}
}
(playground)
1--2
3--4
5
It's also possible to take up to 4-tuple windows with Itertools::tuple_windows
:
use itertools::Itertools; // 0.9.0
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for (prev, next) in some_iter.tuple_windows() {
println!("{}--{}", prev, next);
}
}
(playground)
1--2
2--3
3--4
4--5
5--6
If you need to get partial chunks / windows, you can get
TL;DR: The best way to have chunks
and windows
on an arbitrary iterator/collection is to first collect
it into a Vec
and iterate over that.
The exact syntax requested is impossible in Rust.
The issue is that in Rust, a function's signature is depending on types, not values, and while Dependent Typing exists, there are few languages that implement it (it's hard).
This is why chunks
and windows
return a sub-slice by the way; the number of elements in a &[T]
is not part of the type and therefore can be decided at run-time.
Let's pretend you asked for: for slice in some_iter.windows(2)
instead then.
Where would the storage backing this slice live?
It cannot live:
LinkedList
doesn't have a contiguous storage Iterator::Item
, there is no lifetime availableSo, unfortunately, slices can only be used when the backing storage is a slice.
If dynamic allocations are accepted, then it is possible to use Vec<Iterator::Item>
as the Item
of the chunking iterator.
struct Chunks<I: Iterator> {
elements: Vec<<I as Iterator>::Item>,
underlying: I,
}
impl<I: Iterator> Chunks<I> {
fn new(iterator: I, size: usize) -> Chunks<I> {
assert!(size > 0);
let mut result = Chunks {
underlying: iterator, elements: Vec::with_capacity(size)
};
result.refill(size);
result
}
fn refill(&mut self, size: usize) {
assert!(self.elements.is_empty());
for _ in 0..size {
match self.underlying.next() {
Some(item) => self.elements.push(item),
None => break,
}
}
}
}
impl<I: Iterator> Iterator for Chunks<I> {
type Item = Vec<<I as Iterator>::Item>;
fn next(&mut self) -> Option<Self::Item> {
if self.elements.is_empty() {
return None;
}
let new_elements = Vec::with_capacity(self.elements.len());
let result = std::mem::replace(&mut self.elements, new_elements);
self.refill(result.len());
Some(result)
}
}
fn main() {
let v = vec!(1, 2, 3, 4, 5);
for slice in Chunks::new(v.iter(), 2) {
println!("{:?}", slice);
}
}
Will return:
[1, 2] [3, 4] [5]
The canny reader will realize that I surreptitiously switched from windows
to chunks
.
windows
is more difficult, because it returns the same element multiple times which require that the element be Clone
. Also, since it needs returning a full Vec
each time, it will need internally to keep a Vec<Vec<Iterator::Item>>
.
This is left as an exercise to the reader.
Finally, a note on performance: all those allocations are gonna hurt (especially in the windows
case).
The best allocation strategy is generally to allocate a single chunk of memory and then live off that (unless the amount is really massive, in which case streaming is required).
It's called collect::<Vec<_>>()
in Rust.
And since the Vec
has a chunks
and windows
methods (by virtue of implementing Deref<Target=[T]>
), you can then use that instead:
for slice in v.iter().collect::<Vec<_>>().chunks(2) {
println!("{:?}", slice);
}
for slice in v.iter().collect::<Vec<_>>().windows(2) {
println!("{:?}", slice);
}
Sometimes the best solutions are the simplest.
Since Rust 1.51 this is possible with const generics where the iterator yields constant size arrays [T; N]
for any N
.
I built the little itermore
crate which does this. It provides .chunks()
and .windows()
methods for any iterator.
for [a, b, c] in some_iter.chunks() {
...
}
for [prev, next] in some_iter.windows() {
...
}
Using the example given in the Itertools
answer:
use itermore::IterMore;
fn main() {
let some_iter = vec![1, 2, 3, 4, 5, 6].into_iter();
for [prev, next] in some_iter.chunks() {
println!("{}--{}", prev, next);
}
}
This outputs
1--2
3--4
5--6
Most times the array size can be inferred but you can also specific it explicitly. Additionally, any reasonable size N
can be used, there is no limit like in the Itertools
case.
use itermore::IterMore;
fn main() {
let mut iter = vec![1, 2, 3, 4, 5, 6].into_iter().windows::<5>();
println!("{:?}", iter.next());
println!("{:?}", iter.next());
println!("{:?}", iter.next());
}
This outputs
Some([1, 2, 3, 4, 5])
Some([2, 3, 4, 5, 6])
None
Note: .windows()
uses clone to yield elements multiple times so its best used for references and cheap to copy types.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With