From the Rust standard library implementation of unzip
:
fn unzip<A, B, FromA, FromB>(self) -> (FromA, FromB) where
FromA: Default + Extend<A>,
FromB: Default + Extend<B>,
Self: Sized + Iterator<Item=(A, B)>,
{
struct SizeHint<A>(usize, Option<usize>, marker::PhantomData<A>);
impl<A> Iterator for SizeHint<A> {
type Item = A;
fn next(&mut self) -> Option<A> { None }
fn size_hint(&self) -> (usize, Option<usize>) {
(self.0, self.1)
}
}
let (lo, hi) = self.size_hint();
let mut ts: FromA = Default::default();
let mut us: FromB = Default::default();
ts.extend(SizeHint(lo, hi, marker::PhantomData));
us.extend(SizeHint(lo, hi, marker::PhantomData));
for (t, u) in self {
ts.extend(Some(t));
us.extend(Some(u));
}
(ts, us)
}
These two lines:
ts.extend(SizeHint(lo, hi, marker::PhantomData));
us.extend(SizeHint(lo, hi, marker::PhantomData));
don't actually extend ts
or us
by anything, since the next
method of SizeHint
returns None
. What's the purpose of doing so?
It is a cool trick. By giving this size hint, it gives ts
and us
the chance to reserve the space for the extend
calls in the loop. According to the documentation
size_hint()
is primarily intended to be used for optimizations such as reserving space for the elements of the iterator, but must not be trusted to e.g. omit bounds checks in unsafe code. An incorrect implementation ofsize_hint()
should not lead to memory safety violations.
Note that the creation of SizeHint
is necessary because the extend
call in for loop is made with a Some
value (Optional
implements the Iterator
trait), and the size_hint
for a Some
value is (1, Some(1))
. That doesn't help with pre allocation.
But looking at the code for Vec
, this will have no effect (neither in HashMap
and VecDeque
). Others Extend
implementations may be different.
The execution of ts.extend(SizeHint(lo, hi, marker::PhantomData));
does not trigger a resize
, since next
returns None
. Maybe some one should write a patch.
impl<T> Vec<T> {
fn extend_desugared<I: Iterator<Item = T>>(&mut self, mut iterator: I) {
// This function should be the moral equivalent of:
//
// for item in iterator {
// self.push(item);
// }
while let Some(element) = iterator.next() {
let len = self.len();
if len == self.capacity() {
let (lower, _) = iterator.size_hint();
self.reserve(lower.saturating_add(1));
}
unsafe {
ptr::write(self.get_unchecked_mut(len), element);
// NB can't overflow since we would have had to alloc the address space
self.set_len(len + 1);
}
}
}
}
It's a dubious hack!
It implements an iterator with a fake (overestimated) size hint to encourage the produced collection to reserve the eventually appropriate capacity up front.
Cool trick but, it does so by implementing a size hint where the estimated lower bound is greater than the actual number of elements produced (0). If the lower bound is not known, the iterator should return a lower bound of 0. This implementation is arguably very buggy for this reason, and the collection's Extend impl may react with bugginess as a result (but of course not memory unsafety.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With