Consider the following code:
let u: Vec<u8> = (64..74).collect();
let v: Vec<u8> = u.iter().map(|i| i + 1).collect();
u
was not moved, therefore v
was inevitably newly allocated.
But if I do the following:
let w: Vec<u8> = u.into_iter().map(|i| i + 1).collect();
u
was moved and w
is the name of its transformation. Here is some pseudo-code representing what I mean:
mark u as "moved"
for i = 0..10:
u[i] += 1
w = u
There is (in my opinion) no need for a new allocation, since we map a type to itself. This wouldn't be the case for this code:
let t: Vec<u8> = (64..74).collect();
let s: String = t.into_iter().map(|i| i as char).collect();
To summarize my question
Is there an allocation of a new Vec
when we convert a Vec
into an iterator and then map this iterator to an iterator on elements of the same type and then collect the result into a Vec
?
And if there is indeed an allocation, why?
I tried to --emit=mir
, but I wasn't able to find the answer. I'm using rustc 1.20 nightly (if that matters).
If you want to play with code: Try it online!
Interestingly enough, the order in which three vectors are added has no effect upon either the magnitude or the direction of the resultant. The resultant will still have the same magnitude and direction. For example, consider the addition of the same three vectors in a different order.
One such operation is the addition of vectors. Two vectors can be added together to determine the result (or resultant). This process of adding two or more vectors has already been discussed in an earlier unit.
Two vectors can be added together to determine the result (or resultant). This process of adding two or more vectors has already been discussed in an earlier unit.
If there is ever a need to exchange the contents of vectors, there's another simple function in the form of std::vector::swap. It takes a vector as an argument which is passed by reference, and the vectors have their contents exchanged. The vector passed in shouldn't, therefore, be const.
Let's see the source of the implementation of into_iter()
for Vec<T>
:
fn into_iter(mut self) -> IntoIter<T> {
unsafe {
let begin = self.as_mut_ptr();
assume(!begin.is_null());
let end = if mem::size_of::<T>() == 0 {
arith_offset(begin as *const i8, self.len() as isize) as *const T
} else {
begin.offset(self.len() as isize) as *const T
};
let cap = self.buf.cap();
mem::forget(self);
IntoIter {
buf: Shared::new(begin),
cap: cap,
ptr: begin,
end: end,
}
}
}
Creating the IntoIter
iterator incurs several extra allocations, but not for the elements of the vector; instead, the vector's underlying memory details are registered. How about the code behind map()
?
fn map<B, F>(self, f: F) -> Map<Self, F> where
Self: Sized, F: FnMut(Self::Item) -> B,
{
Map{iter: self, f: f}
}
No extra vectors allocated here either. The last piece of the puzzle is collect()
:
fn collect<B: FromIterator<Self::Item>>(self) -> B where Self: Sized {
FromIterator::from_iter(self)
}
No answers here; what about the implementation of from_iter()
for Vec<T>
?
impl<T> FromIterator<T> for Vec<T> {
#[inline]
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Vec<T> {
<Self as SpecExtend<T, I::IntoIter>>::from_iter(iter.into_iter())
}
}
This is beginning to look like magic, but perhaps the related SpecExtend code will reveal what we're looking for:
impl<T, I> SpecExtend<T, I> for Vec<T>
where I: Iterator<Item=T>,
{
default fn from_iter(mut iterator: I) -> Self {
// Unroll the first iteration, as the vector is going to be
// expanded on this iteration in every case when the iterable is not
// empty, but the loop in extend_desugared() is not going to see the
// vector being full in the few subsequent loop iterations.
// So we get better branch prediction.
let mut vector = match iterator.next() {
None => return Vec::new(),
Some(element) => {
let (lower, _) = iterator.size_hint();
let mut vector = Vec::with_capacity(lower.saturating_add(1));
unsafe {
ptr::write(vector.get_unchecked_mut(0), element);
vector.set_len(1);
}
vector
}
};
<Vec<T> as SpecExtend<T, I>>::spec_extend(&mut vector, iterator);
vector
}
default fn spec_extend(&mut self, iter: I) {
self.extend_desugared(iter)
}
}
In this code we can finally see the Vec::new
and Vec::with_capacity
methods called which allocate fresh space for the resulting vector.
TL;DR: no, it is not possible to move and modify a vector without an extra allocation.
Is there an allocation of a new Vec when we convert a Vec into an iterator and then map this iterator to an iterator on elements of the same type and then collect the result into a Vec?
Yes, the necessary optimization to avoid that allocation is simply too high-level to perform for any of the compiler tiers since they would have to reason about heap allocations and the pointer magic going on in the unsafe code blocks that underpin the Vec
implementation. Doubly so in the general case where drop-handlers, panics inside the map
and similar things come into play.
It might be possible to optimize at the library level by specializingcollect() -> Vec<T>
for specific types such as std::iter::Map<std::vec::IntoIter<T>, _>
, but those optimizations would have to be applied on a case-by-case basis where it is known to be safe.
if I understand correctly, your code is approximately the Rust translation of my pseudo-code, without the original "functional style".
For the functional style there is a plan to add for_each to iterators, so if you take an iter_mut something equivalent could be achieved with far fewer heroics on the library or compiler side.
And it would actually reflect the intent of doing something in place more directly instead of just doing some optimization magically which then could surprisingly break under slight modification.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With