Let's say I have to iterate over a potentially very large vector of numbers and copy the even and odd elements into new, separate vectors. (The source vector may have any proportion of evens to odds; it could be all evens, all odds, or somewhere in-between.)
For simplicity, push_back is often used for this sort of thing:
for (std::size_t Index = 0; Index < Source.size(); Index++)
{
    if (Source[Index] % 2) Odds.push_back(Source[Index]);
    else Evens.push_back(Source[Index]);
}
However, I'm worried that this will be inefficient and harmful if it's used as part of the implementation of something like a sorting algorithm, where performance is paramount. QuickSort, for example, involves separating elements much like this.
You could use reserve() to allocate memory beforehand so only one allocation is needed per output vector, but then you have to iterate over the entire source vector twice - once to count how many elements go to each vector, and once more for the actual copying.
You could, of course, allocate the same amount of space as the source vector's size, since neither new vector will need to hold more than that, but that seems somewhat wasteful.
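For illustration, here is a minimal sketch of that two-pass approach; the function name SplitExact and the int element type are placeholders chosen for this example, not anything prescribed above:

#include <algorithm>
#include <cstddef>
#include <vector>

// Two passes: count first so each output vector allocates exactly once.
void SplitExact(const std::vector<int>& Source,
                std::vector<int>& Evens, std::vector<int>& Odds)
{
    // First pass: count the odd elements.
    const std::size_t OddCount = static_cast<std::size_t>(
        std::count_if(Source.begin(), Source.end(),
                      [](int Value) { return Value % 2 != 0; }));

    Odds.reserve(OddCount);
    Evens.reserve(Source.size() - OddCount);

    // Second pass: copy into the pre-sized vectors; no reallocation occurs.
    for (int Value : Source)
    {
        if (Value % 2 != 0) Odds.push_back(Value);
        else Evens.push_back(Value);
    }
}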
Is there a better method that I'm missing? Is push_back() usually trusted to manage this sort of thing for the programmer, or can it become burdensome for sensitive algorithms?
I'm going to answer the question I think you really meant to ask, which is "should push_back() be avoided in the inner loops of heavy algorithms?" rather than what others seem to have read into your post, which is "does it matter if I call push_back before doing an unrelated sort on a large vector?" Also, I'm going to answer from my experience rather than spend time chasing down citations and peer-reviewed articles.
Your example is basically doing two things that add up to the total CPU cost: it's reading and operating on elements in the input vector, and then it has to insert the elements into the output vector. You're concerned about the cost of inserting elements because a push_back that outgrows the current capacity has to allocate a larger buffer and copy every element already stored into it (and malloc() is just slow, even when pedants pretend that new is something different).

Your instinct, therefore, is correct: always pre-reserve space for your vectors where possible, not because push_back is slow, but because it can trigger a reallocation that is slow. Also, if you look at the implementation of shrink_to_fit, you'll see it also does a copying reallocation, temporarily doubling your memory cost and causing further fragmentation.
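As a rough illustration (not a benchmark), you can watch capacity() to see how many reallocations push_back triggers with and without a pre-reservation; the element count below is an arbitrary value chosen for this sketch:

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    const int Count = 100000; // arbitrary size for illustration

    // Without reserve(): capacity grows geometrically, so push_back
    // reallocates (and copies the existing elements) several times.
    std::vector<int> Grown;
    std::size_t Reallocations = 0;
    for (int i = 0; i < Count; ++i)
    {
        const std::size_t Before = Grown.capacity();
        Grown.push_back(i);
        if (Grown.capacity() != Before) ++Reallocations;
    }
    std::cout << "reallocations without reserve: " << Reallocations << '\n';

    // With reserve(): one allocation up front, none afterwards.
    std::vector<int> Reserved;
    Reserved.reserve(Count);
    Reallocations = 0;
    for (int i = 0; i < Count; ++i)
    {
        const std::size_t Before = Reserved.capacity();
        Reserved.push_back(i);
        if (Reserved.capacity() != Before) ++Reallocations;
    }
    std::cout << "reallocations with reserve: " << Reallocations << '\n';
}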
Your problem here is that you don't always know exactly how much space you'll need for the output vectors; the usual response is to use a heuristic. Reserve n/2 + k elements for each of your output vectors by default, where n is the input size and k is some safety margin. That way you'll usually have enough space for the output, so long as your input is reasonably balanced, and push_back can reallocate in the rare cases where it's not. If you find that push_back's geometric growth is wasting too much memory (causing you to reserve 2n elements when really you just needed n+2), you can manage the growth yourself, enlarging the vectors in smaller, linear chunks with explicit reserve calls - but of course that will be much slower in cases where the vectors are really unbalanced and you end up doing lots of resizes.
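A sketch of that heuristic follows; the margin K and the helper name SplitHeuristic are arbitrary choices for this example, not tuned values:

#include <cstddef>
#include <vector>

// Heuristic pre-reservation: assume a roughly even split plus a safety margin.
void SplitHeuristic(const std::vector<int>& Source,
                    std::vector<int>& Evens, std::vector<int>& Odds)
{
    constexpr std::size_t K = 16; // arbitrary safety margin
    const std::size_t Guess = Source.size() / 2 + K;

    Evens.reserve(Guess);
    Odds.reserve(Guess);

    // Single pass; push_back only reallocates if the input is badly unbalanced.
    for (int Value : Source)
    {
        if (Value % 2 != 0) Odds.push_back(Value);
        else Evens.push_back(Value);
    }
}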
There's no way to always reserve the exact right amount of space without walking the input elements in advance; but if you know what the balance usually looks like, you can use a heuristic to make a good guess at it for a statistical performance gain over many iterations.