For efficiency reasons, I always avoid writing loops like this:
for(std::size_t i = 0; i < vec.size(); ++i) { ... }
where vec
is an STL container. Instead, I either do
const std::size_t vec_size = vec.size();
for(std::size_t i = 0; i < vec_size; ++i) { ... }
or use the container iterators.
But how bad is the first solution really? I remember reading in Meyers that it will be quadratic instead of linear because the vector doesn't know its size and repeatedly has to count. But won't modern compilers detect this and optimize it away?
vector::size()
is constant-time and usually implemented as a trivial inline function that is optimised away. Don't bother hand-optimising it.
I remember reading in Meyers that it will be quadratic instead of linear because the vector doesn't know its size and repeatedly has to count.
You're getting vector
and list
confused. vector
's size value is held in the vector; list
's requires transversal of the actual list.
The easiest way to tell if something is being optimized out by the compiler is to compare the assembly-language compiler output.
That said, the two chunks of code are not actually equivalent. What if the size of the vector changes while you're iterating over it? The compiler would have to be very, very smart to prove conclusively that the vector's size could not change.
Now, in the real world, is this tiny optimization really worth the extra effort? The vec.size()
just returns a stored value. It doesn't re-count the length.
Consider the following stupid function:
void sum (vector<int>& vec, int* sumOut)
{
*sumOut = 0;
for(std::size_t i = 0; i < vec.size(); ++i)
{
*sumOut += vec[i];
}
}
The actual assembly generated will depend on the compiler and implementation of vector
, but I think in most cases, the compiler has to re-read the vector
's size from memory each time through the loop. This is because the sumOut
pointer could potentially overlap (alias) the vector's internal storage of the size (assuming the vector
stores its size in an int), so the size could be changed by the loop. If you call a function like this a lot, it could add up to a lot of cycles because you're touching memory more than you need.
Three possible solutions are:
Store the size in a local variable. Ideally, the size this will get stored in a register and avoid touching memory altogether. Even if it has to get put on the stack, the compiler should be able to order the loads/stores more efficiently.
Use __restrict
on the output
pointer. This tells the compiler
that the pointer can't possibly
overlap anything else, so writes to
it don't require reloading anything
else.
Reverse the loop. The termination
condition now checks against 0
instead, so vec.size()
is never
called again.
Of those, I think #1 is the cleanest, but some people might prefer #3. #2 is the probably least reader-friendly, but might be faster than the others (because it means the vector's data could be read more efficiently).
For more info on aliasing, see Christer Ericson's GDC presentation on memory optimization; there's an example almost identical to this in there.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With