 

vector::operator[] overhead

Tags: c++, stl, vector

Apparently, after profiling my (scientific computation) C++ code, 25% (!) of the time is spent in calls to vector::operator[]. True, my code spends all of its time reading and writing vector<float>s (and a few vector<int>s too), but still, I'd like to know whether operator[] is supposed to carry any significant overhead compared to C-style arrays.

(I've seen another related question on SO regarding [] vs at() -- but apparently even [] is too slow for me?!)

Thanks, Antony

(edit: just for info: using g++ -O3 version 4.5.2 on Ubuntu)

asked May 24 '11 by antony


People also ask

Can you use delete [] on a vector?

Yes, but not without constraints. There are two ways of deleting a vector.

Is [] vector as fast as an array C++?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason this myth persists is examples that compare raw arrays with mis-used std::vectors.
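A typical example of such mis-use (my own illustration, not taken from the snippet above) is growing a vector element by element without reserving capacity, then blaming the resulting reallocations on the container:

#include <cstddef>
#include <vector>

// The "naive" version is slow because of repeated reallocations and copies,
// not because of operator[].
std::vector<float> build_naive(std::size_t n)
{
    std::vector<float> v;
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<float>(i));  // may reallocate many times
    return v;
}

std::vector<float> build_reserved(std::size_t n)
{
    std::vector<float> v;
    v.reserve(n);                            // one allocation up front
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<float>(i));
    return v;
}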

Does std::vector allocate on heap?

std::vector typically allocates memory on the heap (unless you override this behavior with your own allocator). The std::vector class abstracts memory management, as it grows and shrinks automatically if elements are added or removed.

What is an std::vector?

1) std::vector is a sequence container that encapsulates dynamic size arrays. 2) std::pmr::vector is an alias template that uses a polymorphic allocator. The elements are stored contiguously, which means that elements can be accessed not only through iterators, but also using offsets to regular pointers to elements.
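Because the storage is contiguous, a vector can be handed to code that expects a plain array; a minimal sketch (the function name is my own; assumes C++11 for data() and list-initialization):

#include <cassert>
#include <cstddef>
#include <vector>

// Sums n floats through a raw pointer, as a C-style routine would.
float sum_raw(const float* p, std::size_t n)
{
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += p[i];
    return total;
}

int main()
{
    std::vector<float> v{1.0f, 2.0f, 3.0f};
    assert(v.data() + 2 == &v[2]);  // contiguity: pointer offsets reach every element
    float total = sum_raw(v.data(), v.size());
    (void)total;
}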


2 Answers

std::vector::operator[] should be quite efficient; however, the compiler has to be paranoid: after every call to an unknown function it must assume that the vector could have been moved somewhere else in memory.

For example in this code

for (int i=0,n=v.size(); i<n; i++)
{
    total += v[i] + foo();
}

if the code of foo() isn't known in advance, the compiler is forced to reload the address of the vector's start on every iteration, because the vector could have been reallocated as a consequence of code inside foo().

If you know for sure that the vector is not going to be moved in memory or reallocated, then you can factor out this lookup operation with something like

double *vptr = &v[0]; // Address of first element
for (int i=0,n=v.size(); i<n; i++)
{
    total += vptr[i] + foo();
}

With this approach one memory lookup per iteration is saved (vptr is likely to end up in a register for the whole loop).
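For completeness, assuming C++11 or later, the same hoisting can be written with v.data() instead of &v[0] (which also works when the vector is empty); a sketch, with foo() standing in for some externally defined function the compiler cannot see through:

#include <cstddef>
#include <vector>

double foo();  // defined elsewhere; opaque to the compiler

double sum_with_hoisted_pointer(const std::vector<double>& v)
{
    double total = 0.0;
    const double* vptr = v.data();            // data pointer loaded once
    for (std::size_t i = 0, n = v.size(); i < n; ++i)
        total += vptr[i] + foo();             // no per-iteration reload of the vector's internals
    return total;
}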

Another possible reason for inefficiency is cache thrashing. To see if this is the problem, an easy trick is to over-allocate your vectors by some uneven number of elements.

The reason is that, because of how caching works, if you have many vectors with e.g. 4096 elements, all of them will happen to have the same low-order address bits, and you may end up losing a lot of speed because of cache line invalidations. For example, this loop on my PC

std::vector<double> v1(n), v2(n), v3(n), v4(n), v5(n);
for (int i=0; i<1000000; i++)
    for (int j=0; j<1000; j++)
    {
        v1[j] = v2[j] + v3[j];
        v2[j] = v3[j] + v4[j];
        v3[j] = v4[j] + v5[j];
        v4[j] = v5[j] + v1[j];
        v5[j] = v1[j] + v2[j];
    }

executes in about 8.1 seconds if n == 8191 and in 3.2 seconds if n == 10000. Note that the inner loop always runs from 0 to 999, independently of the value of n; what differs is only the memory addresses.

Depending on the processor/architecture, I've observed even 10x slowdowns because of cache thrashing.
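The over-allocation trick mentioned above could look something like this (my own illustration; the exact padding amounts are arbitrary, the point is only that the buffers no longer end up spaced an exact power of two apart):

// Same vectors as before, but each padded by a different odd number of
// extra elements; the inner loop still only touches indices 0..999.
std::vector<double> v1(n + 1), v2(n + 3), v3(n + 5), v4(n + 7), v5(n + 9);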

answered Oct 19 '22 by 6502


In a modern compiler, in release mode, with optimisations enabled, there is no overhead in using operator [] compared to raw pointers: the call is completely inlined and resolves to just a pointer access.
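One way to convince yourself of this is to compare the generated code (or the timings) for the two access styles; with optimisations enabled, the inner loops of both functions below typically compile to the same instructions (a sketch, not a rigorous benchmark):

#include <cstddef>
#include <vector>

float sum_subscript(const std::vector<float>& v)
{
    float total = 0.0f;
    for (std::size_t i = 0, n = v.size(); i < n; ++i)
        total += v[i];      // operator[] inlines to a plain pointer access
    return total;
}

float sum_pointer(const float* p, std::size_t n)
{
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += p[i];      // raw pointer access
    return total;
}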

I'm guessing that you are somehow copying the return value in the assignment, and that this is what actually accounts for the 25% of time attributed to the call. [Not relevant for float and int.]

Or the rest of your code is simply blazingly fast.

answered Oct 19 '22 by Konrad Rudolph