Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance Impact of Nested Vectors vs. Contiguous Arrays

Have there been any reliable tests that clearly display the performance differences between accessing and writing to nested vectors versus C++'s built-in arrays? I've heard that using nested (multi-dimensional) vectors typically have some performance overhead as compared to accessing elements in a single array (where all elements are stored in contiguous memory), but this just all seems to be hypothetical to me. I have yet to see any tests that actually show these differences. Are they significant? I'm sure that it depends on the scenario, but as an inexperienced programmer, I'm not quite sure at what level these differences do become significant.

like image 652
JohnTravolski Avatar asked Aug 18 '17 03:08

JohnTravolski


People also ask

Are vectors more efficient than arrays?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

What is advantage of using vectors instead of arrays?

Arrays cannot be returned unless dynamically allocated from a function whereas vectors can be returned from a function. Arrays cannot be copied or assigned directly whereas Vectors can be copied or assigned directly.

What are the disadvantages of vector over arrays?

Disadvantages of Vector A vector is an object, memory consumption is more.

Are arrays more efficient than vectors C++?

Array is memory efficient data structure. Vector takes more time in accessing elements. Array access elements in constant time irrespective of their location as elements are arranged in a contiguous memory allocation.


2 Answers

It definitely depends on the scenario, to the extent that I don't think it's possible to answer in a general way which approach is fastest. The fastest approach is going to be the one where the access patterns have the best data locality - which depends highly on the access pattern as well as how the structures are laid out in memory, which in the case of nested vectors is dependent on the allocator and probably varies quite a bit between compilers.

I'd follow the general rule of optimization, which is to first write things in the most straightforward way and then attempt optimization when you can prove there is a bottleneck.

like image 89
gcbenison Avatar answered Oct 14 '22 08:10

gcbenison


Two things contribute to the runtime differences between nested and flattened arrays:
Caching behaviour and indirection

  • CPUs use a hierarchy of Caches to avoid accessing the RAM directly too frequently. This exploits the fact that most memory accesses are either contiguous or have a certain temporal locality, i.e. what was accessed recently will be accessed again soon.
    This means that if the innermost arrays of your nested array is rather large, you will notice small to no differences to a flat array if you iterate over the values in a contiguous fashion. This means that when iterating over a range of values, for flat arrays, your innermost loop should iterate over consecutive elements, for nested arrays, your innermost loop should iterate over the innermost array.
  • If however your access patterns are random, the most important difference in timing are indirections:
    For a flat array, you use something like A[(Z * M + Y) * N + X], so you do 4 arithmetic operations and then a memory access.
    For a nested array, you use A[Z][Y][X], so there are actually three interdependent memory accesses: You need to know A[Z] before you can access A[Z][Y] and so on. Because of the superscalar architecture of modern CPUs, operations that can be executed in parallel are especially efficient, interdependent operations not so much. So you have some arithmetic operations and a memory load on the one side and three interdependent loads on the other side, which is significantly slower. It might be possible that for nested arrays, the contents of A and also A[Z] for some values of Z can be found in the cache hierarchy, but if your nested array is sufficiently large, it will never fit completely into the cache, thus leading to multiple cache misses and memory loads (nested) instead of just a single cache miss and load (flat) for a single random access into the array.

Also see his question, especially the shorter answers below for a more detailed discussion of caching (my answer) and indirection (Peter's answer), which also provides an example where there are no noticeable differences between nested and flat arrays (after fixing the indexing bug of course ;) )

So if you want to know if there are significant runtime differences between them, my answer would be:

  • If you do random access, you will definitely notice the multiple indirections, thus leading to a larger runtime for nested arrays.

  • If you do contiguous access and use the correct ordering of the loops (innermost loop = innermost array/innermost index for flat array) and your innermost dimension of the multi-dimensional array is large enough, then the difference will be negligable, since the compiler will be able to move all indirections out of the innermost loop.

like image 3
Tobias Ribizel Avatar answered Oct 14 '22 08:10

Tobias Ribizel