What is the most efficient way of storing data: a multi-dimensional array or a single array?

Essentially, I'm not sure how to store a 3D data structure for the fastest possible access, because I don't know what is going on under the hood with multi-dimensional arrays.

NOTE: The arrays will be a constant and known size each and every time, and each element will be exactly 16 bits.

Option one is to have a multi-dimensional array data[16, 16, 16] and simply access it via data[x, y, z]; option two is to have a single-dimensional array data[16 * 16 * 16] and access it via data[x + (y * 16) + (z * 16 * 16)].
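In code, the two options I'm weighing look like this (using ushort on the assumption that "exactly 16 bits" means an unsigned 16-bit value):

```csharp
using System;

class Options
{
    const int Size = 16;

    static void Main()
    {
        int x = 3, y = 5, z = 7;   // arbitrary coordinates, just for illustration

        // Option one: a rectangular multi-dimensional array, indexed directly.
        var multi = new ushort[Size, Size, Size];
        multi[x, y, z] = 42;

        // Option two: a flat array, indexed with a manually computed offset.
        var flat = new ushort[Size * Size * Size];
        flat[x + (y * Size) + (z * Size * Size)] = 42;

        Console.WriteLine(multi[x, y, z]);                             // 42
        Console.WriteLine(flat[x + (y * Size) + (z * Size * Size)]);   // 42
    }
}
```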

Each element should only be 16 bits long, and I have a suspicion that a multi-dimensional array would internally store a lot of references to other arrays, at a minimum of 32 bits each, which would be a lot of wasted memory. However, I fear it may still be faster than running the equation specified in option two on every access, and speed is key to this project.

So, can anyone enlighten me as to how much difference in speed there is likely to be, compared to how much difference in memory consumption?

asked Mar 12 '23 by Benjamin James Drury

1 Answer

C# stores multidimensional arrays as a single block of memory, so they compile to almost the same thing. (One difference is that there are three sets of bounds to check).

I.e. arr[x, y, z] is just about equivalent to arr[x + y*nx + z*nx*ny] (where nx and ny are the lengths of the first two dimensions), and will generally have similar performance characteristics.
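For what it's worth, the CLR lays a [,,] array out row-major with the rightmost index varying fastest, which is the mirror image of the x-fastest formula in the question; the index arithmetic costs the same either way. A minimal sketch that makes the single-block layout visible, using ushort elements and deliberately small, unequal sizes so the offsets are easy to check:

```csharp
using System;

class LayoutDemo
{
    static void Main()
    {
        const int nx = 4, ny = 3, nz = 2;   // small, unequal sizes so mistakes would show up
        var arr = new ushort[nx, ny, nz];

        int x = 2, y = 1, z = 1;
        arr[x, y, z] = 42;

        // Copy the multi-dimensional array's raw storage into a flat array.
        var flat = new ushort[nx * ny * nz];
        Buffer.BlockCopy(arr, 0, flat, 0, flat.Length * sizeof(ushort));

        // Row-major layout: the rightmost index varies fastest.
        int offset = (x * ny + y) * nz + z;
        Console.WriteLine(flat[offset]);    // 42
    }
}
```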

The exact performance, however, will be dominated by the pattern of memory access and how well it exploits cache locality (at least for large amounts of data). You may find that nesting the loops over x, then y, then z is faster or slower than running the loops in a different order, depending on which order does a better job of keeping currently used data in the processor cache.

This is highly dependent on the exact algorithm, so it isn't possible to give an answer which is correct for all algorithms.
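To make the loop-order point concrete, here is a rough sketch (the ushort element type is carried over from the question, summing every element is just a stand-in for a real algorithm, and N is made larger than 16 so the difference is measurable):

```csharp
using System;
using System.Diagnostics;

class LoopOrderDemo
{
    const int N = 256;   // deliberately larger than 16, purely for illustration

    static void Main()
    {
        var data = new ushort[N, N, N];
        var sw = new Stopwatch();

        // Innermost loop varies the last index: walks the array's memory contiguously.
        sw.Restart();
        long sum1 = 0;
        for (int x = 0; x < N; x++)
            for (int y = 0; y < N; y++)
                for (int z = 0; z < N; z++)
                    sum1 += data[x, y, z];
        Console.WriteLine($"last index innermost:  {sw.ElapsedMilliseconds} ms (sum {sum1})");

        // Innermost loop varies the first index: jumps N*N elements per step.
        sw.Restart();
        long sum2 = 0;
        for (int z = 0; z < N; z++)
            for (int y = 0; y < N; y++)
                for (int x = 0; x < N; x++)
                    sum2 += data[x, y, z];
        Console.WriteLine($"first index innermost: {sw.ElapsedMilliseconds} ms (sum {sum2})");
    }
}
```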

The other cause of any speed reduction versus C or C++ is bounds checking, which will still be needed in the one-dimensional array case. However, these checks will often, but not always, be removed automatically.

  • https://blogs.msdn.microsoft.com/clrcodegeneration/2009/08/13/array-bounds-check-elimination-in-the-clr/

Again, the exact algorithm will affect whether the optimiser is able to remove the bounds checks.
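A rough sketch of the kind of pattern that article is about (the method names are mine, and "typically" is doing real work here: whether a check is actually removed depends on the JIT version and the exact code):

```csharp
static class BoundsCheckSketch
{
    // Bounds check typically eliminated: the JIT can prove i < flat.Length for every access.
    public static long SumLinear(ushort[] flat)
    {
        long sum = 0;
        for (int i = 0; i < flat.Length; i++)
            sum += flat[i];
        return sum;
    }

    // Computed 3D index: the JIT usually cannot prove the index is in range,
    // so a bounds check tends to remain on each access.
    public static long SumComputed(ushort[] flat, int size)
    {
        long sum = 0;
        for (int z = 0; z < size; z++)
            for (int y = 0; y < size; y++)
                for (int x = 0; x < size; x++)
                    sum += flat[x + y * size + z * size * size];
        return sum;
    }
}
```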

Your course of action should be as follows:

  • Write a naïve version of the algorithm with arr[x,y,z].
  • If it's fast enough you can stop.
  • Otherwise, profile the algorithm to check that it is actually the array accesses which are the issue, analyse the memory access patterns, and so on (a minimal benchmark sketch follows this list).
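For that profiling step, one rough, hedged sketch using BenchmarkDotNet (a third-party NuGet package; the class and method names below are mine) that compares the two access patterns directly instead of guessing:

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ArrayAccessBench
{
    private const int Size = 16;
    private readonly ushort[,,] _multi = new ushort[Size, Size, Size];
    private readonly ushort[] _flat = new ushort[Size * Size * Size];

    [Benchmark]
    public long MultiDimensional()
    {
        long sum = 0;
        for (int z = 0; z < Size; z++)
            for (int y = 0; y < Size; y++)
                for (int x = 0; x < Size; x++)
                    sum += _multi[x, y, z];     // rectangular array access
        return sum;
    }

    [Benchmark]
    public long FlatComputedIndex()
    {
        long sum = 0;
        for (int z = 0; z < Size; z++)
            for (int y = 0; y < Size; y++)
                for (int x = 0; x < Size; x++)
                    sum += _flat[x + y * Size + z * Size * Size];   // manual offset
        return sum;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ArrayAccessBench>();
}
```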
answered Apr 08 '23 by Ben