Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Manual indexing over jagged arrays?

I'm at a performance bottleneck in my program where I need to access elements from an array millions of times in a tight loop.

I looked around and the general consensus seems to be that even though multidimensional arrays should be faster, their underlying implementation is inefficient so just use jagged arrays instead. I profiled it and sure enough, jagged arrays are 50% faster. Fine.

However, I also tried just doing manual indexing (as in, simulating the behavior of a multidimensional array by just doing something like this: object value = array[i * 24 + j]; (where 24 is an array size) and accessing it through a single-dimension array with multiplication to simulate multidimensional arrays.

Surprisingly, this was also faster than a jagged array by around 15% for accesses (all I care about). This saddens me because for one, it seems silly that manually recreating multidimensional arrays is this much faster than C#'s built-in implementation and two, the math involved to get to indicies is uglier compared to just indexing with jagged/multidimensional arrays.

Is there anything I can do to recoup the speed benefits without having to use my own manual indexing? Surely there is some sort of optimization that can be set or checked to simulate this behavior? Why is the C# implementation of arrays so inefficient?

like image 762
John Smith Avatar asked Mar 12 '15 15:03

John Smith


1 Answers

Surprisingly, this was also faster than a jagged array by around 15% for accesses

This should come as no surprise at all, because indexing a jagged array requires an additional dereference. When you write a[i][j], computer must do the following:

  • Compute the location of nested array i inside the jagged array a
  • Fetch the location of the nested array a[i] (first dereference)
  • Compute the location of element j in a[i]
  • Fetch the value at location j of a[i] (second dereference)

When you fold 2D array in a vector, the computer does only one dereference:

  • Compute the location of the target element
  • Fetch the value from the array (the only dereference)

Essentially, you are trading a dereference for a multiplication; multiplication is cheaper.

In addition, you get continuity of elements in memory - something that you cannot guarantee with a jagged array. This becomes important for code that is sensitive to cache hits.

Is there anything I can do to recoup the speed benefits without having to use my own manual indexing?

Using your indexing scheme is a way to go. You can hide it from viewers of your code by making a class, say, Matrix2D exposing an operator [] that takes two indexes and produces the value. This way the code that computes the offset would be hidden from readers of your program, because the a[i * 24 + j] part would look like a[i, j]

like image 123
Sergey Kalinichenko Avatar answered Sep 30 '22 02:09

Sergey Kalinichenko