Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are multi-dimensional arrays in .NET slower than normal arrays?

Edit: I apologize everybody. I used the term "jagged array" when I actually meant to say "multi-dimensional array" (as can be seen in my example below). I apologize for using the incorrect name. I actually found jagged arrays to be faster than multi-dimensional ones! I have added my measurements for jagged arrays.

I was trying to use a jagged multi-dimensional array today, when I noticed that it's performance is not as I would have expected. Using a single-dimensional array and manually calculating indices was much faster (almost two times) than using a 2D array. I wrote a test using 1024*1024 arrays (initialized to random values), for 1000 iterations, and I got the following results on my machine:

sum(double[], int): 2738 ms (100%) sum(double[,]):     5019 ms (183%) sum(double[][]):    2540 ms ( 93%) 

This is my test code:

public static double sum(double[] d, int l1) {     // assuming the array is rectangular     double sum = 0;     int l2 = d.Length / l1;     for (int i = 0; i < l1; ++i)         for (int j = 0; j < l2; ++j)             sum += d[i * l2 + j];     return sum; }  public static double sum(double[,] d) {     double sum = 0;     int l1 = d.GetLength(0);     int l2 = d.GetLength(1);     for (int i = 0; i < l1; ++i)         for (int j = 0; j < l2; ++j)             sum += d[i, j];     return sum; }  public static double sum(double[][] d) {     double sum = 0;     for (int i = 0; i < d.Length; ++i)         for (int j = 0; j < d[i].Length; ++j)             sum += d[i][j];     return sum; }  public static void Main() {     Random random = new Random();     const int l1  = 1024, l2 = 1024;     double[ ] d1  = new double[l1 * l2];     double[,] d2  = new double[l1 , l2];     double[][] d3 = new double[l1][];      for (int i = 0; i < l1; ++i) {         d3[i] = new double[l2];         for (int j = 0; j < l2; ++j)             d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();     }     //     const int iterations = 1000;     TestTime(sum, d1, l1, iterations);     TestTime(sum, d2, iterations);     TestTime(sum, d3, iterations); } 

Further investigation showed that the IL for the second method is 23% larger than that of the first method. (Code size 68 vs 52.) This is mostly due to calls to System.Array::GetLength(int). The compiler also emits calls to Array::Get for the jagged multi-dimensional array, whereas it simply calls ldelem for the simple array.

So I am wondering, why is access through multi-dimensional arrays slower than normal arrays? I would have assumed the compiler (or JIT) would do something similar to what I did in my first method, but this was not actually the case.

Could you plese help me understand why this is happening the way it is?


Update: Following Henk Holterman's suggestion, here is the implementation of TestTime:

public static void TestTime<T, TR>(Func<T, TR> action, T obj,                                    int iterations) {     Stopwatch stopwatch = Stopwatch.StartNew();     for (int i = 0; i < iterations; ++i)         action(obj);     Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed); }  public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1,                                         T2 obj2, int iterations) {     Stopwatch stopwatch = Stopwatch.StartNew();     for (int i = 0; i < iterations; ++i)         action(obj1, obj2);     Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed); } 
like image 618
Hosam Aly Avatar asked Jan 22 '09 11:01

Hosam Aly


People also ask

What are the differences between a multidimensional array and an array of arrays in C#?

In a multidimensional array, each element in each dimension has the same, fixed size as the other elements in that dimension. In a jagged array, which is an array of arrays, each inner array can be of a different size. By only using the space that's needed for a given array, no space is wasted.

Why is 1D array faster than 2-D array?

Speed: The 1D array may be faster than the 2D array because the memory for the 2D array would not be contiguous, so cache misses would become a problem.


2 Answers

Single dimensional arrays with a lower bound of 0 are a different type to either multi-dimensional or non-0 lower bound arrays within IL (vector vs array IIRC). vector is simpler to work with - to get to element x, you just do pointer + size * x. For an array, you have to do pointer + size * (x-lower bound) for a single dimensional array, and yet more arithmetic for each dimension you add.

Basically the CLR is optimised for the vastly more common case.

like image 78
Jon Skeet Avatar answered Oct 21 '22 09:10

Jon Skeet


Array bounds checking?

The single-dimension array has a length member that you access directly - when compiled this is just a memory read.

The multidimensional array requires a GetLength(int dimension) method call that processes the argument to get the relevant length for that dimension. That doesn't compile down to a memory read, so you get a method call, etc.

In addition that GetLength(int dimension) will do a bounds check on the parameter.

like image 45
JeeBee Avatar answered Oct 21 '22 09:10

JeeBee