Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

performance test between two directions of a 2D array

This code (A) executes much faster (10 times) then the second one:

for(int w=0; w<width; w++) {
        for(int h=1; h<height; h++) {
            image[h][w] = (1-a)*image[h][w] + a*image[h-1][w];
        }
    }

Second one:

for(int h=0; h<height; h++) {
        for(int w=1; w<width; w++) {
            image[h][w] = (1-a)*image[h][w] + a*image[h][w-1];
        }
    }

Why is that? it is the same to go through all of the pixels in the image in either horizontal or vertical direction.

Is there a way to speed up the second one?

Thanks in advance.

like image 947
olidev Avatar asked Jan 19 '23 13:01

olidev


1 Answers

This has to do with locality of reference. If you access the elements in the same order as they are stored in memory, this will be much faster than accessing them in a strided pattern, as the memory caches and memory bandwidth will be utilized far more effectively.

The above would explain the second version being faster than the first one, and this is exactly what happens on my box:

aix@aix:~$ time ./ver1
real    0m29.421s

aix@aix:~$ time ./ver2
real    0m2.198s

Here is the code I use to allocate the array:

  double a = 0.5;
  int width = 2048;
  int height = 2048;
  double* data = new double[height * width];
  double** image = new double*[height];
  for (int i = 0; i < height; i++) {
    image[i] = data + i * width;
  }

Version 1 times the following loop:

  for (int iter = 0; iter < 100; iter++) {
    for(int w=0; w<width; w++) {
      for(int h=1; h<height; h++) {
        image[h][w] = (1-a)*image[h][w] + a*image[h-1][w];
      }
    }
  }

Version 2 loop:

  for (int iter = 0; iter < 100; iter++) {
    for(int h=0; h<height; h++) {
      for(int w=1; w<width; w++) {
        image[h][w] = (1-a)*image[h][w] + a*image[h][w-1];
      }
    }
  }

Compiled with g++ 4.4.3 with -O3 and run on a Xeon box of some description (64-bit Ubuntu).

If you are still 100% sure that you're seeing the opposite effect, there must be something fundamentally different about what you're doing compared to what I'm doing. It might help if you told us the dimensions of your image and how exactly it gets allocated (in order to help establish the memory layout).

like image 81
NPE Avatar answered Jan 31 '23 05:01

NPE