Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Manually optimize a nested loop

Tags:

c

optimization

I'm working on a homework assignment where I must manually optimize a nested loop (my program will be compiled with optimizations disabled). The goal of the assignment is to run the entire program in less than 6 seconds (extra credit for less than 4.5 seconds).

I'm only allowed to change a small block of code, and the starting point is such:

    for (j=0; j < ARRAY_SIZE; j++) {
        sum += array[j];
    }

Where ARRAY_SIZE is 9973. This loop is contained within another loop that is run 200,000 times. This particular version runs in 16 seconds.

What I've done so far is change the implementation to unroll the loop and use pointers as my iterator:

(These declarations are not looped over 200,000 times)

 register int unroll_length = 16;
 register int *unroll_end = array + (ARRAY_SIZE - (ARRAY_SIZE % unroll_length));
 register int *end = array + (ARRAY_SIZE -1);
 register int *curr_end;

curr_end = end;

while (unroll_end != curr_end) {
 sum += *curr_end;
 curr_end--;
}

do {
 sum += *curr_end + *(curr_end-1) + *(curr_end-2) + *(curr_end-3) +
  *(curr_end-4) + *(curr_end-5) + *(curr_end-6) + *(curr_end-7) +
  *(curr_end-8) + *(curr_end-9) + *(curr_end-10) + *(curr_end-11) +
  *(curr_end-12) + *(curr_end-13) + *(curr_end-14) + *(curr_end-15);
}
while ((curr_end -=  unroll_length) != array);

sum += *curr_end;

Using these techniques, I was able to get the execution down to 5.5 seconds, which will give me full credit. However; I sure do want to earn the extra credit, but I'm also curious what additional optimizations I can make that I might be overlooking?

Edit #1 (Adding outer loop)

 srand(time(NULL));
 for(j = 0; j < ARRAY_SIZE; j++) {
  x = rand() / (int)(((unsigned)RAND_MAX + 1) / 14);
  array[j] = x;
  checksum += x;
 }

 for (i = 0; i < N_TIMES; i++) {

  // inner loop goes here

  if (sum != checksum)
   printf("Checksum error!\n");

  sum = 0;

 }
like image 797
JMP Avatar asked May 10 '11 15:05

JMP


People also ask

How do you optimize a nested loop?

To optimize nested loops it is necessary to consider both the inner loop and the outer loop performance, especially when the inner loop count is small for execution of each outer loop. In many typical DSP applications, loops comprise a majority of the number of cycles, or MIPS.

How do you speed up a loop in C++?

For the access to a an element you will have to do the index calculation yourself. But as one of the indices (j) is constant over the loop, you can compute the start element's index before the loop and inside just increment the index 1. That might get you some significant improvement.


1 Answers

you could try to store your variables in CPU register with :

register int *unroll_limit = array + (ARRAY_SIZE - (ARRAY_SIZE % 10));
register int *end = array + ARRAY_SIZE;
register int *curr;

and try with different size of manual loops to check when you maximize cache usage.

like image 147
Cédric Julien Avatar answered Sep 24 '22 23:09

Cédric Julien