I have been tasked with optimizing a particular for loop in C. Here is the loop:
#define ARRAY_SIZE 10000
#define N_TIMES 600000
for (i = 0; i < N_TIMES; i++)
{
int j;
for (j = 0; j < ARRAY_SIZE; j++)
{
sum += array[j];
}
}
I'm supposed to use loop unrolling, loop splitting, and pointers in order to speed it up, but every time I try to implement something, the program doesn't return. Here's what I've tried so far:
for (i = 0; i < N_TIMES; i++)
{
int j,k;
for (j = 0; j < ARRAY_SIZE; j++)
{
for (k = 0; k < 100; k += 2)
{
sum += array[k];
sum += array[k + 1];
}
}
}
I don't understand why the program doesn't even return now. Any help would be appreciated.
That second piece of code is both inefficient and wrong, since it adds values more than the original code.
The loop unrolling (or lessening in this case since you probably don't want to unroll a ten-thousand-iteration loop) would be:
// Ensure ARRAY_SIZE is a multiple of two before trying this.
for (int i = 0; i < N_TIMES; i++)
for (int j = 0; j < ARRAY_SIZE; j += 2)
sum += array[j] + array[j+1];
But, to be honest, the days of dumb compilers has long since gone. You should generally leave this level of micro-optimisation up to your compiler, while you concentrate on the more high-level stuff like data structures, algorithms and human analysis.
That last one is rather important. Since you're adding the same array to an accumulated sum a constant number of times, you only really need the sum of the array once, then you can add that partial sum as many times as you want:
int temp = 0;
for (int i = 0; i < ARRAY_SIZE; i++)
temp += array[i];
sum += temp * N_TIMES;
It's still O(n)
but with a much lower multiplier on the n
(one rather than six hundred thousand). It may be that gcc's insane optimisation level of -O3
could work that out but I doubt it. The human brain can still outdo computers in a lot of areas.
For now, anyway :-)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With