Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimization of C code

Tags:

For an assignment of a course called High Performance Computing, I required to optimize the following code fragment:

int foobar(int a, int b, int N) {     int i, j, k, x, y;     x = 0;     y = 0;     k = 256;     for (i = 0; i <= N; i++) {         for (j = i + 1; j <= N; j++) {             x = x + 4*(2*i+j)*(i+2*k);             if (i > j){                y = y + 8*(i-j);             }else{                y = y + 8*(j-i);             }         }     }     return x; } 

Using some recommendations, I managed to optimize the code (or at least I think so), such as:

  1. Constant Propagation
  2. Algebraic Simplification
  3. Copy Propagation
  4. Common Subexpression Elimination
  5. Dead Code Elimination
  6. Loop Invariant Removal
  7. bitwise shifts instead of multiplication as they are less expensive.

Here's my code:

int foobar(int a, int b, int N) {      int i, j, x, y, t;     x = 0;     y = 0;     for (i = 0; i <= N; i++) {         t = i + 512;         for (j = i + 1; j <= N; j++) {             x = x + ((i<<3) + (j<<2))*t;         }     }     return x; } 

According to my instructor, a well optimized code instructions should have fewer or less costly instructions in assembly language level.And therefore must be run, the instructions in less time than the original code, ie calculations are made with::

execution time = instruction count * cycles per instruction

When I generate assembly code using the command: gcc -o code_opt.s -S foobar.c,

the generated code has many more lines than the original despite having made ​​some optimizations, and run-time is lower, but not as much as in the original code. What am I doing wrong?

Do not paste the assembly code as both are very extensive. So I'm calling the function "foobar" in the main and I am measuring the execution time using the time command in linux

int main () {     int a,b,N;      scanf ("%d %d %d",&a,&b,&N);     printf ("%d\n",foobar (a,b,N));     return 0; } 
like image 836
franvergara66 Avatar asked Nov 25 '12 21:11

franvergara66


People also ask

What is optimization in embedded C?

Embedded C - Optimization techniques. Opens in a new window Opens an external website Opens an external website in a new window. This website stores data such as cookies to enable essential site functionality, as well as marketing, personalization, and analytics.

What is optimizing the code?

We say that code optimization is writing or rewriting code so a program uses the least possible memory or disk space, minimizes its CPU time or network bandwidth, or makes the best use of additional cores. In practice, we sometimes default to another definition: Writing less code.

How do compilers optimize code?

Compiler optimization is generally implemented using a sequence of optimizing transformations, algorithms which take a program and transform it to produce a semantically equivalent output program that uses fewer resources or executes faster.


1 Answers

Initially:

for (i = 0; i <= N; i++) {     for (j = i + 1; j <= N; j++) {         x = x + 4*(2*i+j)*(i+2*k);         if (i > j){            y = y + 8*(i-j);         }else{            y = y + 8*(j-i);         }     } } 

Removing y calculations:

for (i = 0; i <= N; i++) {     for (j = i + 1; j <= N; j++) {         x = x + 4*(2*i+j)*(i+2*k);     } } 

Splitting i, j, k:

for (i = 0; i <= N; i++) {     for (j = i + 1; j <= N; j++) {         x = x + 8*i*i + 16*i*k ;                // multiple of  1  (no j)         x = x + (4*i + 8*k)*j ;                 // multiple of  j     } } 

Moving them externally (and removing the loop that runs N-i times):

for (i = 0; i <= N; i++) {     x = x + (8*i*i + 16*i*k) * (N-i) ;     x = x + (4*i + 8*k) * ((N*N+N)/2 - (i*i+i)/2) ; } 

Rewritting:

for (i = 0; i <= N; i++) {     x = x +         ( 8*k*(N*N+N)/2 ) ;     x = x +   i   * ( 16*k*N + 4*(N*N+N)/2 + 8*k*(-1/2) ) ;     x = x +  i*i  * ( 8*N + 16*k*(-1) + 4*(-1/2) + 8*k*(-1/2) );     x = x + i*i*i * ( 8*(-1) + 4*(-1/2) ) ; } 

Rewritting - recalculating:

for (i = 0; i <= N; i++) {     x = x + 4*k*(N*N+N) ;                            // multiple of 1     x = x +   i   * ( 16*k*N + 2*(N*N+N) - 4*k ) ;   // multiple of i     x = x +  i*i  * ( 8*N - 20*k - 2 ) ;             // multiple of i^2     x = x + i*i*i * ( -10 ) ;                        // multiple of i^3 } 

Another move to external (and removal of the i loop):

x = x + ( 4*k*(N*N+N) )              * (N+1) ; x = x + ( 16*k*N + 2*(N*N+N) - 4*k ) * ((N*(N+1))/2) ; x = x + ( 8*N - 20*k - 2 )           * ((N*(N+1)*(2*N+1))/6); x = x + (-10)                        * ((N*N*(N+1)*(N+1))/4) ; 

Both the above loop removals use the summation formulas:

Sum(1, i = 0..n) = n+1
Sum(i1, i = 0..n) = n(n + 1)/2
Sum(i2, i = 0..n) = n(n + 1)(2n + 1)/6
Sum(i3, i = 0..n) = n2(n + 1)2/4

like image 58
ypercubeᵀᴹ Avatar answered Sep 28 '22 22:09

ypercubeᵀᴹ