Optimization of C code

For an assignment in a course called High Performance Computing, I am required to optimize the following code fragment:

int foobar(int a, int b, int N) {
    int i, j, k, x, y;
    x = 0;
    y = 0;
    k = 256;
    for (i = 0; i <= N; i++) {
        for (j = i + 1; j <= N; j++) {
            x = x + 4*(2*i+j)*(i+2*k);
            if (i > j){
               y = y + 8*(i-j);
            }else{
               y = y + 8*(j-i);
            }
        }
    }
    return x;
}

Following some recommendations, I managed to optimize the code (or at least I think so) using techniques such as:

Constant Propagation
Algebraic Simplification
Copy Propagation
Common Subexpression Elimination
Dead Code Elimination
Loop Invariant Removal
Bitwise shifts instead of multiplications, as they are less expensive

Here's my code:

int foobar(int a, int b, int N) {
    int i, j, x, y, t;
    x = 0;
    y = 0;
    for (i = 0; i <= N; i++) {
        t = i + 512;
        for (j = i + 1; j <= N; j++) {
            x = x + ((i<<3) + (j<<2))*t;
        }
    }
    return x;
}
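Before comparing speed, it may be worth confirming that the rewritten function still returns the same value as the original. The following is a minimal sketch of such a check, not part of the assignment. The names `foobar_ref`, `foobar_shift` and `versions_agree` are hypothetical, the unused parameters `a` and `b` are dropped, and the arithmetic is widened to `long long` (an assumption made here so that moderate values of N do not overflow).

```c
/* Original loop from the question; y is omitted because it never
   reaches the return value. Widened to long long (assumption). */
long long foobar_ref(int N) {
    long long x = 0;
    const long long k = 256;
    for (long long i = 0; i <= N; i++)
        for (long long j = i + 1; j <= N; j++)
            x += 4 * (2 * i + j) * (i + 2 * k);
    return x;
}

/* Hand-optimized version from the question, also in long long. */
long long foobar_shift(int N) {
    long long x = 0;
    for (long long i = 0; i <= N; i++) {
        long long t = i + 512;              /* i + 2*k with k = 256 */
        for (long long j = i + 1; j <= N; j++)
            x += ((i << 3) + (j << 2)) * t; /* 8*i + 4*j = 4*(2*i + j) */
    }
    return x;
}

/* Returns 1 if both versions agree for every N in [0, max_n], else 0. */
int versions_agree(int max_n) {
    for (int n = 0; n <= max_n; n++)
        if (foobar_ref(n) != foobar_shift(n))
            return 0;
    return 1;
}
```

Calling `versions_agree(60)` from a small `main` and printing the result is enough to catch an algebra slip in the rewrite.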

According to my instructor, well-optimized code should have fewer, or less costly, instructions at the assembly-language level, and should therefore run in less time than the original code. That is, the calculation is made with:

execution time = instruction count * cycles per instruction

When I generate assembly code with the command gcc -o code_opt.s -S foobar.c, the generated code has many more lines than the original's despite my optimizations, and the run time is lower, but not by much compared with the original code. What am I doing wrong?

I am not pasting the assembly code because both versions are very long. I call the function foobar from main and measure the execution time with the Linux time command:

#include <stdio.h>

int main () {
    int a,b,N;

    scanf ("%d %d %d",&a,&b,&N);
    printf ("%d\n",foobar (a,b,N));
    return 0;
}

asked Nov 25 '12 21:11

franvergara66

1 Answer

Initially:

for (i = 0; i <= N; i++) {
    for (j = i + 1; j <= N; j++) {
        x = x + 4*(2*i+j)*(i+2*k);
        if (i > j){
           y = y + 8*(i-j);
        }else{
           y = y + 8*(j-i);
        }
    }
}

Removing the y calculations (y is never returned or used in x, so they are dead code):

for (i = 0; i <= N; i++) {
    for (j = i + 1; j <= N; j++) {
        x = x + 4*(2*i+j)*(i+2*k);
    }
}

Expanding the product and splitting it into the terms that do not depend on j and the terms that are multiples of j:

for (i = 0; i <= N; i++) {
    for (j = i + 1; j <= N; j++) {
        x = x + 8*i*i + 16*i*k ;                // multiple of  1  (no j)
        x = x + (4*i + 8*k)*j ;                 // multiple of  j
    }
}

Moving them out of the inner loop (and removing that loop, which runs N-i times):

for (i = 0; i <= N; i++) {
    x = x + (8*i*i + 16*i*k) * (N-i) ;
    x = x + (4*i + 8*k) * ((N*N+N)/2 - (i*i+i)/2) ;
}
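The single-loop form above can be compared against the original double loop. This is only a sketch under stated assumptions: `foobar_ref` and `foobar_single_loop` are hypothetical names, and the arithmetic uses `long long` rather than the `int` of the question so that moderate N does not overflow. Both divisions by 2 are exact, because N*N+N and i*i+i are always even.

```c
/* Reference: the double loop from the question (y removed), in long long. */
long long foobar_ref(int N) {
    long long x = 0;
    const long long k = 256;
    for (long long i = 0; i <= N; i++)
        for (long long j = i + 1; j <= N; j++)
            x += 4 * (2 * i + j) * (i + 2 * k);
    return x;
}

/* Single-loop form: the inner j loop is replaced by its closed-form sum. */
long long foobar_single_loop(int N) {
    const long long k = 256, n = N;
    long long x = 0;
    for (long long i = 0; i <= n; i++) {
        /* terms without j, multiplied by the number of j values (N - i) */
        x += (8 * i * i + 16 * i * k) * (n - i);
        /* coefficient of j, multiplied by the sum of j over i+1..N */
        x += (4 * i + 8 * k) * ((n * n + n) / 2 - (i * i + i) / 2);
    }
    return x;
}
```

This version does O(N) work instead of O(N²), which is where most of the gain over the shift-based rewrite would come from.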

Rewriting (grouping the terms by powers of i):

// Note: (-1/2) below is the fraction -0.5, not C integer division (which would give 0)
for (i = 0; i <= N; i++) {
    x = x +         ( 8*k*(N*N+N)/2 ) ;
    x = x +   i   * ( 16*k*N + 4*(N*N+N)/2 + 8*k*(-1/2) ) ;
    x = x +  i*i  * ( 8*N + 16*k*(-1) + 4*(-1/2) + 8*k*(-1/2) );
    x = x + i*i*i * ( 8*(-1) + 4*(-1/2) ) ;
}

Rewriting again, with the coefficients simplified:

for (i = 0; i <= N; i++) {
    x = x + 4*k*(N*N+N) ;                            // multiple of 1
    x = x +   i   * ( 16*k*N + 2*(N*N+N) - 4*k ) ;   // multiple of i
    x = x +  i*i  * ( 8*N - 20*k - 2 ) ;             // multiple of i^2
    x = x + i*i*i * ( -10 ) ;                        // multiple of i^3
}

Moving these out of the outer loop as well (and removing the i loop):

x = x + ( 4*k*(N*N+N) )              * (N+1) ;
x = x + ( 16*k*N + 2*(N*N+N) - 4*k ) * ((N*(N+1))/2) ;
x = x + ( 8*N - 20*k - 2 )           * ((N*(N+1)*(2*N+1))/6);
x = x + (-10)                        * ((N*N*(N+1)*(N+1))/4) ;
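The loop-free result can be wrapped in a function and checked against the original loops. The sketch below makes a few assumptions: the names `foobar_ref` and `foobar_closed` are hypothetical, the arithmetic is widened to `long long`, and a guard for negative N is added because the original loops simply do not run in that case while the polynomial does not evaluate to zero there.

```c
/* Reference: the double loop from the question (y removed), in long long. */
long long foobar_ref(int N) {
    long long x = 0;
    const long long k = 256;
    for (long long i = 0; i <= N; i++)
        for (long long j = i + 1; j <= N; j++)
            x += 4 * (2 * i + j) * (i + 2 * k);
    return x;
}

/* Closed form: both loops replaced by the power-sum formulas. */
long long foobar_closed(int N) {
    if (N < 0)
        return 0;  /* added guard: the loops never execute for negative N */
    const long long k = 256, n = N;
    const long long s0 = n + 1;                          /* sum of 1   */
    const long long s1 = n * (n + 1) / 2;                /* sum of i   */
    const long long s2 = n * (n + 1) * (2 * n + 1) / 6;  /* sum of i^2 */
    const long long s3 = n * n * (n + 1) * (n + 1) / 4;  /* sum of i^3 */
    return 4 * k * (n * n + n) * s0
         + (16 * k * n + 2 * (n * n + n) - 4 * k) * s1
         + (8 * n - 20 * k - 2) * s2
         - 10 * s3;
}
```

The integer divisions by 2, 6 and 4 are exact for every non-negative n, so no rounding is introduced. With the `int` type of the original, overflow behaviour for large N may differ between the loop and the closed form.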

Both the above loop removals use the summation formulas:

Sum(1, i = 0..n) = n+1
Sum(i¹, i = 0..n) = n(n + 1)/2
Sum(i², i = 0..n) = n(n + 1)(2n + 1)/6
Sum(i³, i = 0..n) = n²(n + 1)²/4
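These four identities can be spot-checked by brute force. The following is a minimal sketch, with the hypothetical helper names `power_sum_loop` and `power_sum_formula`; it covers only the exponents 0 to 3 used above.

```c
/* Brute-force sum of i^p for i = 0..n (i^0 is taken as 1, including for i = 0). */
long long power_sum_loop(int n, int p) {
    long long total = 0;
    for (long long i = 0; i <= n; i++) {
        long long term = 1;
        for (int e = 0; e < p; e++)
            term *= i;
        total += term;
    }
    return total;
}

/* Closed forms from the list above; returns -1 for an unsupported exponent. */
long long power_sum_formula(int n, int p) {
    const long long m = n;
    switch (p) {
        case 0:  return m + 1;
        case 1:  return m * (m + 1) / 2;
        case 2:  return m * (m + 1) * (2 * m + 1) / 6;
        case 3:  return m * m * (m + 1) * (m + 1) / 4;
        default: return -1;
    }
}
```

Comparing the two functions for a range of n and p = 0..3 gives some confidence that the formulas were copied correctly before they are substituted into the loop.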

answered Sep 28 '22 22:09

ypercubeᵀᴹ
