I have a critical loop in my code with this shape :
int myloop(int a, .....){
/* some stuff */
// Critical loop
while(...){
/* Some Stuff */
if(a == 1){
// .....
}
else if(a == 2){
// .....
}
else if(a == 3){
// .....
}
else{
// ....
}
}
}
As the loop never touches the value of "a" the branch taken will never change, but as this loop is really heavy it will require to test the value of "a" many times, which is totally unnecessary. The best thing is probably to duplicate the loop, so that the "if" can be tested before the loop begins, but this would mean copying a lot of stuff common to both situations and will result in a very ugly code...
Is there any way to ask GCC/G++ to duplicate this code when it compiles it ? Or any other trick to avoid testing the value so many times ?
Thank you for your help !
Nathann
First of all, you can use a switch
statement here:
switch(a) {
case 0:
// handle a==0
break;
case 1:
// handle a==1
break;
default:
// handle all other cases
}
This may enable the compiler to produce quicker code, i.e. do a single computed jump rather than multiple checks against a
.
this would mean copying a lot of stuff common to both situations
Refactor! What about putting the shared code into a separate function, probably declare it inline
, and hope that the compiler follows the hint? Function inlining is a good way to let the compiler do code duplication (the other two ways being templates and the preprocessor, both are clearly inappropriate here).
inline void sharedStuff() {...}
int myloop(int a, .....){
/* some stuff */
if (a==1) {
while(...){
// code specific to a==1
// do shared stuff
sharedStuff();
}
}
else if ...
}
Of course it depends on what you do in your loop, but you should get the basic principle.
Last but not least: profile. Check if the loop really is the performance bottleneck. Have a look at the generated machine code. Chances are good that the compiler does already use most of the proposed optimizations.
One common way of doing this is as follows:
// make function implementation inline...
static inline int myloop_inline(int a, .....){
/* some stuff */
// Critical loop
while(...){
/* Some Stuff */
if(a == 1){
// .....
}
else if(a == 2){
// .....
}
else if(a == 3){
// .....
}
else{
// ....
}
}
}
// wrapper function which calls myloop_inline with hard-coded values of a
int myloop(int a, .....){
{
switch (a)
{
// expand inline function for all performance-critical values of a...
case 1:
myloop_inline(1);
break;
case 2:
myloop_inline(2);
break;
case 3:
myloop_inline(3);
break;
...
// for any values of a for which the function is not performance-critical
// we can just use a default case...
default:
myloop_inline(a);
break;
}
}
Note that because a
is passed as a literal constant when myloop_inline()
is called form myloop()
the compiler can optimise away all irrelevant tests and dead code when it expands the inline function.
You may want to take steps to ensure that myloop_inline()
actually gets inlined, this includes compiling with optimisation enabled of course (e.g. -O3
), and in the case of e.g. gcc you might want to add __attribute__ ((always_inline))
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With