I want to optimize how a CFD code calls different functions that the user selects at runtime through a config file read by the program.
I came up with a minimal working example with two functions that each take one input: one squares it and the other cubes it. The user chooses which function to use via a command-line option. The code then squares/cubes a bunch of numbers in a for loop (it approximates the integral from 0 to 1 of either x^2 or x^3, depending on which function was chosen) and prints the result. The first variant is just a switch statement inside the for loop (case1). The second thing I tried was a function pointer, which is set before the loop (case2). The third thing I did was to compile only the function the user intends to use, selected by defining SQUARE or CUBE for the preprocessor (case3).
case1:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

double f_square(double x) { return x * x; }
double f_cube(double x) { return x * x * x; }

int main(int argc, char *argv[])
{
    double x;
    double sum = 0;
    double del_x = 4e-10;

    /* Guard against a missing argument before reading argv[1]. */
    if (argc < 2) {
        printf("Missing choice (2 or 3)! Abort\n");
        exit(1);
    }

    printf("Speed test -- no optimisation\n");
    clock_t startClock = clock();
    for (x = 0; x < 1; x += del_x) {
        /* The choice is tested again on every iteration. */
        switch (argv[1][0]) {
        case '2':
            sum += f_square(x) * del_x;
            break;
        case '3':
            sum += f_cube(x) * del_x;
            break;
        default:
            printf("Invalid choice! Abort\n");
            exit(1);
        }
    }
    clock_t endClock = clock();
    printf("Int_{0}^{1} x^%c: %.8g\n", argv[1][0], sum);
    printf("Execution time: %.6f\n", (endClock - startClock) / (double)CLOCKS_PER_SEC);
}
case2:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

double f_square(double x) { return x * x; }
double f_cube(double x) { return x * x * x; }

int main(int argc, char *argv[])
{
    double x;
    double sum = 0;
    double del_x = 4e-10;
    double (*f)(double);

    /* Guard against a missing argument before reading argv[1]. */
    if (argc < 2) {
        printf("Missing choice (2 or 3)! Abort\n");
        exit(1);
    }

    printf("Speed test -- function pointers\n");
    /* Select the function once, before the loop. */
    switch (argv[1][0]) {
    case '2':
        f = &f_square;
        break;
    case '3':
        f = &f_cube;
        break;
    default:
        printf("Invalid choice! Abort\n");
        exit(1);
    }
    clock_t startClock = clock();
    for (x = 0; x < 1; x += del_x) {
        /* Indirect call through the pointer on every iteration. */
        sum += f(x) * del_x;
    }
    clock_t endClock = clock();
    printf("Int_{0}^{1} x^%c: %.8g\n", argv[1][0], sum);
    printf("Execution time: %.6f\n", (endClock - startClock) / (double)CLOCKS_PER_SEC);
}
case3:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Exactly one of SQUARE or CUBE must be defined at compile time. */
#if defined(SQUARE) == defined(CUBE)
#error "Define exactly one of SQUARE or CUBE"
#endif

#ifdef SQUARE
double f(double x) { return x * x; }
#endif
#ifdef CUBE
double f(double x) { return x * x * x; }
#endif

int main(void)
{
    double x;
    double sum = 0;
    double del_x = 4e-10;

    printf("Speed test -- selective compilation\n");
    clock_t startClock = clock();
    for (x = 0; x < 1; x += del_x) {
        /* Direct call to the only f that was compiled in. */
        sum += f(x) * del_x;
    }
    clock_t endClock = clock();
#ifdef SQUARE
    printf("Int_{0}^{1} x^2: %.8g\n", sum);
#endif
#ifdef CUBE
    printf("Int_{0}^{1} x^3: %.8g\n", sum);
#endif
    printf("Execution time: %.6f\n", (endClock - startClock) / (double)CLOCKS_PER_SEC);
}
When measuring the execution times I found something odd.
[Images comparing the execution times are not available.]
This confuses me, and I would like to know how I can use function pointers without losing performance, because I really want to use them for flexibility reasons.
=> Why are function pointers so slow?
I want to add that I am not a software engineer but rather an aerospace engineering student and sadly we do not get a lot of programming lessons, so every little detail might be helpful.
Here is a disassembly view of two implementations of similar functions: https://c.godbolt.org/z/l24Zhl
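Note that with -O2, the compiler inlines the calls to f_cube and f_square in the first method (there are no calls to these functions in the assembly), but it does not do so in the second version, where every iteration makes an indirect call through the pointer.
Most likely, the first version is then further sped up by branch prediction on the processor: the switch tests the same value on every iteration, so the branch is predicted correctly almost every time.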
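If you want to keep a function pointer interface, one possible approach is to move the whole loop into a small helper that takes the pointer, and to do the dispatch once outside of it. The following is only a sketch under assumptions: the names integrate and integrate_by_choice are made up for this example and do not appear in the question, and whether the compiler actually specialises the loop for each constant pointer depends on the compiler and the optimisation level, so it is worth checking the disassembly as above.

```c
#include <assert.h>
#include <stddef.h>

static inline double f_square(double x) { return x * x; }
static inline double f_cube(double x) { return x * x * x; }

/* Hypothetical helper: left Riemann sum of f over [0, 1).
 * The function is still passed as a pointer, but because this helper is
 * 'static inline' and every call site below passes a constant pointer,
 * the optimiser is given the chance to inline f into the loop. */
static inline double integrate(double (*f)(double), double del_x)
{
    assert(f != NULL && del_x > 0.0);
    double sum = 0.0;
    for (double x = 0.0; x < 1.0; x += del_x)
        sum += f(x) * del_x;
    return sum;
}

/* Hypothetical dispatcher: the choice is evaluated once, not per iteration.
 * Returns 0 on success and -1 for an invalid choice. */
int integrate_by_choice(char choice, double del_x, double *result)
{
    switch (choice) {
    case '2':
        *result = integrate(f_square, del_x);
        return 0;
    case '3':
        *result = integrate(f_cube, del_x);
        return 0;
    default:
        return -1;
    }
}
```

From main, this could be called as integrate_by_choice(argv[1][0], del_x, &sum) after checking argc, with the timing code wrapped around that single call. The trade-off is that each selectable function needs its own case in the dispatcher, which gives up some of the flexibility of a bare pointer set from the config file.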
Have you profiled your code and found that this area is a bottleneck? Remember that you make the greatest speed gains by optimizing the most-used code first. Remember: first make it work, then make it fast.