A common way of implementing OO-like code encapsulation and polymorphism in C is to return opaque pointers to a structure containing some function pointers. This is a very frequent pattern for example in the Linux kernel.
Using function pointers instead of function calls introduces an overhead which is mostly negligible due to caching, as has already been discussed in other questions.
However, with the new -fwhole-program and -flto optimization options for GCC (>4.6), things change.
#include <stdlib.h>
#include "libPointers.h"
void do_work(struct worker *wrk, const int i)
{
wrk->datum += i;
}
struct worker *libPointers_init(const int startDatum)
{
struct worker *wrk = malloc (sizeof (struct worker));
*wrk = (struct worker) {
.do_work = do_work,
.datum = startDatum
};
return wrk;
}
#ifndef __LIBPOINTERS_H__
#define __LIBPOINTERS_H__
struct worker {
int datum;
void (*do_work)(struct worker *, int i);
};
extern void do_work (struct worker *elab, const int i);
struct worker *libPointers_init(const int startDatum);
#endif //__LIBPOINTERS_H__
#include <stdio.h>
#include "libPointers.h"
int main (void)
{
unsigned long i;
struct worker *wrk;
wrk = libPointers_init(56);
for (i = 0; i < 1e10; i++) {
#ifdef USE_POINTERS
wrk->do_work(wrk,i);
#else
do_work(wrk,i);
#endif
}
printf ("%d\n", wrk->datum);
}
Compiling with -O3, but without -flto -fwhole-program flags, testPointers execution takes around 25s on my machine, regardless whether USE_POINTERS is #defined or not.
If I turn on the -flto -fwhole-program flags, testPointers takes around 25s with USE_POINTERS #defined, but around 14s if a function call is used.
This is completely expected behavior, since I understand that the compiler will inline and optimize the function in the loop. I wonder, however, if there's a way of helping the compiler telling it that the function pointer is constant and so allowing it to optimize that case, too.
For those using cmake, here's how I compiled it
set (CMAKE_C_FLAGS "-O3 -fwhole-program -flto")
#set (CMAKE_C_FLAGS "-O3")
add_executable(testPointers
libPointers.c
testPointers.c
)
The compiler can't inline a function unless it can determine that only one possible version of the function will be called. By calling through a pointer it's not trivially obvious that this is the case. It still might be possible for the compiler to figure it out, since if you follow the code there's only one possible value that the pointer could take; however this would be above and beyond what I'd expect the compiler to do.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With