The availability of some platform-specific features, such as SSE or AVX, can be determined at runtime, which is very useful if you do not want to compile and ship different objects for the different features.
The following code, for example, lets me check for AVX; it compiles with GCC, which provides the cpuid.h header:
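#include <stdbool.h>
#include <cpuid.h>

/* Returns true if CPUID leaf 1 reports AVX support. */
bool has_avx(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & bit_AVX) != 0;
}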
Instead of littering the code with runtime checks like the one above, which are repeated on every call, are slow, and introduce branching (caching the result would reduce the overhead, but the branch would remain), I figured that I could use the infrastructure provided by the dynamic linker/loader.
Calls to functions with external linkage on ELF platforms are already indirect and go through the Procedure Linkage Table (PLT) and Global Offset Table (GOT).
Suppose there are two internal functions: a basic _do_something_basic that always works, and a somehow optimized version, _do_something_avx, which uses AVX. I could export a generic do_something symbol and alias it to the basic one:
static void _do_something_basic(…) {
// Basic implementation
}
static void _do_something_avx(…) {
// Optimized implementation using AVX
}
void do_something(…) __attribute__((alias("_do_something_basic")));
At load time of my library or program, I would like to check the availability of AVX once using has_avx and, depending on the result of the check, point the do_something symbol to _do_something_avx.
Even better would be to point the initial version of the do_something symbol to a self-modifying function that checks the availability of AVX using has_avx and replaces itself with _do_something_basic or _do_something_avx.
In theory this should be possible, but how can I find the location of the PLT/GOT programmatically? Is there an ABI/API provided by the ELF loader, e.g. ld-linux.so.2, that I could use for this? Do I need a linker script to obtain the PLT/GOT location? What about security considerations: can I even write to the PLT/GOT if I obtain a pointer to it?
Maybe some project has done this or something very similar already.
I'm fully aware that the solution would be highly platform-specific, but since I already have to deal with low-level platform-specific details, such as features of the instruction set, this is fine.
Global offset tables hold absolute addresses in private data. Addresses are therefore available without compromising the position-independence and shareability of a program's text.
The global offset table converts position-independent address calculations to absolute locations. Similarly the procedure linkage table converts position-independent function calls to absolute locations.
The Global Offset Table, or GOT, is a section of a computer program's (executables and shared libraries) memory used to enable computer program code compiled as an ELF file to run correctly, independent of the memory address where the program's code or data is loaded at runtime.
Before a function's address has been resolved, the GOT points to an entry in the Procedure Linkage Table (PLT). This is a small "stub" function which is responsible for calling the dynamic linker with (effectively) the name of the function that should be resolved.
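On the question of locating the GOT programmatically, here is a minimal sketch, assuming a dynamically linked ELF program on Linux whose toolchain provides <link.h>. The linker-defined _DYNAMIC array is the object's dynamic section, and its DT_PLTGOT entry holds an address associated with the PLT and/or GOT. The helper names find_dynamic_entry and pltgot_value are hypothetical. Whether the stored value has already been relocated to an absolute address depends on the loader, so treat it as a hint rather than a guaranteed pointer.

```c
#include <link.h>    /* ElfW(), ElfW(Dyn), DT_* constants (via <elf.h>) */
#include <stddef.h>

/* Linker-defined start of this object's dynamic section. Assumes dynamic
   linking; a fully static, non-PIE build may not define this symbol. */
extern ElfW(Dyn) _DYNAMIC[];

/* Hypothetical helper: return the first dynamic entry with the given tag,
   or NULL if the DT_NULL terminator is reached first. */
const ElfW(Dyn) *find_dynamic_entry(long tag)
{
    for (const ElfW(Dyn) *dyn = _DYNAMIC; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag)
            return dyn;
    }
    return NULL;
}

/* Hypothetical helper: raw DT_PLTGOT value, or 0 if the entry is absent.
   Depending on the loader this is an absolute address or an offset from
   the object's load address. */
unsigned long pltgot_value(void)
{
    const ElfW(Dyn) *entry = find_dynamic_entry(DT_PLTGOT);
    return entry ? (unsigned long)entry->d_un.d_ptr : 0UL;
}
```

As for writing to it: with full RELRO (linking with -z relro -z now) the loader makes the GOT read-only once relocation has finished, so a store through such a pointer would fault unless the page permissions are changed first, for example with mprotect.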
As others have suggested, you can go with platform-specific versions of the libraries. Or, if you are OK with sticking to Linux, you can use the (relatively) new IFUNC relocations, which do exactly what you want.
EDIT: As noted by Sebastian, IFUNCs seem to be supported by other platforms as well (FreeBSD, Android). Note, however, that the feature is not that widely used, so it may have some rough edges.
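A minimal sketch of the IFUNC approach, assuming GCC (or a compatible compiler) targeting Linux with glibc. The signature of do_something and the resolver name resolve_do_something are hypothetical, since the question elides the parameters, and both implementations are placeholders rather than real AVX code. The resolver uses GCC's __builtin_cpu_supports in place of the has_avx helper from the question.

```c
/* Hypothetical signature; the question leaves the parameters elided. */
typedef int do_something_fn(const int *values, int count);

/* Basic implementation: plain sum. */
static int do_something_basic(const int *values, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Stand-in for the optimized version; real code would use AVX here. */
static int do_something_avx(const int *values, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Resolver: the dynamic loader calls this once while relocating the symbol
   and binds do_something to whichever function it returns. */
static do_something_fn *resolve_do_something(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Resolvers run before constructors, so initialize the CPU feature
       data explicitly before querying it. */
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return do_something_avx;
#endif
    return do_something_basic;
}

/* Exported symbol; its address is chosen by the resolver at load time. */
int do_something(const int *values, int count)
    __attribute__((ifunc("resolve_do_something")));
```

Callers simply call do_something(values, count). Because the resolver runs very early, it should avoid calling into other shared libraries or relying on initialized global state.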
A simple way to do what you're asking for is to use your own function pointers instead of modifying those in the PLT.
For example:
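/* Public entry point: a pointer that initially targets the resolver below. */
extern void (*do_something)(...);

/* First call: pick an implementation, repoint do_something, then forward. */
void
_do_something(...) {
    if (has_avx()) {
        do_something = _do_something_avx;
    } else {
        do_something = _do_something_basic;
    }
    do_something(...);
}

void (*do_something)(...) = _do_something;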
While this is cumbersome if you have a lot of these functions, doing it this way doesn't require any special compiler or linker features. (Though if you need the functions to be thread safe on a platform where reading and writing pointers isn't atomic, you'll need to make them atomic somehow. This isn't a problem on x86 platforms, however.) If you do have a lot of these functions, macros or C++ templates can help keep the typing down.
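For reference, a self-contained sketch of the same idea with a concrete signature and a C11 atomic pointer, so the update is well defined even where plain pointer stores are not atomic. The signature, the do_something_ptr and do_something_init names, and the placeholder bodies are assumptions for illustration; has_avx is implemented here with GCC's __builtin_cpu_supports rather than cpuid.h.

```c
#include <stdatomic.h>

/* Hypothetical signature; the answer above leaves the parameters elided. */
typedef int do_something_fn(const int *values, int count);

static int do_something_basic(const int *values, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

/* Stand-in for the optimized version; real code would use AVX here. */
static int do_something_avx(const int *values, int count)
{
    int sum = 0;
    for (int i = 0; i < count; i++)
        sum += values[i];
    return sum;
}

static int has_avx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __builtin_cpu_supports("avx") != 0;
#else
    return 0;
#endif
}

static int do_something_init(const int *values, int count);

/* Starts out pointing at the one-time initializer. */
static _Atomic(do_something_fn *) do_something_ptr = do_something_init;

/* First call: choose an implementation, publish it, then forward the call.
   Racing threads may each run this once; they all store the same value. */
static int do_something_init(const int *values, int count)
{
    do_something_fn *impl = has_avx() ? do_something_avx : do_something_basic;
    atomic_store_explicit(&do_something_ptr, impl, memory_order_relaxed);
    return impl(values, count);
}

/* Public entry point: one atomic load plus an indirect call. */
int do_something(const int *values, int count)
{
    do_something_fn *impl =
        atomic_load_explicit(&do_something_ptr, memory_order_relaxed);
    return impl(values, count);
}
```

Relaxed ordering should be enough in this sketch because every value the pointer can hold refers to a function that is valid for the whole lifetime of the program.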