I'm developing for an embedded platform and I'm having a hard time working out how to link shared libraries dynamically. I'm using the bFLT file format and I don't have control over where the executable and the shared library are loaded.
My loader correctly loads the shared library and executable into memory and modifies the executable's GOT at run time to link to the shared library.
I can successfully take the address of the function and I know it's correct from disassembling the code at that location. However, if I try to call the function, the whole thing crashes.
It turns out that GCC adds a 'code veneer' when calling shared library functions: the call detours through the veneer instead of branching directly to the address of the function. The address that the veneer branches to isn't relocated properly, because it doesn't show up in the list of relocations in the executable binary.
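To make the failure mode concrete, here is a minimal C sketch of a list-driven relocation pass. It only shows why a word that is missing from the relocation list keeps its link-time value. The name `apply_relocs`, the flat array of offsets, and the "add the load base" rule are assumptions made for this example; this is not the actual loader, nor the exact bFLT relocation encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical relocation pass: for every offset in reloc_offsets, add
 * load_base to the 32-bit word stored at that offset in the image.
 * Words that are not listed are never touched.
 */
void apply_relocs(uint8_t *image, uint32_t load_base,
                  const uint32_t *reloc_offsets, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word;

        /* memcpy avoids unaligned and aliasing problems on byte buffers */
        memcpy(&word, image + reloc_offsets[i], sizeof word);
        word += load_base;
        memcpy(image + reloc_offsets[i], &word, sizeof word);
    }
}
```

If the veneer's literal word is not in the list, a pass like this skips it, and the veneer ends up branching to a stale address.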
The disassembly of the veneer looks like this:
000008d0 <__library_call_veneer>:
8d0: e51ff004 ldr pc, [pc, #-4] ; 8d4 <__library_call_veneer+0x4>
8d4: 03000320 .word 0x03000320 ; This address isn't correctly relocated!
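As a sanity check on the disassembly, the sketch below works out which word that `ldr` reads. In ARM state the pc reads as the address of the current instruction plus 8, so `[pc, #-4]` at 0x8d0 refers to 0x8d4, the `.word` directly after the instruction. The helper name `ldr_pc_literal_addr` is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: if insn is an ARM-state "ldr Rd, [pc, #+/-imm12]"
 * (immediate offset, no writeback), store the address of the word it
 * reads in *literal_addr and return true. Otherwise return false.
 */
bool ldr_pc_literal_addr(uint32_t insn_addr, uint32_t insn,
                         uint32_t *literal_addr)
{
    /* Bits 27-24 = 0101 (load/store, immediate offset, pre-indexed),
       B = 0, W = 0, L = 1, Rn = 1111 (pc). Bit 23 (U) is left free. */
    if ((insn & 0x0f7f0000u) != 0x051f0000u)
        return false;

    uint32_t imm12 = insn & 0xfffu;
    uint32_t pc = insn_addr + 8;          /* pc reads two instructions ahead */
    bool add = ((insn >> 23) & 1u) != 0;  /* U bit: 1 = add, 0 = subtract */

    *literal_addr = add ? pc + imm12 : pc - imm12;
    return true;
}
```

For the veneer above, `ldr_pc_literal_addr(0x8d0, 0xe51ff004, &addr)` stores 0x8d4 in `addr`, so that is the word a relocation entry would have to cover.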
If I take the address of the function and put it into a function pointer (thereby bypassing the 'code veneer') and call through it, the shared library works perfectly.
So for example:
/* GNU C: copy the function's address into a volatile pointer and call through it */
#define DIRECT_LIB_CALL(x, args...) do { \
    typeof(x) * volatile tmp = x; \
    tmp(args); \
} while (0)
DIRECT_LIB_CALL(library_call); /* works */
library_call(); /* crashes */
Is there a way either to tell GCC not to produce a code veneer and to branch directly to the address located in the GOT, or to make the address that the code veneer branches to show up in the list of relocations to perform?
I found a workaround to this problem. It's not the best or cleanest method but it does the job in my case.
I took advantage of the --wrap option in my linker, which redirects references to symbol to __wrap_symbol. With this, I set up an awk script that automatically generates ASM files that load a properly relocated address into the pc. Any library calls are redirected to this code. Basically, I made my own code veneers. Since the generated code veneer was no longer referenced, it simply got optimized away.
Additionally, I had to place my veneers in the .data section, since anything in the .text section was not relocated correctly. Since the platform I'm working on doesn't differentiate much between code and data, this hacky workaround works.
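For readers who want the shape of the idea without the generated ASM, here is a rough C sketch of the same trick. It swaps the hand-made veneer for a plain C wrapper that calls through an initialized pointer, which typically ends up in .data. The names `library_call_impl` and `library_call_slot`, and the `int (void)` signature, are placeholders invented for this example; only the `__wrap_` prefix comes from the linker's --wrap convention.

```c
/* Build idea (GNU ld): pass -Wl,--wrap=library_call when linking. */

/* Hypothetical stand-in so the sketch compiles on its own. In a real
   setup this would be the function exported by the shared library. */
static int library_call_impl(void)
{
    return 42;
}

/* Initialized, writable pointer: typically placed in .data, so the
   address it holds is an ordinary data word instead of a literal
   embedded in .text. */
int (*volatile library_call_slot)(void) = library_call_impl;

/* With --wrap=library_call, the linker resolves references to
   library_call to this function instead. */
int __wrap_library_call(void)
{
    return library_call_slot();
}
```

Whether a C wrapper like this avoids the generated veneer on a given toolchain is something to check in the disassembly; the ASM veneers described above sidestep that question by loading the pc directly.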
Here's a link to the project I'm working on where you can look up the specifics.