I am trying to understand the difference in the mechanisms underlying load-time linking (using gcc -l) versus run-time linking (using dlopen(), dlsym()) of dynamic libraries in Linux, and how these mechanisms affect the state of the library and the addresses of its symbols.
I have three simple files:
libhello.c:
int var;
int func() {
    return 7;
}
libhello.h:
extern int var;
int func();
main.c:
#include <inttypes.h>
#include <stdio.h>
#include <stdint.h>
#include <dlfcn.h>
#include "libhello.h"
int main() {
    void* h = dlopen("libhello.so", RTLD_NOW);
    printf("Address Load-time linking Run-time linking\n");
    printf("------- ----------------- ----------------\n");
    printf("&var 0x%016" PRIxPTR " 0x%016" PRIxPTR "\n", (uintptr_t)&var , (uintptr_t)dlsym(h, "var" ));
    printf("&func 0x%016" PRIxPTR " 0x%016" PRIxPTR "\n", (uintptr_t)&func, (uintptr_t)dlsym(h, "func"));
}
I compile libhello.c with the command gcc -shared -o libhello.so -fPIC libhello.c
I compile main.c with the command gcc main.c -L. -lhello -ldl
Running the main.c executable prints something like this:
Address Load-time linking Run-time linking
------- ----------------- ----------------
&var 0x0000000000601060 0x00007fdb4acb1034
&func 0x0000000000400700 0x00007fdb4aab0695
The load-time linking addresses remain the same, but the run-time linking addresses change every run.
In this program the library appears to be loaded twice: once at load time and again at run time via dlopen(). The second load does not copy the state of the first load, i.e. if the value of var is changed before dlopen(), this value isn't reflected in the version of var obtained via dlsym(). Is there any way to retain this state during the second load?
In load-time dynamic linking, the executable is linked against the DLL when it is built, while in run-time dynamic linking the executable is not linked against any DLL.
Run-time dynamic linking enables the process to continue running even if a DLL is not available. The process can then use an alternate method to accomplish its objective. For example, if a process is unable to locate one DLL, it can try to use another, or it can notify the user of an error.
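The fallback behaviour described above can be sketched in C with dlopen(). This is only a minimal sketch, not anyone's actual loader code: the helper name open_first_available and the idea of passing an ordered list of candidate names are assumptions made for illustration.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: try each candidate library in order and return the
 * first handle that dlopen() can load, or NULL if none is available. */
void *open_first_available(const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        void *handle = dlopen(names[i], RTLD_NOW);
        if (handle != NULL)
            return handle;
        /* dlerror() describes why the most recent dlopen() failed. */
        fprintf(stderr, "skipping %s: %s\n", names[i], dlerror());
    }
    return NULL;
}
```

A caller could pass something like { "libhello.so", "libhello_fallback.so" } (the second name is made up) and report an error to the user only if the result is NULL. On older glibc versions the program has to be linked with -ldl.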
The main difference between static and dynamic linking is that static linking copies all library modules used in the program into the final executable file at the final step of the compilation while, in dynamic linking, the linking occurs at run time when both executable files and libraries are placed in the memory.
Shared libraries are the most common way to manage dependencies on Linux systems. These shared resources are loaded into memory before the application starts, and when several processes require the same library, it will be loaded only once on the system. This feature saves on memory usage by the application.
Yes, it's ASLR.
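The load-time addresses stay the same because the executable itself is not position independent. PIE (position-independent executables) is quite expensive in performance, so many systems make a tradeoff: they randomize libraries, which have to be position independent anyway, but don't randomize executables because it costs too much performance. Yes, it is more vulnerable this way, but most security is a tradeoff. (Newer distributions often build PIE by default, in which case the executable's addresses are randomized as well.)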
Yes: don't search for symbols through the handle; use RTLD_DEFAULT instead. It's generally a bad idea to have two instances of the same dynamic library loaded like this. Some systems simply skip loading a library in dlopen if they know the same library is already loaded, and what the dynamic linker considers "the same library" can change depending on your library path. You're very much in the territory of badly or weakly defined behavior here, behavior that has evolved over the years more to deal with bugs and problems than through deliberate design.
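To see whether the dynamic linker already considers a library loaded, one option on glibc is the RTLD_NOLOAD flag. The following is a minimal sketch under that assumption; the helper name is_already_loaded is made up for this example, and other C libraries may not support the flag.

```c
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the named library is already mapped
 * into this process. With RTLD_NOLOAD (a glibc extension), dlopen() only
 * returns a handle when the library is already loaded and never loads
 * anything new. */
bool is_already_loaded(const char *name)
{
    void *handle = dlopen(name, RTLD_NOLOAD | RTLD_LAZY);
    if (handle == NULL)
        return false;
    dlclose(handle);  /* drop the extra reference taken by the successful call */
    return true;
}
```

Calling is_already_loaded("libhello.so") before the dlopen() in main.c would show whether the load-time copy is recognized under that name. Note that the answer depends on the name or path you pass, which is exactly the "same library" ambiguity mentioned above.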
Note that RTLD_DEFAULT will return the address of the symbol in the main executable or in the first (load-time) loaded dynamic library, and the dynamically loaded library will be ignored.
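If it is unclear which object an address returned by dlsym() actually lives in, glibc's dladdr() can report it. This is a small sketch, assuming glibc (dladdr() and Dl_info are GNU extensions); the helper name owning_object is invented for illustration.

```c
#define _GNU_SOURCE   /* dladdr() and Dl_info are GNU extensions in glibc */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns the path of the loaded object (executable or
 * shared library) that contains addr, or NULL if addr does not fall inside
 * any loaded object. */
const char *owning_object(const void *addr)
{
    Dl_info info;
    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

In the example from the question, printing owning_object(dlsym(h, "var")) next to owning_object(dlsym(RTLD_DEFAULT, "var")) should make it visible whether each pointer refers to the library's copy or to the executable's.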
Another thing worth keeping in mind is that if you reference var inside libhello, it will always resolve to the symbol from the load-time version of the library, even in the dlopen'ed version. I modified func to return var and added this code to your example code (on glibc, RTLD_DEFAULT is only declared if _GNU_SOURCE is defined before including <dlfcn.h>):
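int (*fn)(void) = dlsym(h, "func");
int *vp;
var = 17;
/* Look up vp before the first print so it is never read uninitialized. */
vp = dlsym(h, "var");
printf("%d %d %d %p\n", var, func(), fn(), (void *)vp);
/* Write through the pointer obtained from the dlopen handle. */
*vp = 4711;
printf("%d %d %d %p\n", var, func(), fn(), (void *)vp);
/* Write through the pointer obtained from the global lookup scope. */
vp = dlsym(RTLD_DEFAULT, "var");
*vp = 42;
printf("%d %d %d %p\n", var, func(), fn(), (void *)vp);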
and get this output:
$ gcc main.c -L. -lhello -ldl && LD_LIBRARY_PATH=. ./a.out
17 17 17 0x7f2e11bec02c
17 17 17 0x7f2e11bec02c
42 42 42 0x601054
Address Load-time linking Run-time linking
------- ----------------- ----------------
&var 0x0000000000601054 0x0000000000601054
&func 0x0000000000400700 0x0000000000400700
What you see depends on many variables. Here, on 64-bit Debian, my first try gave:
Address Load-time linking Run-time linking
------- ----------------- ----------------
&var 0x0000000000600d58 0x0000000000600d58
&func 0x00000000004006d0 0x00000000004006d0
This means that dlopen used the already-linked library, which your system apparently does not do. To take advantage of ASLR, you need to compile main.c as position-independent code: gcc -fPIC main.c ./libhello.so -ldl
Address Load-time linking Run-time linking
------- ----------------- ----------------
&var 0x00007f4e6cec6944 0x00007f4e6cec6944
&func 0x00007f4e6ccc6670 0x00007f4e6ccc6670