This is a theoretical question. I am aware that the best practice would probably be to use shared libraries, but I ran into this question and cannot find an answer anywhere.
How do I structure and compile a C/C++ program in ELF format so that it can be loaded with dlopen()?
For example, if one executable contains the implementation of some function int test() and I would like to call this function from my program (and preferably get its result), how would I go about doing that, if it is even possible?
In pseudocode I could describe it as follows:
ELF executable source:
#include <stdio.h>
int test(void);
int main(void) {
    int i = test();
    printf("Returned: %d", i);//Prints "Returned: 5"
    return 0;
}
int test(void) {
    return 5;
}
External program:
// ... Somehow load executable from above
void main() {
int i = test();
printf("Returned: %d", i);//Must print "Returned: 5"
}
ELF executables are not relocatable, and they are usually linked to start at the same address (0x400000 on x86_64), which means it is technically not possible to load two of them into the same address space.
What you could do is either:
Compile the executable you want to dlopen() as an executable shared library (-pie). Technically this file is an ELF shared object, but it can still be executed. You can check whether a program is an ELF executable or an ELF shared object with readelf -h my_program or file my_program. (As a bonus, a program compiled as a shared object benefits from ASLR.)
By compiling your main program as a shared object (so that it is loaded elsewhere in the virtual address space), you should be able to dynamically link the other executable. The GNU dynamic linker does not want to dlopen an executable file, so you would have to do the dynamic linking yourself (you probably do not want to do this).
Or you can link one of your executables to use another base address by using a linker script. Same as before, you'd have to do the work of the dynamic linker yourself.
The called executable:
// hello.c
#include <string.h>
#include <stdio.h>
void hello()
{
printf("Hello world\n");
}
int main()
{
hello();
return 0;
}
The caller executable:
// caller.c
#include <dlfcn.h>
#include <stdio.h>
int main(int argc, char** argv)
{
void* handle = dlopen(argv[1], RTLD_LAZY);
if (!handle) {
fprintf(stderr, "%s\n", dlerror());
return 1;
}
void (*hello)() = dlsym(handle, "hello");
if (!hello) {
fprintf(stderr, "%s\n", dlerror());
return 1;
}
hello();
return 0;
}
Trying to make it work:
$ gcc -fpie -pie hello.c -o hello
$ gcc caller.c -o caller
$ ./caller ./hello
./hello: undefined symbol: hello
The reason is that when you compile hello as a PIE, the linker does not add the hello symbol to the dynamic symbol table (.dynsym):
$ readelf -s ./hello

Symbol table '.dynsym' contains 12 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000200     0 SECTION LOCAL  DEFAULT    1
     2: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (2)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
     5: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
     6: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
     7: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
     8: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __cxa_finalize@GLIBC_2.2.5 (2)
     9: 0000000000200bd0     0 NOTYPE  GLOBAL DEFAULT   24 _edata
    10: 0000000000200bd8     0 NOTYPE  GLOBAL DEFAULT   25 _end
    11: 0000000000200bd0     0 NOTYPE  GLOBAL DEFAULT   25 __bss_start

Symbol table '.symtab' contains 67 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
[...]
    52: 0000000000000760    18 FUNC    GLOBAL DEFAULT   13 hello
[...]
In order to fix this, you need to pass the -E flag to ld (see @AlexKey's answer):
$ gcc -fpie -pie hello.c -Wl,-E -o hello
$ gcc caller.c -o caller
$ ./caller ./hello
Hello world
$ ./hello
Hello world
$ readelf -s ./hello

Symbol table '.dynsym' contains 22 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
[...]
    21: 00000000000008d0    18 FUNC    GLOBAL DEFAULT   13 hello
[...]
For more information, section 4, "Dynamically Loaded (DL) Libraries", of the Program Library HOWTO is a good place to start reading.
Based on the links provided in the comments and other answers, here is how it can be done without linking these programs at compile time:
test1.c:
#include <stdio.h>
int a(int b)
{
return b+1;
}
int c(int d)
{
return a(d)+1;
}
int main()
{
int b = a(3);
printf("Calling a(3) gave %d \n", b);
int d = c(3);
printf("Calling c(3) gave %d \n", d);
}
test2.c:
#include <dlfcn.h>
#include <stdio.h>
int (*a_ptr)(int b);
int (*c_ptr)(int d);
int main()
{
void* lib=dlopen("./test1",RTLD_LAZY);
a_ptr=dlsym(lib,"a");
c_ptr=dlsym(lib,"c");
int d = c_ptr(6);
int b = a_ptr(5);
printf("b is %d d is %d\n",b,d);
return 0;
}
Compilation:
$ gcc -fPIC -pie -o test1 test1.c -Wl,-E
$ gcc -o test2 test2.c -ldl
Execution results:
$ ./test1
Calling a(3) gave 4
Calling c(3) gave 5
$ ./test2
b is 6 d is 8
PS: To avoid symbol clashes, it is better to give the imported symbols and the pointers they are assigned to different names. See comments here.