
LD_PRELOAD doesn't affect dlopen() with RTLD_NOW

If I use a function from a shared library directly, i.e. by declaring it in my code and linking against the library at build time, LD_PRELOAD works fine. But if I use dlopen()/dlsym() instead, LD_PRELOAD has no effect!

The problem is that I want to debug a program that loads some plugins using dlopen(), and it uses absolute file names at that, so simply using LD_LIBRARY_PATH won't work.

Here's some sample code that illustrates the problem.

./libfoo.so

#include <stdio.h>

void foo() {
    printf("version 1\n");
}

./preload/libfoo.so

#include <stdio.h>

void foo() {
    printf("version 2\n");
}

main.c

#include <stdio.h>
#include <dlfcn.h>
void foo();
int main(int argc, char *argv[]) {
    void (*pfoo)();
    foo(); // call foo() first so we are sure ./preload/libfoo.so is loaded when we call dlopen()
    pfoo = dlsym(dlopen("libfoo.so", RTLD_NOW), "foo");
    pfoo();
    return 0;
}

command line

LD_PRELOAD=preload/libfoo.so LD_LIBRARY_PATH=. ./a.out

output

version 2
version 1

Why doesn't LD_PRELOAD affect dlopen(), and is there any way to redirect dlopen(), especially when using absolute paths?

sashoalm asked Jun 13 '16 14:06



2 Answers

Specifying LD_PRELOAD causes the dynamic loader to unconditionally load (and initialize) the indicated shared libraries before any of the other shared libraries the executable depends on. The symbols defined in the preloaded libraries are therefore found first when the executable's references are resolved, which is what allows the interposition of symbols. [Note 1]

So, in your example, the call to foo() uses the symbol from the preloaded module, and dlsym would return the same symbol if you had called it with a NULL handle.

However, the call to dlopen does not take into account the symbol you are looking for (for obvious reasons). It just loads the indicated shared object or returns a handle to an already-cached version of the shared object. It does not add the module to a list of modules to load if necessary; it simply loads the module. And when you pass the returned handle to dlsym, dlsym looks in precisely that module in order to resolve the symbol, rather than searching the set of external symbols present in the executable. [Note 2]
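The two kinds of lookup described above can be sketched in C. This is a minimal illustration, assuming glibc on Linux; the helper names lookup_in_object and lookup_in_global_scope are invented for this example and are not part of any library.

```c
#define _GNU_SOURCE            /* for RTLD_DEFAULT on glibc */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: resolve `symbol` only through the handle that
 * dlopen returns for `path`, i.e. in that object and its dependencies.
 * The handle is deliberately left open so the returned address stays valid. */
void *lookup_in_object(const char *path, const char *symbol) {
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    return dlsym(handle, symbol);
}

/* Hypothetical helper: resolve `symbol` in the global scope, which is
 * where a preloaded library's definitions take part in the search. */
void *lookup_in_global_scope(const char *symbol) {
    return dlsym(RTLD_DEFAULT, symbol);
}
```

In the setup from the question, lookup_in_global_scope("foo") would be expected to return the preloaded foo, while lookup_in_object("libfoo.so", "foo") would return whichever foo lives in the object that dlopen selected. On older glibc versions the program needs to be linked with -ldl.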

As I mentioned, dlopen will not load the "same" shared object more than once if it has already loaded that object. [Note 3] However, the shared object in your LD_PRELOAD is called preload/libfoo.so, not libfoo.so. (ELF does not strip directory paths from shared object names, unlike certain other operating systems.) So when you call dlopen("libfoo.so"), the dynamic loader does not find any shared object named libfoo.so among the already-loaded shared objects. It therefore looks for that object in the filesystem, using the library search paths, since the provided filename does not contain a /.

As it turns out, ELF does allow you to specify the name of a shared object (its soname, recorded in the object's DT_SONAME entry). So you can set the name of the preloaded module to the name which you will later dynamically load, and then dlopen will return the handle to the preloaded module.

We start by correcting the version of main.c in the original question:

#include <stdio.h>
#include <dlfcn.h>
void foo();
int main(int argc, char *argv[]) {
    const char* soname = argc > 1 ? argv[1] : "libfoo.so";
    void (*pfoo)();
    pfoo = dlsym(NULL, "foo"); // Find the preloaded symbol, if any.
    if (pfoo) pfoo(); else puts("No symbol foo before dlopen.");
    void* handle = dlopen(soname, RTLD_NOW);
    if (handle) {
      pfoo = dlsym(handle, "foo"); // Find the symbol in the loaded SO
      if (pfoo) pfoo(); else puts("No symbol foo after dlopen.");
    }
    else puts("dlopen failed to find the shared object.");
    return 0;
}

This can be built without specifying any libraries other than libdl:

gcc -Wall -o main main.c -ldl

If we build the two shared libraries with no specified names, which is probably what you did:

gcc -Wall -o libfoo.so -shared -fPIC libfoo.c
gcc -Wall -o preload/libfoo.so -shared -fPIC preload/libfoo.c

then we observe that dlopen/dlsym finds the symbol in the dynamically loaded module, not in the preloaded one:

$ LD_PRELOAD=preload/libfoo.so LD_LIBRARY_PATH=. ./main libfoo.so
version 2
version 1

However, if we assign the name being looked for to the preloaded shared object, we get a different behaviour:

$ gcc -Wall -o preload/libfoo.so -Wl,--soname=libfoo.so -shared -fPIC preload/libfoo.c
$ LD_PRELOAD=preload/libfoo.so LD_LIBRARY_PATH=. ./main libfoo.so
version 2
version 2

That works because dlopen is looking for a shared object named libfoo.so. However, an application loading plugins is more likely to pass a pathname than to rely on the library search path. In that case the preloaded shared object is not considered, because the names no longer match:

$ LD_PRELOAD=preload/libfoo.so LD_LIBRARY_PATH=. ./main ./libfoo.so
version 2
version 1

As it happens, we can make this work by building the shared library with the name actually being looked for:

$ gcc -Wall -o preload/libfoo.so -Wl,--soname=./libfoo.so -shared -fPIC preload/libfoo.c
$ LD_PRELOAD=preload/libfoo.so LD_LIBRARY_PATH=. ./main ./libfoo.so
version 2
version 2

That's a bit of a hack, IMHO, but it is acceptable for debugging. [Note 4]

Notes:

  1. Consequently, the comment "call foo() first so we are sure ./preload/libfoo.so is loaded" is incorrect; the preloaded module is preloaded, not added to a list of modules to load if necessary.

  2. If you want dlsym to just look for a symbol, you can pass a NULL handle. In that case, dlsym searches the global scope: the executable and its dependencies (including preloaded modules), plus any modules loaded by dlopen with RTLD_GLOBAL. But that's rarely what you want, since applications which load plugins with dlopen normally specify a particular symbol (or symbols) which every plugin must define. These symbols will be present in every loaded plugin, making a lookup by symbol name alone imprecise.

  3. This is not quite correct, but dynamic symbol namespaces are outside the scope of this answer.

  4. Other hacks are possible, of course. You could, for example, interpose your own version of dlopen to override the shared object name lookup. But that's probably a lot more work than necessary.
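The dlopen interposition mentioned in Note 4 could look roughly like the following. This is only a sketch under stated assumptions: it assumes glibc, and the environment variables DLOPEN_REDIRECT_FROM and DLOPEN_REDIRECT_TO as well as the helper redirect_path are invented for this example. It redirects exactly one pathname and forwards everything else unchanged.

```c
#define _GNU_SOURCE            /* for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `requested` equals $DLOPEN_REDIRECT_FROM,
 * return $DLOPEN_REDIRECT_TO; otherwise return `requested` unchanged.
 * Both variable names are assumptions made for this sketch. */
const char *redirect_path(const char *requested) {
    const char *from = getenv("DLOPEN_REDIRECT_FROM");
    const char *to = getenv("DLOPEN_REDIRECT_TO");
    if (requested != NULL && from != NULL && to != NULL &&
        strcmp(requested, from) == 0)
        return to;
    return requested;
}

/* Interposed dlopen: look up the next definition of dlopen in the
 * search order (normally the C library's) and call it with the
 * possibly rewritten pathname. */
void *dlopen(const char *filename, int flags) {
    typedef void *(*dlopen_fn)(const char *, int);
    static dlopen_fn real_dlopen;
    if (real_dlopen == NULL)
        real_dlopen = (dlopen_fn)dlsym(RTLD_NEXT, "dlopen");
    if (real_dlopen == NULL)
        return NULL;  /* no underlying dlopen found */
    return real_dlopen(redirect_path(filename), flags);
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl on older glibc) and listed in LD_PRELOAD, this should intercept dlopen calls made by the application itself, including ones that pass absolute paths. Calls the C library makes internally are not affected, and the real dlopen now sees the wrapper as its caller, which can matter for programs that rely on RPATH.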

rici answered Sep 20 '22 06:09



According to http://linux.die.net/man/3/dlopen

The four functions dlopen(), dlsym(), dlclose(), dlerror() implement the interface to the dynamic linking loader.

LD_PRELOAD, on the other hand, only affects the dynamic linker itself, i.e. ld.so (http://linux.die.net/man/8/ld.so). The only way I can think of to force dlopen to resolve as you want is through chroot.

Followup thought:

Another thought I just had: what if you write a wrapper that first loads the correct *.so and then invokes the program you are trying to redirect? Does this cause the child process to use the redirected *.so?
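On the follow-up question: shared objects that a wrapper process has loaded are not carried across exec, because exec replaces the whole process image. Only the environment is inherited, so a wrapper can influence the child's loader only through variables such as LD_PRELOAD. A minimal sketch of such a wrapper, with the function names set_preload and run_with_preload invented for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: put `library_path` into LD_PRELOAD for this
 * process and every program it later execs. Returns 0 on success. */
int set_preload(const char *library_path) {
    return setenv("LD_PRELOAD", library_path, 1);
}

/* Hypothetical helper: set LD_PRELOAD, then replace the current process
 * with the target program. Only returns (with -1) if something failed. */
int run_with_preload(const char *library_path, char *const argv[]) {
    if (set_preload(library_path) != 0) {
        perror("setenv");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp");  /* reached only when execvp itself fails */
    return -1;
}
```

This is equivalent to setting LD_PRELOAD on the command line, so by itself it does not change which file a later dlopen of an absolute path picks.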

user590028 answered Sep 21 '22 06:09
