What are the details of dynamic symbol binding on OS X?

Tags: c, macos, dyld

I have a really odd situation with dynamic symbol binding on OS X that I'm hoping to get some clues on how to resolve.

I have an application, written in C, which uses dlopen() to dynamically load modules at runtime. Some of these modules export global symbols, which may be used by other modules loaded later.

We have one module (which I'll call weird_module.so) which exports global symbols, one of which is weird_module_function. If weird_module.so gets linked with a particular library (which I'll call libsomething.dylib), then weird_module_function can't be bound to. But if I remove the -lsomething when linking weird_module.so, then I can bind to weird_module_function.

What could possibly be going on with libsomething.dylib that would cause weird_module.so to not export symbols? Are there things I can do to debug how symbols get exported (similar to how I can use DYLD_PRINT_BINDINGS to debug how they get bound)?

$ LDFLAGS="-bundle -mmacosx-version-min=10.6 -Xlinker -undefined -Xlinker dynamic_lookup /usr/lib/bundle1.o"

$ gcc -o weird_module.so ${LDFLAGS} weird_module.o -lsomething
$ nm weird_module.so | grep '_weird_module_function$'
00000000000026d0 T _weird_module_function

$ gcc -o other_module.so ${LDFLAGS} other_module.o -lsomething
$ nm other_module.so | grep '_weird_module_function$'
                 U _weird_module_function

$ run-app
Loading weird_module.so
Loading other_module.so
dyld: lazy symbol binding failed: Symbol not found: _weird_module_function
  Referenced from: other_module.so
  Expected in: flat namespace

dyld: Symbol not found: _weird_module_function
  Referenced from: other_module.so
  Expected in: flat namespace

# Now relink without -lsomething
$ gcc -o weird_module.so ${LDFLAGS} weird_module.o
$ nm weird_module.so | grep '_weird_module_function$'
00000000000026d0 T _weird_module_function
$ run-app
Loading weird_module.so
Loading other_module.so
# No error!

Edit:

I tried putting together a minimal app to duplicate the problem, and in the course of doing so at least figured out one thing we were doing wrong. There are two other facts pertinent to duplicating the issue.

First, run-app preloads the module with RTLD_LAZY | RTLD_LOCAL to inspect its metadata. The module is then dlclose()ed and reopened with either RTLD_LAZY | RTLD_GLOBAL or RTLD_NOW | RTLD_LOCAL, depending on the metadata. (For both modules in question, it reopens with RTLD_LAZY | RTLD_GLOBAL.)
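
For illustration, that load sequence might look roughly like the sketch below. The helper name load_module_two_phase and the want_global parameter are hypothetical stand-ins (the real run-app code is not shown here), and the metadata inspection itself is elided.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper mirroring the preload/reopen sequence described above.
 * `want_global` stands in for whatever the module metadata decides. */
void *load_module_two_phase(const char *path, int want_global)
{
    /* Phase 1: private, lazy load, used only to inspect metadata. */
    void *probe = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (probe == NULL) {
        fprintf(stderr, "preload failed: %s\n", dlerror());
        return NULL;
    }

    /* ... metadata inspection would happen here (elided) ... */

    /* dlclose() returns 0 on success. Note that success only drops this
     * reference; it does not guarantee the image was actually unmapped. */
    if (dlclose(probe) != 0) {
        fprintf(stderr, "dlclose failed: %s\n", dlerror());
        return NULL;
    }

    /* Phase 2: reopen with the final flags chosen from the metadata. */
    int flags = want_global ? (RTLD_LAZY | RTLD_GLOBAL)
                            : (RTLD_NOW | RTLD_LOCAL);
    void *handle = dlopen(path, flags);
    if (handle == NULL)
        fprintf(stderr, "reopen failed: %s\n", dlerror());
    return handle;
}
```

If the image stays resident after the first dlclose(), whether the second dlopen() can change its visibility from local to global is exactly the kind of detail worth checking when debugging this.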

Second, there turns out to be a symbol collision between weird_module.so and libsomething.dylib for a const global.

$ nm weird_module.so | grep '_something_global'
00000000000158f0 S _something_global

$ nm libsomething.dylib | grep '_something_global'
0000000000031130 S _something_global

I'm willing to consider that the duplicate symbol would put me in the realm of undefined behavior, so I'm dropping the question.

leedm777 asked Sep 09 '13 15:09


1 Answer

I tried to reproduce your scenario and I was able to get the same errors as you, i.e. dyld: lazy symbol binding failed followed by dyld: Symbol not found.

But it had nothing to do with linking against libsomething.dylib or not. What I did to trigger this error was just calling weird_module_function() from the constructor of other_module.so:

//  other_module.c

#include <stdio.h>
#include "weird_module.h"

__attribute__((constructor)) void initialize_other_module(void)
{
    printf("%s\n", __PRETTY_FUNCTION__);
    weird_module_function();
}

Here is how I loaded the modules:

//  main.c

#include <stdio.h>
#include <dlfcn.h>

int main(int argc, const char * argv[])
{
    printf("\nLoading weird module\n");
    void *weird = dlopen("weird_module.so", RTLD_LAZY | RTLD_LOCAL);
    printf("weird: %p\n\n", weird);

    printf("Loading other module\n");
    void *other = dlopen("other_module.so", RTLD_LAZY | RTLD_LOCAL);
    printf("other: %p\n", other);

    return 0;
}

The dyld errors disappear if I remove the RTLD_LOCAL option when loading weird_module.so.
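
To check this from inside the host application, a small probe along the following lines could help. It is only a sketch: symbol_globally_visible is a hypothetical helper name, and it assumes that a dlsym() lookup through the handle returned by dlopen(NULL, ...) is a reasonable approximation of what other modules can bind to. Note that dlsym() takes the C name without the leading underscore shown by nm.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `name` resolves through the
 * process-wide lookup scope, 0 otherwise. Symbols that live only in
 * images opened with RTLD_LOCAL are not expected to be found here. */
int symbol_globally_visible(const char *name)
{
    /* Passing NULL as the path yields a handle for the main program's
     * global symbol scope rather than for one specific image. */
    void *global_scope = dlopen(NULL, RTLD_LAZY);
    if (global_scope == NULL)
        return 0;

    int found = (dlsym(global_scope, name) != NULL);
    dlclose(global_scope);
    return found;
}
```

Calling symbol_globally_visible("weird_module_function") right after loading weird_module.so, and before loading other_module.so, should show whether the flags used for the first load left the symbol reachable.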

The same error also occurs if you call weird_module_function from a libsomething.dylib constructor, but that happens before main is called, so it's probably not what is happening to you.

But maybe the libsomething.dylib constructor is where you should look to find out how libsomething.dylib is influencing your module loading process. You can set the DYLD_PRINT_INITIALIZERS environment variable to YES in order to find out which constructors are called.

A few other things to check:

  1. Are you 100% sure that both modules are reopened with RTLD_LAZY | RTLD_GLOBAL? The only way I could get the dyld errors was by passing the RTLD_LOCAL option.
  2. Are you sure that the dlclose call is successful (returns 0)? If, for example, your module contains Objective-C code, it will not be unloaded.
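
Both checks can be made explicit in code. The sketch below is one possible way to do it, not the code from the question: close_and_verify_unloaded is a hypothetical helper that checks the dlclose() return value and then uses RTLD_NOLOAD, which only returns a handle when the image is already loaded, to see whether the module is still resident.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: closes `handle` and reports whether the image at
 * `path` is still loaded afterwards.
 * Returns 1 if it is gone, 0 if it is still resident, -1 if dlclose failed. */
int close_and_verify_unloaded(void *handle, const char *path)
{
    if (dlclose(handle) != 0) {
        fprintf(stderr, "dlclose failed: %s\n", dlerror());
        return -1;
    }

    /* With RTLD_NOLOAD, dlopen() does not load anything new; it returns
     * a non-NULL handle only if the image is already in the process. */
    void *still_loaded = dlopen(path, RTLD_LAZY | RTLD_NOLOAD);
    if (still_loaded != NULL) {
        dlclose(still_loaded); /* drop the reference we just took */
        return 0;
    }
    return 1;
}
```

A result of 0 after the metadata preload would mean the later dlopen() call is reusing the already loaded image, so it is worth logging the flags passed to that second call as well.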
0xced answered Oct 23 '22 20:10