
RTLD_GLOBAL and Two Level Namespaces on macOS

The Apple documentation on executing Mach-O files says:

The two-level namespace feature of OS X v10.1 and later adds the module name as part of the symbol name of the symbols defined within it. This approach ensures a module’s symbol names don’t conflict with the names used in other modules.

In my example I load python2 and python3 into the same process. Both Python libs are (by default) compiled with the two-level namespace option, and both are loaded via dlopen(..) with the RTLD_GLOBAL flag. Symbols with the same name should therefore not interfere with each other, since the two modules have different names (python27 and python36).

Example:

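#include <dlfcn.h>  // dlopen, dlsym, RTLD_* flags
#include <{...}/include/python2.7/Python.h>

int main(int argc, const char * argv[])
{
    // Load Python 3 first, making its symbols globally visible.
    auto* py3 = dlopen(".../python36", RTLD_GLOBAL | RTLD_NOW);
    if (py3 == nullptr)
        return 0;

    // Then load Python 2 with the same flags.
    auto* py2 = dlopen(".../python27", RTLD_GLOBAL | RTLD_NOW);
    if (py2 == nullptr)
        return 0;

    // Look up Py_Initialize in the Python 2 handle specifically.
    auto* init = ((decltype(Py_Initialize)*)dlsym(py2, "Py_Initialize"));
    if (init)
    {
        init();
    }

    return 0;
}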

The problem is, after python2 imports /path/to/python2/lib/lib-dynload/_locale.so, the function PyModule_GetDict from python3 gets called. Why is that? How can that happen? Shouldn't the two-level namespace prevent that?

P.S. lib-dynload is a directory with additional C modules for Python on macOS. I verified that the correct _locale.so lib from the python2 environment gets loaded.

Edit:

After some experiments, I saw that the symbols of the first-loaded Python lib always take precedence. I am not sure, though, whether this is intended for first-loaded libs or still 'undefined behaviour land'.

Calling Py_Initialize() of python27 - Success:

1. Loading python27 first
2. Loading python36 second
3. PYTHONHOME to python27
4. Call Py_Initialize() of python27

Calling Py_Initialize() of python27 - Crash:

1. Loading python36 first
2. Loading python27 second
3. PYTHONHOME to python27
4. Call Py_Initialize() of python27

I get the same results the other way around.

Calling Py_Initialize() of python36 - Success:

1. Loading python36 first
2. Loading python27 second
3. PYTHONHOME to python36
4. Call Py_Initialize() of python36

Calling Py_Initialize() of python36 - Crash:

1. Loading python27 first
2. Loading python36 second
3. PYTHONHOME to python36
4. Call Py_Initialize() of python36
HelloWorld asked Mar 20 '18 21:03


1 Answer

The symbols within libpython (e.g. libpython2.7.dylib) are being resolved correctly. For example, under the above described scenario, I see PyModule_GetDict() get called 155 times before the incorrectly resolved call.

The problem is that Python itself loads compiled extension modules, which are separate shared libraries, and it uses dlopen() to load them. You can see the dlopen() happening by setting the environment variable PYTHONVERBOSE when running:

$ PYTHONVERBOSE=1 ./main 2>&1 | grep dlopen

which produces:

dlopen(".../lib/python2.7/lib-dynload/_locale.so", 2);

The 2 argument corresponds to RTLD_NOW, but that doesn't matter too much. The issue is that this separate library isn't able to indicate that its symbols should be resolved against the libpython2.7.dylib library. Yet it does have several undefined Python symbols; in particular, this one that ends up causing the problem:

$ nm prefix/lib/python2.7/lib-dynload/_locale.so | grep GetDict
         U _PyModule_GetDict

So, when Python dlopen()s the library, all it can do is resolve the symbol without the qualification. Apparently, the dl functions resolve such symbols based on the order in which the libraries were loaded, as you noted.
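One way to observe this from inside the process is to ask the loader which image an unqualified lookup lands in. The following is only a sketch: image_defining is a hypothetical helper name, and a dlsym() lookup through RTLD_DEFAULT approximates, but is not guaranteed to be identical to, how the undefined symbols in _locale.so get bound.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: returns the path of the image that defines `symbol`
// when it is looked up through `handle`, or an empty string on failure.
// Passing RTLD_DEFAULT performs an unqualified, process-wide lookup.
std::string image_defining(void* handle, const char* symbol)
{
    void* addr = dlsym(handle, symbol);
    if (addr == nullptr)
        return {};

    // dladdr() maps an address back to the image that contains it.
    Dl_info info{};
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr)
        return {};

    return info.dli_fname;
}
```

Called after both dlopen() calls from the question, image_defining(RTLD_DEFAULT, "PyModule_GetDict") would be expected to name whichever libpython was loaded first, whereas image_defining(py2, "PyModule_GetDict") restricts the search to the python27 handle.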

So, things work fine until we load _locale.so, as you can see from the following backtrace:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x50)
  * frame #0: 0x00000001003f3fc1 libpython3.6m.dylib`PyErr_FormatV [inlined] PyErr_Restore at errors.c:42 [opt]
    frame #1: 0x00000001003f3fb7 libpython3.6m.dylib`PyErr_FormatV [inlined] PyErr_Clear at errors.c:355 [opt]
    frame #2: 0x00000001003f3fb7 libpython3.6m.dylib`PyErr_FormatV(exception=0x00000001004cba18, format="%s:%d: bad argument to internal function", vargs=0x00007fff5fbfdcb0) at errors.c:841 [opt]
    frame #3: 0x00000001003f2c39 libpython3.6m.dylib`PyErr_Format(exception=<unavailable>, format=<unavailable>) at errors.c:860 [opt]
    frame #4: 0x0000000100358220 libpython3.6m.dylib`PyModule_GetDict(m=0x0000000101a5a868) at moduleobject.c:450 [opt]
    frame #5: 0x00000001000f491c _locale.so`init_locale at _localemodule.c:703 [opt]
    frame #6: 0x00000001018d1176 libpython2.7.dylib`_PyImport_LoadDynamicModule(name="_locale", pathname=".../lib/python2.7/lib-dynload/_locale.so", fp=<unavailable>) at importdl.c:53 [opt]

Also worth noting, _locale.so is just the first library to fail. If you got past it somehow, there are quite a few other libraries that potentially will have similar problems in .../lib/python2.7/lib-dynload.
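As a side note on the mode value 2 in the dlopen() trace above: the numeric values of the RTLD_* flags are platform-specific, so it is safer to decode them with the constants from <dlfcn.h> than to memorize the numbers. A small sketch (describe_dlopen_mode is a hypothetical helper, not something Python or dyld provides):

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: spells out which RTLD_* flags are set in a
// dlopen() mode value, using the platform's own constants.
std::string describe_dlopen_mode(int mode)
{
    std::string out;
    auto add = [&out](const char* name) {
        if (!out.empty())
            out += " | ";
        out += name;
    };

    if (mode & RTLD_LAZY)   add("RTLD_LAZY");
    if (mode & RTLD_NOW)    add("RTLD_NOW");
    if (mode & RTLD_GLOBAL) add("RTLD_GLOBAL");
    // RTLD_LOCAL is 0 on some platforms, so only test it when it is a real bit.
    if (RTLD_LOCAL != 0 && (mode & RTLD_LOCAL)) add("RTLD_LOCAL");

    return out.empty() ? "(no flags)" : out;
}
```

Passing the 2 from the trace to this function on the same machine shows which flags Python requested for the extension module.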

cryptoplex answered Sep 29 '22 09:09