
Using std::thread in a library loaded with dlopen leads to a SIGSEGV

I recently discovered a strange behaviour using std::thread and dlopen.

Basically, when I start a std::thread in a library that is loaded using dlopen, I receive a SIGSEGV. The library itself is linked against pthread; the executable that calls dlopen is not.

Once I link the executable against pthread, or against the library itself, everything works fine. However, we are using a plugin-based infrastructure, where we do not know whether the application itself is linked against pthread. Therefore, always linking the executable against pthread is not an option.

Please find attached some code to reproduce the issue. Currently I am not sure what causes it. Is it a problem of gcc, glibc, libstdc++, or the ld.so? Is there a convenient way to work around this? It looks like this glibc bug is related, but I am using glibc 2.27 (debian testing).

Calling pthread_create itself from the library seems to work.
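
Since calling pthread_create directly from the library appears to work, one conceivable stopgap is for the plugin to bypass std::thread. The following is only a minimal sketch under that assumption; run_task and run_in_raw_pthread are hypothetical names that do not appear in the code below, and the sketch has not been checked against the crashing setup.

```cpp
#include <pthread.h>

#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical trampoline: pthread_create needs a plain function taking void*.
static void* run_task(void* arg) {
    // Take ownership of the heap-allocated callable so it is freed on exit.
    std::unique_ptr<std::function<void()>> task(
        static_cast<std::function<void()>*>(arg));
    (*task)();
    return nullptr;
}

// Hypothetical helper: runs fn on a new thread and waits for it to finish.
void run_in_raw_pthread(std::function<void()> fn) {
    auto task = std::make_unique<std::function<void()>>(std::move(fn));
    pthread_t tid;
    const int rc = pthread_create(&tid, nullptr, run_task, task.get());
    if (rc != 0) {
        throw std::runtime_error("pthread_create failed");
    }
    task.release();  // the new thread owns the callable now
    pthread_join(tid, nullptr);
}
```

In hello.cpp this would stand in for the std::thread construction and the join, for example run_in_raw_pthread([]{ std::cout << "hello world" << std::endl; });. The library would still be built with -pthread.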

hello.cpp

#include <thread>
#include <iostream>

void thread()
{
    std::thread t ([](){std::cout << "hello world" << std::endl;});
    t.join();
}

extern "C" {
    void hello()
    {
        thread();
    }
}

example.cpp

#include <iostream>
#include <dlfcn.h>

/** code from https://www.tldp.org/HOWTO/html_single/C++-dlopen/
*/
int main() {

    std::cout << "C++ dlopen demo\n\n";

    // open the library
    std::cout << "Opening hello.so...\n";
    void* handle = dlopen("./libhello.so", RTLD_LAZY);

    if (!handle) {
        std::cerr << "Cannot open library: " << dlerror() << '\n';
        return 1;
    }

    // load the symbol
    std::cout << "Loading symbol hello...\n";
    typedef void (*hello_t)();

    // reset errors
    dlerror();
    hello_t hello = (hello_t) dlsym(handle, "hello");
    const char *dlsym_error = dlerror();
    if (dlsym_error) {
        std::cerr << "Cannot load symbol 'hello': " << dlsym_error <<
            '\n';
        dlclose(handle);
        return 1;
    }

    // use it to do the calculation
    std::cout << "Calling hello...\n";
    hello();

    // close the library
    std::cout << "Closing library...\n";
    dlclose(handle);
}

build.sh (builds and runs the example above; example1 crashes)

#!/bin/bash

echo "g++ -shared -fPIC -std=c++14 hello.cpp -o libhello.so -pthread"
g++ -shared -fPIC -std=c++14 hello.cpp -o libhello.so -pthread

echo "g++ example.cpp -o example1 -ldl"
g++ example.cpp -o example1 -ldl

echo "g++ example.cpp -o example2 -ldl -pthread"
g++ example.cpp -o example2 -ldl -pthread

echo "g++ example.cpp -o example3 -ldl -lhello -L ./"
g++ example.cpp -o example3 -ldl -lhello -L ./

export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$(pwd)

echo "===== example1 ====="
./example1
echo "===== end      ====="

echo "===== example2 ====="
./example2
echo "===== end      ====="

echo "===== example3 ====="
./example3
echo "===== end      ====="

EDIT

I forgot to mention: if I run the faulty example (i.e. example1) with LD_DEBUG=all, the program crashes during the lookup of pthread_create. Even more interesting, an earlier lookup of pthread_create succeeds:

  8111:     symbol=_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_' [GLIBCXX_3.4]
  8111:     symbol=pthread_create;  lookup in file=./example1 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
  8111:     symbol=pthread_create;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
  8111:     symbol=pthread_create;  lookup in file=./libhello.so [0]
  8111:     symbol=pthread_create;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
  8111:     binding file ./libhello.so [0] to /lib/x86_64-linux-gnu/libpthread.so.0 [0]: normal symbol `pthread_create' [GLIBC_2.2.5]
  8111:     symbol=_ZTVNSt6thread6_StateE;  lookup in file=./example1 [0]
  8111:     symbol=_ZTVNSt6thread6_StateE;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
  8111:     symbol=_ZTVNSt6thread6_StateE;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZTVNSt6thread6_StateE' [GLIBCXX_3.4.22]
  ...
  8111:     binding file ./libhello.so [0] to ./libhello.so [0]: normal symbol `_ZNSt10_Head_baseILm0EPNSt6thread6_StateELb0EE7_M_headERS3_'
  8111:     symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE;  lookup in file=./example1 [0]
  8111:     symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
  8111:     symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE' [GLIBCXX_3.4.22]
  8111:     symbol=pthread_create;  lookup in file=./example1 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
  8111:     symbol=pthread_create;  lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
  8111:     symbol=pthread_create;  lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
  ./build.sh: line 18:  8111 Segmentation fault      (core dumped) LD_DEBUG=all ./example1
  ===== end      =====
A. Gocht asked Jul 06 '18 11:07


3 Answers

The problem lies in libstdc++.

  • With C programs this doesn't happen.
  • With C++ programs built with libc++ this also doesn't happen.
  • With C++ programs built with libstdc++ statically this also doesn't happen.
  • With libraries built with libc++ this doesn't happen even if the caller program is built with libstdc++ dynamically.
  • When the program dlopens the library with RTLD_GLOBAL, this also doesn't happen.

So one solution would be to switch to libc++. Obviously this only works if the library never exports any interface that relies on a std:: type. In particular, a library that only exports C-compatible interfaces should be OK.

Another solution would be to have your library loaded with RTLD_GLOBAL (you may have to split it in two: the main library and a small stub that just loads the main one with RTLD_GLOBAL).
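
For a host that controls its own dlopen call, the RTLD_GLOBAL variant can be sketched as follows. This is a minimal sketch, not a tested fix: open_plugin_global is a hypothetical helper name, and only dlopen, dlerror, and the RTLD_* flags are real API.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: loads a plugin with RTLD_GLOBAL, which makes the
// library's symbols available for symbol resolution of subsequently loaded
// shared objects. Throws with the dlerror() text if loading fails.
void* open_plugin_global(const std::string& path) {
    void* handle = dlopen(path.c_str(), RTLD_LAZY | RTLD_GLOBAL);
    if (handle == nullptr) {
        const char* reason = dlerror();
        throw std::runtime_error("Cannot open library " + path + ": " +
                                 (reason ? reason : "unknown error"));
    }
    return handle;
}
```

In example.cpp from the question, the call would become void* handle = open_plugin_global("./libhello.so");. When the host cannot be changed, the small stub would make this call itself to load the main library.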

In parallel one should file a bug against libstdc++ and wait for a fix. There's no reason why it should be broken like that.

If none of the above are viable options, then the only remaining solution seems to be complete isolation between the caller and the multithreaded module: make the multithreaded module a separate executable, fork-exec it from your plugin, and marshal arguments and results to and from it via pipes.

Finally, there's always the ugly workaround of preloading libpthread in the caller program.

n. 1.8e9-where's-my-share m. answered Sep 17 '22 12:09


I can offer some background as to why there is a segfault, but unfortunately no solution.

It seems that this is an issue with libstdc++: technically this huge monolithic library depends on libpthread, but for good reasons it is not linked against libpthread. Now, in order to be able to load libstdc++ from programs that don't use threads at all, the missing symbols (e.g. pthread_create) must come from somewhere. So libstdc++ defines them as weak symbols.

These weak symbols are also used to detect at runtime whether libpthread is actually loaded. For the old ABI there was even a check in _M_start_thread that threw a meaningful exception if pthread was not loaded, instead of calling a weakly defined nullptr, something I would not wish upon my worst enemy.
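
The run-time detection described above relies on a weak reference resolving to a null address when nothing provides the symbol. A minimal sketch of that pattern, using the GCC/Clang weak attribute, might look as follows. hypothetical_backend_start merely stands in for pthread_create and is not a real function, and this is not libstdc++'s actual code.

```cpp
#include <system_error>

// Weak declaration (GCC/Clang extension): if no loaded object defines this
// hypothetical symbol, its address resolves to null instead of causing a
// link error.
extern "C" int hypothetical_backend_start(void (*entry)()) __attribute__((weak));

// True only if some loaded object actually provides the symbol.
bool backend_available() {
    return &hypothetical_backend_start != nullptr;
}

// Guarded call: throw a meaningful exception rather than jumping to address 0.
void start_or_throw(void (*entry)()) {
    if (!backend_available()) {
        throw std::system_error(
            std::make_error_code(std::errc::operation_not_permitted),
            "threading backend not loaded");
    }
    hypothetical_backend_start(entry);
}
```

In a program where nothing defines hypothetical_backend_start, start_or_throw throws instead of crashing. Without such a guard, the call goes straight to the null address.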

Unfortunately that run-time check got lost for the new ABI. Instead, there is a link-time check for pthread_create: the code that calls _M_start_thread passes a pointer to pthread_create into this function, which creates a dependency when that code is compiled. Unfortunately that pointer is discarded, and the weak symbol, which is still a nullptr, is used instead.

Now something during linking/loading causes the weakly defined pthread_create not to be overridden in your problematic case. I am unsure about the exact resolution rules that apply there; I assume it has to do with libstdc++ being already fully loaded when libpthread is being loaded. I would be glad if another answer clarified that. Unfortunately there also seems to be no generally viable way to fix this other than linking the main application with -lpthread or using LD_PRELOAD=libpthread.so (which I wouldn't really recommend).

Zulan answered Oct 18 '22 12:10


Once I link the executable against pthread, or against the library itself, everything works fine. However, we are using a plugin-based infrastructure, where we do not know whether the application itself is linked against pthread. Therefore, always linking the executable against pthread is not an option.

On the contrary: very few systems support an application becoming "suddenly multithreaded" (your system obviously doesn't).

If you need to support potentially multithreaded plugins, then you must start out multithread-ready, which is achieved by linking against libpthread, or more portably by adding the -pthread flag to the compile and link lines for the main executable.

Is it a problem of gcc, glibc, libstdc++, or the ld.so?

It's a problem with libstdc++ -- GLIBC does support "suddenly multithreaded" execution, GCC isn't part of the runtime environment at all, and ld.so is part of GLIBC.

Employed Russian answered Oct 18 '22 12:10