I recently discovered a strange behaviour when using std::thread together with dlopen.
Basically, when I start a std::thread in a library which is loaded using dlopen, I receive a SIGSEGV. The library itself is linked against pthread; the executable that calls dlopen is not.
Once I link the executable against pthread or the library itself everything works fine. However, we are using a plugin based infrastructure, where we do not know if the application itself is linked against pthread or not. Therefore, it is not an option to link the executable always against pthread.
Please find attached some code to reproduce the issue. Currently I am not sure what causes it. Is it a problem of gcc, glibc, libstdc++, or the ld.so? Is there a convenient way to work around this? It looks like this glibc bug is related, but I am using glibc 2.27 (Debian testing).
Calling pthread_create itself from the library seems to work.
// hello.cpp
#include <thread>
#include <iostream>

void thread()
{
    std::thread t([]() { std::cout << "hello world" << std::endl; });
    t.join();
}

extern "C" {
void hello()
{
    thread();
}
}
// example.cpp
#include <iostream>
#include <dlfcn.h>

/** code from https://www.tldp.org/HOWTO/html_single/C++-dlopen/
 */
int main() {
    std::cout << "C++ dlopen demo\n\n";

    // open the library
    std::cout << "Opening hello.so...\n";
    void* handle = dlopen("./libhello.so", RTLD_LAZY);
    if (!handle) {
        std::cerr << "Cannot open library: " << dlerror() << '\n';
        return 1;
    }

    // load the symbol
    std::cout << "Loading symbol hello...\n";
    typedef void (*hello_t)();

    // reset errors
    dlerror();
    hello_t hello = (hello_t) dlsym(handle, "hello");
    const char *dlsym_error = dlerror();
    if (dlsym_error) {
        std::cerr << "Cannot load symbol 'hello': " << dlsym_error << '\n';
        dlclose(handle);
        return 1;
    }

    // use it to do the calculation
    std::cout << "Calling hello...\n";
    hello();

    // close the library
    std::cout << "Closing library...\n";
    dlclose(handle);
}
#!/bin/bash
echo "g++ -shared -fPIC -std=c++14 hello.cpp -o libhello.so -pthread"
g++ -shared -fPIC -std=c++14 hello.cpp -o libhello.so -pthread
echo "g++ example.cpp -o example1 -ldl"
g++ example.cpp -o example1 -ldl
echo "g++ example.cpp -o example2 -ldl -pthread"
g++ example.cpp -o example2 -ldl -pthread
echo "g++ example.cpp -o example3 -ldl -lhello -L ./"
g++ example.cpp -o example3 -ldl -lhello -L ./
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$(pwd)
echo "===== example1 ====="
./example1
echo "===== end ====="
echo "===== example2 ====="
./example2
echo "===== end ====="
echo "===== example3 ====="
./example3
echo "===== end ====="
I forgot to mention: if I run the faulty example (i.e. example1) with LD_DEBUG=all, the program crashes during the lookup of pthread_create. Even more interesting, an earlier lookup of pthread_create succeeds:
8111: symbol=_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_' [GLIBCXX_3.4]
8111: symbol=pthread_create; lookup in file=./example1 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
8111: symbol=pthread_create; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
8111: symbol=pthread_create; lookup in file=./libhello.so [0]
8111: symbol=pthread_create; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libpthread.so.0 [0]
8111: binding file ./libhello.so [0] to /lib/x86_64-linux-gnu/libpthread.so.0 [0]: normal symbol `pthread_create' [GLIBC_2.2.5]
8111: symbol=_ZTVNSt6thread6_StateE; lookup in file=./example1 [0]
8111: symbol=_ZTVNSt6thread6_StateE; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
8111: symbol=_ZTVNSt6thread6_StateE; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZTVNSt6thread6_StateE' [GLIBCXX_3.4.22]
...
8111: binding file ./libhello.so [0] to ./libhello.so [0]: normal symbol `_ZNSt10_Head_baseILm0EPNSt6thread6_StateELb0EE7_M_headERS3_'
8111: symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE; lookup in file=./example1 [0]
8111: symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
8111: symbol=_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: binding file ./libhello.so [0] to /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]: normal symbol `_ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE' [GLIBCXX_3.4.22]
8111: symbol=pthread_create; lookup in file=./example1 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libdl.so.2 [0]
8111: symbol=pthread_create; lookup in file=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libm.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libgcc_s.so.1 [0]
8111: symbol=pthread_create; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
8111: symbol=pthread_create; lookup in file=/lib64/ld-linux-x86-64.so.2 [0]
./build.sh: line 18: 8111 Segmentation fault (core dumped) LD_DEBUG=all ./example1
===== end =====
The problem lies in libstdc++.
So one solution would be to switch to libc++. Obviously this only works if one never exports any interface that relies on any std:: type. In particular, a library that only exports C-compatible interfaces should be OK.
Another solution would be to have your library loaded with RTLD_GLOBAL (you may have to split it in two: the main library and a small stub that just loads the main one with RTLD_GLOBAL).
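A rough sketch of what such a stub might look like follows. The file name `libhello_impl.so`, the symbol `hello_impl`, and the helper `load_impl` are made up for illustration, and whether RTLD_GLOBAL really avoids the crash on a given glibc/libstdc++ combination is something to verify rather than assume.

```cpp
// stub.cpp: hypothetical small plugin that the host application dlopen()s.
// It re-opens the real, threaded library with RTLD_GLOBAL and forwards the call.
#include <dlfcn.h>
#include <iostream>

// Hypothetical helper: load a library so that its symbols, and those of its
// dependencies such as libpthread, become part of the global lookup scope.
void* load_impl(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (!handle) {
        std::cerr << "Cannot open " << path << ": " << dlerror() << '\n';
    }
    return handle;
}

extern "C" void hello() {
    // "./libhello_impl.so" is an assumed name for the main library.
    static void* impl = load_impl("./libhello_impl.so");
    if (!impl) {
        return;
    }

    dlerror();  // reset errors
    // "hello_impl" is an assumed extern "C" entry point in the main library.
    auto real_hello = reinterpret_cast<void (*)()>(dlsym(impl, "hello_impl"));
    if (dlerror() == nullptr && real_hello != nullptr) {
        real_hello();
    }
}
```

The host keeps calling `dlopen` and `dlsym(handle, "hello")` exactly as before; only the stub knows about the second library.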
In parallel one should file a bug against libstdc++ and wait for a fix. There's no reason why it should be broken like that.
If none of the above are viable options, then the only solution seems to involve a complete isolation between the caller and the multithreaded module. Make the multithreaded module a separate executable, fork-exec it from your plugin, marshal arguments/results to/from it via pipes.
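As an illustration of the isolation approach, here is a minimal POSIX sketch that runs a separate program and collects its standard output through a pipe. The helper name `run_helper` is hypothetical, and a real plugin would also need to marshal its arguments into the child, which is omitted here.

```cpp
#include <stdexcept>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run an external program (args[0] is looked up in PATH)
// and return everything it wrote to stdout.
std::string run_helper(const std::vector<std::string>& args) {
    if (args.empty()) {
        throw std::invalid_argument("no program given");
    }

    // Build argv before forking so the child does as little work as possible.
    std::vector<char*> argv;
    for (const auto& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);

    int fds[2];
    if (pipe(fds) != 0) {
        throw std::runtime_error("pipe failed");
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        throw std::runtime_error("fork failed");
    }
    if (pid == 0) {
        // Child: redirect stdout into the pipe, then become the helper program.
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if exec failed
    }

    // Parent: read until the child closes its end of the pipe.
    close(fds[1]);
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        throw std::runtime_error("helper failed");
    }
    return out;
}
```

Because the threaded code lives in its own process, it can be linked with -pthread regardless of how the host application was built.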
Finally, there's always the ugly workaround of preloading libpthread in the caller program.
I can offer some background as to why there is a segfault, but unfortunately no solution.
It seems that this is an issue with libstdc++: technically this huge monolithic library depends on libpthread, but for good reasons it is not linked against libpthread. Now, in order to be able to load libstdc++ from programs that don't use threads at all, the missing symbols (e.g. pthread_create) must come from somewhere. So libstdc++ references them as weak symbols.
These weak symbols are also used to detect at runtime whether libpthread is actually loaded. For the old ABI there was even a check in _M_start_thread that threw a meaningful exception if pthread was not loaded, instead of calling through a weak symbol that resolved to nullptr - something I would not wish upon my worst enemy.
Unfortunately, that run-time check got lost for the new ABI. Instead, there is a link-time check for pthread_create: the code that calls _M_start_thread creates a dependency on it by passing a pointer to pthread_create into this function. Unfortunately, that pointer is discarded, and the weak reference, which is still nullptr, is used instead.
Now something during linking/loading causes the weak pthread_create not to be overridden in your problematic case. I am unsure about the exact resolution rules that apply there - I assume it has to do with libstdc++ being already fully loaded when libpthread is being loaded. I would be glad if an additional answer clarified that. Unfortunately, there also seems to be no generally viable fix other than linking the main application with -lpthread or running it with LD_PRELOAD=libpthread.so (which I wouldn't really recommend).
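To make the weak-symbol mechanism concrete, here is a small sketch of a run-time check in that spirit. It is not libstdc++'s actual code: the names `weak_pthread_create`, `weak_pthread_join`, `threads_available` and `run_in_thread` are invented for illustration, and the `__weakref__` attribute is a GCC/Clang extension.

```cpp
#include <pthread.h>
#include <system_error>

// Weak references: if no loaded object provides these functions, their
// addresses resolve to null instead of causing a link or load error.
static __typeof(pthread_create) weak_pthread_create
    __attribute__((__weakref__("pthread_create")));
static __typeof(pthread_join) weak_pthread_join
    __attribute__((__weakref__("pthread_join")));

// Hypothetical helper: true if a real pthread_create is visible to this code.
bool threads_available() {
    return &weak_pthread_create != nullptr;
}

// Hypothetical helper: run fn(arg) in a new thread and wait for it. Throws a
// meaningful exception instead of jumping through a null function pointer.
void run_in_thread(void* (*fn)(void*), void* arg) {
    if (!threads_available()) {
        throw std::system_error(
            std::make_error_code(std::errc::operation_not_permitted),
            "thread support is not loaded");
    }

    pthread_t tid;
    int rc = weak_pthread_create(&tid, nullptr, fn, arg);
    if (rc != 0) {
        throw std::system_error(rc, std::generic_category(), "pthread_create");
    }
    rc = weak_pthread_join(tid, nullptr);
    if (rc != 0) {
        throw std::system_error(rc, std::generic_category(), "pthread_join");
    }
}
```

A caller would wrap `run_in_thread(...)` in a try/catch for `std::system_error`. Note that on newer glibc releases where the pthread functions live in libc itself, `threads_available()` would be expected to always return true.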
Once I link the executable against pthread or the library itself everything works fine. However, we are using a plugin based infrastructure, where we do not know if the application itself is linked against pthread or not. Therefore, it is not an option to link the executable always against pthread.
On the contrary: very few systems support an application becoming "suddenly multithreaded" (your system obviously doesn't).
If you need to support potentially multithreaded plugins, then you must start out multithread-ready, which is achieved by linking against libpthread, or more portably by adding the -pthread flag to the compile and link lines for the main executable.
Is it a problem of gcc, glibc, libstdc++, or the ld.so
It's a problem with libstdc++. GLIBC does support "suddenly multithreaded" execution, GCC isn't part of the runtime environment at all, and ld.so is part of GLIBC.