My question concerns the use of OpenMP in C++ functions stored in dynamic libraries. Let's consider the following code (in shared.cpp):
#include <omp.h>
#include <iostream>
extern "C" {
    int test() {
        int N = omp_get_max_threads();
        // Each thread prints its own id (output lines may interleave).
        #pragma omp parallel num_threads(N)
        {
            std::cout << omp_get_thread_num() << std::endl;
        }
        return 0;
    }
}
I compile this code using g++: g++ -fopenmp -shared -fPIC -o shared.so shared.cpp. Then, to use the test function, I have the following program (main.cpp):
#include <iostream>
#include <dlfcn.h>
int main() {
    void* handle = dlopen("./shared.so", RTLD_NOW);
    if (!handle) {
        std::cerr << "can not open shared.so" << std::endl;
        return 1;
    }
    int(*f)() = (int(*)()) dlsym(handle,"test");
    if (!f) {
        std::cerr << "can not find 'test' symbol in shared.so" << std::endl;
        return 1;
    }
    (*f)();
    if (dlclose(handle)) {
        std::cerr << "can not close shared.so" << std::endl;
        return 1;
    }
    return 0;
}
compiled with the command: g++ -o main main.cpp -ldl. The problem is that a segmentation fault occurs at the very end of the program's execution. According to valgrind, some threads are still active at that point, which seems consistent with OpenMP's behavior.
One solution (for C code) from this post is to compile the program with the gcc -fopenmp flag, but g++ seems smart enough to detect that OpenMP is never used in that program, so the OpenMP environment is never loaded (the assembly code of both versions is identical). The only workaround I've found is to make a useless call to OpenMP in the program, which forces g++ to load the OpenMP environment; the execution is then correct. But to me this workaround is quite ugly. I've tried g++-4.8.2, g++-4.8.1, g++-4.7.3 and g++-4.6.4. (With icc-14, using the -openmp option on the program actually fixes the problem.)
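The "useless call" workaround mentioned above could look roughly like the sketch below. The helper name force_openmp_runtime is hypothetical (it does not appear in the original program), and the preprocessor guard is only there so the file still builds when -fopenmp is omitted. This is one possible shape of the workaround, not a recommended fix.

```cpp
// Hypothetical helper (not part of the original main.cpp): makes one
// harmless OpenMP call so that the main program itself references the
// OpenMP runtime instead of only loading it indirectly through dlopen().
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the maximum number of OpenMP threads when built with -fopenmp,
// or 0 when the program was built without OpenMP support.
int force_openmp_runtime() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 0;
#endif
}
```

Presumably it is enough to call force_openmp_runtime() once in main() of main.cpp and to build with g++ -fopenmp -o main main.cpp -ldl; the returned value itself is not needed.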
Has anyone ever faced this problem? Is there a cleaner workaround? Thanks, Thomas
Edit: Tried with g++-4.9.2: still failing.
I think you're seeing an issue with libgomp, the OpenMP runtime library of GCC. Try linking with it: g++ -o main main.cpp -ldl -lgomp and your segfault will be gone.
libgomp has some internal state that is initialized on the first OpenMP call. For some reason, the de-initialization doesn't happen if you dynamically load an OpenMP library. It sounds like a bug to me.
The Intel compiler has its own OpenMP runtime (libiomp5), which does not have this problem.