For context: I have a Java project that is partially implemented with two JNI libraries. For the sake of example, libbar.so depends on libfoo.so. If these were system libraries,
System.loadLibrary("bar");
would do the trick. But since they're custom libraries I'm shipping with my JAR, I have to do something like
System.load("/path/to/libfoo.so");
System.load("/path/to/libbar.so");
libfoo needs to go first because otherwise libbar can't find it, as it's not in the system library search path.
This has been working well for a while, but I've now run into an issue where std::any_cast is throwing std::bad_any_cast despite the types being correct. I tracked it down to the fact that each library has its own copy of the typeinfo for that type, and the copies are not being merged at runtime. This seems to be because System.load() ends up invoking dlopen() with RTLD_LOCAL rather than RTLD_GLOBAL.
I wrote this to demonstrate the behaviour without needing JNI:
foo.hpp
class foo { };

extern "C" const void* libfoo_foo_typeinfo();
foo.cpp
#include "foo.hpp"

#include <typeinfo>

extern "C" const void* libfoo_foo_typeinfo() {
    return &typeid(foo);
}
bar.cpp
#include "foo.hpp"

#include <typeinfo>

extern "C" const void* libbar_foo_typeinfo() {
    return &typeid(foo);
}
main.cpp
#include <iostream>
#include <typeinfo>

#include <dlfcn.h>

int main() {
    void* libfoo = dlopen("./libfoo.so", RTLD_NOW | RTLD_LOCAL);
    void* libbar = dlopen("./libbar.so", RTLD_NOW | RTLD_LOCAL);

    auto libfoo_fn = reinterpret_cast<const void* (*)()>(
        dlsym(libfoo, "libfoo_foo_typeinfo"));
    auto libbar_fn = reinterpret_cast<const void* (*)()>(
        dlsym(libbar, "libbar_foo_typeinfo"));

    auto libfoo_ti = static_cast<const std::type_info*>(libfoo_fn());
    auto libbar_ti = static_cast<const std::type_info*>(libbar_fn());

    std::cout << std::boolalpha
              << (libfoo_ti == libbar_ti) << "\n"
              << (*libfoo_ti == *libbar_ti) << "\n";

    return 0;
}
Makefile
all: libfoo.so libbar.so main

libfoo.so: foo.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -o $@

libbar.so: bar.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -L. -lfoo -o $@

main: main.cpp
	$(CXX) $^ -ldl -o $@
On my system, I get
$ make
...
$ ./main
false
true
This is because even though the typeinfo addresses are different, GCC's libstdc++ uses the mangled names for equality. On LLVM's libc++, for example, equality is based on the typeinfo address itself, so I get:
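To make the difference between the two comparison strategies concrete, here is a minimal sketch of a name-based comparison that behaves the same way regardless of which standard library is in use. The helper name same_type_by_name is hypothetical, and the sketch assumes that type_info::name() returns the mangled name (as it does with GCC and Clang on Itanium-ABI platforms) and that the types involved have external linkage.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical helper (not part of any library): treats two type_info
// objects as equal if they are the same object or carry the same name.
// Assumption: name() yields the mangled name, which is unique per type
// for types with external linkage on Itanium-ABI toolchains.
inline bool same_type_by_name(const std::type_info& a,
                              const std::type_info& b) {
    if (&a == &b) {
        return true;  // fast path: same typeinfo object
    }
    return std::strcmp(a.name(), b.name()) == 0;
}
```

In the demo above, same_type_by_name(*libfoo_ti, *libbar_ti) would be expected to hold even where operator== does not. It does not fix std::any_cast itself, which performs its own comparison internally.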
$ make CXX="clang++ -stdlib=libc++"
$ ./main
false
false
If I pass RTLD_GLOBAL instead, I see
true
true
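For reference, the RTLD_GLOBAL variant only changes the flag passed to dlopen(). A minimal sketch follows; the helper name open_global is hypothetical, and the error reporting through dlerror() is an addition that the original main.cpp does not have.

```cpp
#include <cstdio>

#include <dlfcn.h>

// Hypothetical helper: opens a shared library with RTLD_GLOBAL so that
// its symbols (including weak typeinfo symbols) take part in symbol
// resolution for libraries loaded afterwards.
// Returns nullptr and prints the dlerror() message on failure.
inline void* open_global(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_GLOBAL);
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}
```

In main.cpp this would replace the two dlopen() calls, for example open_global("./libfoo.so") followed by open_global("./libbar.so"). On older glibc versions the program still needs to be linked with -ldl, as in the Makefile.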
And if I edit main.cpp to load libbar.so first, it also works, provided I tell it where it can find libfoo.so:
$ LD_LIBRARY_PATH=. ./main
true
true
But for the reasons described at the top of this post, neither of these is a practical workaround.
This is very similar to https://github.com/android-ndk/ndk/issues/533 but with non-dynamic types, so there's no way to add a "key function" to force the typeinfo to be a strong symbol. I happened to reproduce the problem on Android first, but it isn't Android-specific.
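For readers unfamiliar with the term, the sketch below contrasts a polymorphic type that has a key function with a non-polymorphic type like foo. The type name widget is hypothetical. Under the Itanium C++ ABI, the key function is the first non-pure, non-inline virtual function, and the vtable and typeinfo are emitted only in the translation unit that defines it.

```cpp
#include <type_traits>
#include <typeinfo>

// Polymorphic type (hypothetical example). The destructor is declared
// in the class but defined out of line, so it acts as the key function.
struct widget {
    virtual ~widget();
    int value = 0;
};

// In a real project this definition would live in exactly one .cpp
// file of one library; the typeinfo for widget is emitted there only.
widget::~widget() = default;

// Non-polymorphic type, as in the question: it has no virtual
// functions, so there is no key function to anchor its typeinfo, and
// every translation unit that uses typeid(foo) may emit its own copy.
struct foo {};

static_assert(std::is_polymorphic<widget>::value, "widget has a vtable");
static_assert(!std::is_polymorphic<foo>::value, "foo has no vtable");
```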
No, that is not possible. RTLD_LOCAL seeks to prevent exactly that, and unfortunately must be used for System.loadLibrary, since otherwise bad things will happen if you System.loadLibrary two libraries that each define a different foo class.