There is a mystery I am trying to understand:
I have made an application that can be extended with dynamic libraries. These libraries contain code that needs to call functions defined in the application itself. To make it clear:
I have an application, let's call it APP, and an extension, EXT. APP is extended with features implemented in EXT, but EXT needs to call functions defined in APP in order to "hook" into it (for example, to register new items in the APP layout). On MS Windows I couldn't build EXT because of unresolved symbols. That makes sense: how could I call functions that live in APP without anything to link against? So I created a DLL of APP, which is basically APP built as a DLL with every function I need exported using __declspec(dllexport) (let's call it LIB). It works like this:
APP loads EXT, and EXT calls APP functions through LIB. It's a somewhat nasty solution, but I couldn't think of a better one. And most importantly, it works perfectly.
Now what drives me mad is: how come all of this works on Linux without having to create LIB? The Windows approach is nasty, but it makes perfect sense. On Linux, however, I can build EXT without building APP or LIB first; the linker just seems to ignore the unresolved symbols and links it anyway. The library does contain these unresolved references, which I can verify by calling:
ld: warning: cannot find entry symbol _start; not setting start address
libhuggle_md.so: undefined reference to `Huggle::Query::NetworkManager'
libhuggle_md.so: undefined reference to `Huggle::Syslog::HuggleLogs'
libhuggle_md.so: undefined reference to `Huggle::Core::HuggleCore'
libhuggle_md.so: undefined reference to `Huggle::QueryPool::HugglePool'
libhuggle_md.so: undefined reference to `Huggle::Localizations::HuggleLocalizations'
libhuggle_md.so: undefined reference to `Huggle::Configuration::HuggleConfiguration'
libhuggle_md.so: undefined reference to `Huggle::GC::gc'
libhuggle_md.so: undefined reference to `Huggle::WikiUser::WikiUser(QString)'
libhuggle_md.so: undefined reference to `Huggle::WikiUtil::MessageUser(Huggle::WikiUser*, QString, QString, QString, bool, Huggle::Query*, bool, bool, bool, QString, bool, bool)'
So you can see that EXT refers to some functions of APP, but it was never linked against any library that implements them. They are simply unresolved.
When I load EXT in APP, some magic happens inside the kernel and everything just works. Why doesn't APP on Linux need LIB while Windows does? Why is it possible on Linux to link something with unresolved external symbols? How does it know which symbols I am referring to? Does it find them in APP and resolve them at runtime?
For anyone interested, here is the complete source: https://github.com/huggle/huggle3-qt-lx
If you clone it on Linux and run ./configure --extension and then make, you will see that it first builds one of the extensions (even though there is nothing to link it against) and then builds the application. If you then run make install and start the application, you will see that it loads just fine and, using some magic, fixes the unresolved symbols in the library at runtime. How does this work? And why doesn't it work on Windows?
dlopen(): The function dlopen() loads the dynamic shared object (shared library) file named by the null-terminated string filename and returns an opaque "handle" for the loaded object. This handle is employed with other functions in the dlopen API, such as dlsym(3), dladdr(3), dlinfo(3), and dlclose().
When a program linked with shared libraries runs, program execution does not immediately start with that program's first statement. Instead, the operating system loads and executes the dynamic linker (usually called ld.so), which then scans the list of library names embedded in the executable.
An unresolved external symbol is a linker error indicating that the linker could not find a definition for a referenced symbol during the linking process.
A successful dlopen() returns a handle which the caller may use on subsequent calls to dlsym() and dlclose(). The value of this handle should not be interpreted in any way by the caller. The file argument is used to construct a pathname to the object file.
I think it is related to the ELF format used for executables and libraries on Linux (and many other *NIXes) and to the dynamic linker.
When a dynamically linked program is started (its process is created), the dynamic linker prepares the address space of this process. Linux shared libraries are normally compiled as PIC (Position Independent Code), so they can be placed anywhere in the process address space. Calls between functions in different modules are resolved at runtime via the PLT (Procedure Linkage Table) and the GOT (Global Offset Table). The PLT (a read-only, executable section) holds indirect jump instructions to addresses stored in the GOT (a read-write, non-executable section). With lazy binding, the first call to a function through the PLT jumps to a runtime linker routine, which updates the GOT entry (and then jumps to the real address). Subsequent calls to the same function jump directly to it.
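To make the lazy binding idea more concrete, here is a minimal sketch in plain C that imitates it with an ordinary function pointer. It is only an analogy, not how the real PLT and GOT are laid out: the names got_add, plt_add, resolver_stub and real_add are made up for illustration, and the real resolver lives in the dynamic linker rather than in your program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a function that lives in another module. */
static int real_add(int a, int b) { return a + b; }

typedef int (*add_fn)(int, int);

static int resolver_stub(int a, int b);

/* One "GOT" slot: writable data that initially points at the resolver. */
static add_fn got_add = resolver_stub;

/* Counts how often the slow resolution path was taken. */
static int resolve_count = 0;

/* Plays the role of the runtime linker: patch the slot, then forward the call. */
static int resolver_stub(int a, int b) {
    resolve_count++;
    got_add = real_add;       /* later calls skip the resolver entirely */
    return real_add(a, b);
}

/* One "PLT" entry: code that always jumps indirectly through the GOT slot. */
int plt_add(int a, int b) {
    assert(got_add != NULL);
    return got_add(a, b);
}

/* Lets a caller observe that resolution happened only once. */
int resolutions(void) { return resolve_count; }
```

Calling plt_add() twice goes through the resolver only on the first call; after that the slot already holds the real address, which is the same effect the paragraph above describes for the GOT.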
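As I understand it, the compiler has enough information (function prototypes and other data from header files) to build the library correctly, and by default the linker allows a shared library to keep undefined symbols, leaving them for the dynamic linker to resolve when the library is loaded. To build an executable, however, you have to provide all required libraries (yet at runtime you can swap the libraries used, as long as they provide all the functions used). When APP loads EXT at runtime, the dynamic linker looks up EXT's undefined symbols in the objects already loaded into the process, including APP itself, provided APP exports them in its dynamic symbol table (for example, when it is linked with -rdynamic).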
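A small way to see that the executable itself takes part in symbol lookup is to ask the dynamic linker for a handle to the main program and search it by name. The sketch below assumes a Linux/glibc system; host_register_item and lookup_in_host are hypothetical names invented for this example and have nothing to do with the Huggle sources.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical host function that an extension would want to call. */
int host_register_item(int id) { return id + 1; }

/* Look a name up starting from the main program, followed by the shared
 * objects that were loaded at program startup. Returns NULL if not found. */
void *lookup_in_host(const char *name) {
    assert(name != NULL);
    void *self = dlopen(NULL, RTLD_NOW);  /* NULL means "the main program" */
    if (self == NULL)
        return NULL;
    void *addr = dlsym(self, name);
    dlclose(self);                        /* drops the extra reference only */
    return addr;
}
```

If the program is built with something like gcc -rdynamic host.c -ldl, lookup_in_host("host_register_item") should return the address of the function, because -rdynamic puts the executable's global symbols into its dynamic symbol table. Without that flag the lookup will typically return NULL, and an extension referring to the same name would fail to load with an undefined symbol error. On recent glibc versions -ldl is no longer needed, but it is harmless.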
I assume dynamic linking works like this on other UNIX-like OSes that use the ELF format.
I'm not very familiar with the Windows executable format, so I can't comment on why a similar trick doesn't work there.