How does dynamic linking work generally?
On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link... What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?
Relatedly, on *nix, you don't need a lib file... How how does the compiler know that the methods described in the header will be available at runtime?
As a newbie, when you think about either one of the two schemes, then the other, neither of them make sense...
Dynamic Links are smart URLs that allow you to send existing and potential users to any location within your iOS or Android app. They survive the app install process, so even new users see the content they're looking for when they open the app for the first time. Dynamic Links are no-cost forever , for any scale.
Dynamic linking is the most common method, especially on Linux systems. Dynamic linking keeps libraries modular, so just one library can be shared between any number of applications. Modularity also allows a shared library to be updated independently of the applications that rely upon it.
The main difference between static and dynamic linking is that static linking copies all library modules used in the program into the final executable file at the final step of the compilation while, in dynamic linking, the linking occurs at run time when both executable files and libraries are placed in the memory.
The dynamic linker is the program that manages shared dynamic libraries on behalf of an executable. It works to load libraries into memory and modify the program at runtime to call the functions in the library.
To answer your questions one by one:
Dynamic linking defers part of the linking process to runtime.
It can be used in two ways: implicitly and explicitly.
Implicitly, the static linker will insert information into the
executable which will cause the library to load and resolve the
necessary symbols. Explicitly, you must call LoadLibrary
or
dlopen
manually, and then GetProcAddress
/dlsym
for each
symbol you need to use. Implicit loading is used for things
like the system library, where the implementation will depend on
the version of the system, but the interface is guaranteed.
Explicit loading is used for things like plug-ins, where the
library to be loaded will be determined at runtime.
The .lib
file is only necessary for implicit loading. It
contains the information that the library actually provides this
symbol, so the linker won't complain that the symbol is
undefined, and it tells the linker in what library the symbols
are located, so it can insert the necessary information to cause
this library to automatically be loaded. All the header files
tell the compiler is that the symbols will exist, somewhere; the
linker needs the .lib
to know where.
Under Unix, all of the information is extracted from the
.so
. Why Windows requires two separate files, rather than
putting all of the information in one file, I don't know; it's
actually duplicating most of the information, since the
information needed in the .lib
is also needed in the .dll
.
(Perhaps licensing issues. You can distribute your program with
the .dll
, but no one can link against the libraries unless
they have a .lib
.)
The main thing to retain is that if you want implicit loading,
you have to provide the linker with the appropriate information,
either with a .lib
or a .so
file, so that it can insert that
information into the executable. And that if you want explicit
loading, you can't refer to any of the symbols in the library
directly; you have to call GetProcAddress
/dlsym
to get their
addresses yourself (and do some funny casting to use them).
The .lib
file on Windows is not required for loading a dynamic library, it merely offers a convenient way of doing so.
In principle, you can use LoadLibrary
for loading the dll and then use GetProcAddress
for accessing functions provided by that dll. The compilation of the enclosing program does not need to access the dll in that case, it is only needed at runtime (ie. when LoadLibrary
actually executes). MSDN has a code example.
The disadvantage here is that you need to manually write code for loading the functions from the dll. In case you compiled the dll yourself in the first place, this code simply duplicates knowledge that the compiler could have extracted from the dll source code automatically (like the names and signatures of exported functions).
This is what the .lib
file does: It contains the GetProcAddress
calls for the Dlls exported functions, generated by the compiler so you don't have to worry about it. In Windows terms, this is called Load-Time Dynamic Linking, since the Dll is loaded automatically by the code from the .lib file when your enclosing program is loaded (as opposed to the manual approach, referred to as run-time dynamic linking).
How does dynamic linking work generally?
The dynamic link library (aka shared object) file contains machine code instructions and data, along with a table of metadata saying which offsets in that code/data relate to which "symbols", the type of the symbol (e.g. function vs data), the number of bytes or words in the data, and a few other things. Different OS will tend to have different shared object file formats, and indeed the same OS may support several, but that's the gist of it.
So, imagine the shared library's a big chunk of bytes with an index like this:
SYMBOL ADDRESS TYPE SIZE
my_function 1000 function 2893
my_number 4800 variable 4
In general, the exact type of the symbols need not be captured in the metadata table - it's expected that declarations in the library's header files contain all the missing information. C++ is a bit special - compared to say C - because overloading can mean there are several functions with the same name, and namespaces allow for further symbols that would otherwise be ambiguously named - for that reason name mangling is typically used to concatenate some representation of the namespace and function arguments to the function name, forming something that can be unique in the library object file.
A program wanting to use the shared object can generally do one of two things:
have the OS load both itself and the shared object around the same time (before executing main()
), with the OS Loader responsible for finding the symbols and examining metadata in the program file image about the use of those symbols, then patching in symbol addresses in the memory the program uses, such that the program can then just run and work functionally as if it'd known about the symbol addresses when it was first compiled (but perhaps a little slower)
or, explicitly in its own source code call dlopen
sometime after main
runs, then use dlsym
or similar to get the symbol addresses, save them into (function/data) pointers based on the programmer's knowledge of the expected data types, then call them explicitly using the pointers.
On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link...
That doesn't sound right. Should be one or the other I'd think.
Wtf does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?
A lib file is - at this level of description - pretty much the same as a shared object file... the main difference is that the compiler's finding the symbol addresses before the program's shipped and run.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With