C/C++ How Does Dynamic Linking Work On Different Platforms?

Tags: c++, c, compilation, dynamic-linking, loadlibrary

How does dynamic linking work generally?

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link... What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

Relatedly, on *nix, you don't need a lib file... How does the compiler know that the methods described in the header will be available at runtime?

As a newbie, when you think about either one of the two schemes, then the other, neither of them makes sense...

457

asked Apr 04 '14 10:04

Charlie

3 Answers

To answer your questions one by one:

Dynamic linking defers part of the linking process to runtime. It can be used in two ways: implicitly and explicitly. Implicitly, the static linker will insert information into the executable which will cause the library to load and resolve the necessary symbols. Explicitly, you must call LoadLibrary or dlopen manually, and then GetProcAddress/dlsym for each symbol you need to use. Implicit loading is used for things like the system library, where the implementation will depend on the version of the system, but the interface is guaranteed. Explicit loading is used for things like plug-ins, where the library to be loaded will be determined at runtime.
The .lib file is only necessary for implicit loading. It tells the linker that the library actually provides a given symbol, so the linker won't complain that the symbol is undefined, and it tells the linker which library the symbols are located in, so it can insert the information needed to load that library automatically. All the header files tell the compiler is that the symbols will exist, somewhere; the linker needs the .lib to know where.
Under Unix, all of the information is extracted from the .so. Why Windows requires two separate files, rather than putting all of the information in one file, I don't know; it's actually duplicating most of the information, since the information needed in the .lib is also needed in the .dll. (Perhaps licensing issues. You can distribute your program with the .dll, but no one can link against the libraries unless they have a .lib.)

The main thing to remember is that if you want implicit loading, you have to provide the linker with the appropriate information, either with a .lib or a .so file, so that it can insert that information into the executable. And if you want explicit loading, you can't refer to any of the symbols in the library directly; you have to call GetProcAddress/dlsym to get their addresses yourself (and do some funny casting to use them).
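
To make the explicit route concrete, here is a minimal sketch of a wrapper over LoadLibrary/dlopen and GetProcAddress/dlsym. The helper names (open_library, find_symbol, close_library, call_from_library) and the UnaryMath signature are made up for illustration; only the OS calls themselves are real APIs. The reinterpret_cast is the "funny casting" mentioned above.

```cpp
#if defined(_WIN32)
  #include <windows.h>
  using LibHandle = HMODULE;
#else
  #include <dlfcn.h>
  using LibHandle = void*;
#endif

// Hypothetical helper: load a library by path, returning a null handle on failure.
inline LibHandle open_library(const char* path) {
#if defined(_WIN32)
    return LoadLibraryA(path);
#else
    return dlopen(path, RTLD_NOW);
#endif
}

// Hypothetical helper: look up one symbol by name; the result is untyped.
inline void* find_symbol(LibHandle lib, const char* name) {
#if defined(_WIN32)
    return reinterpret_cast<void*>(GetProcAddress(lib, name));
#else
    return dlsym(lib, name);
#endif
}

// Hypothetical helper: release the library again.
inline void close_library(LibHandle lib) {
#if defined(_WIN32)
    FreeLibrary(lib);
#else
    dlclose(lib);
#endif
}

// The signature we *assume* the symbol has. Nothing checks this for us:
// the header (or documentation) is the only source of truth.
using UnaryMath = double (*)(double);

// Loads `path`, looks up `symbol`, calls it with `arg`.
// Returns false if the library or the symbol could not be found.
inline bool call_from_library(const char* path, const char* symbol,
                              double arg, double& result) {
    LibHandle lib = open_library(path);
    if (!lib) return false;

    // The "funny casting": turn the untyped address into a callable pointer.
    auto fn = reinterpret_cast<UnaryMath>(find_symbol(lib, symbol));
    const bool found = (fn != nullptr);
    if (found) result = fn(arg);

    close_library(lib);
    return found;
}
```

On a glibc-based Linux system one might call call_from_library("libm.so.6", "cos", 0.0, result); the library name is an assumption about the platform, and older toolchains may need -ldl at link time.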

197

answered Oct 11 '22 01:10

James Kanze

The .lib file on Windows is not required for loading a dynamic library; it merely offers a convenient way of doing so.

In principle, you can use LoadLibrary to load the dll and then use GetProcAddress to access the functions it provides. The compilation of the enclosing program does not need access to the dll in that case; the dll is only needed at runtime (i.e. when LoadLibrary actually executes). MSDN has a code example.

The disadvantage here is that you need to manually write code for loading the functions from the dll. In case you compiled the dll yourself in the first place, this code simply duplicates knowledge that the compiler could have extracted from the dll source code automatically (like the names and signatures of exported functions).

This is what the .lib file is for. It doesn't literally contain GetProcAddress calls; it is an import library, produced by the linker when the Dll is built, that holds a small stub and an import record for each of the Dll's exported functions. Linking against it puts those records into your executable's import table, so the Windows loader loads the Dll and fills in the function addresses for you. In Windows terms, this is called Load-Time Dynamic Linking, since the Dll is loaded automatically when your enclosing program is loaded (as opposed to the manual approach, referred to as run-time dynamic linking).
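
For contrast with the manual approach, here is a rough sketch of what source code looks like under load-time linking. The library name "mylib", the macros MYLIB_API and MYLIB_BUILD, and the functions my_function and client_code are all hypothetical; the export/import macro is a common convention, not something the toolchain mandates. To keep the sketch self-contained, the "library" and the "client" share one file, which is why MYLIB_BUILD is defined here.

```cpp
// Pretend this translation unit is the library's own source.
// A real client would NOT define MYLIB_BUILD and would link against mylib.lib.
#define MYLIB_BUILD 1

#if defined(_WIN32)
  #if defined(MYLIB_BUILD)
    #define MYLIB_API __declspec(dllexport)  // building the DLL: export the symbol
  #else
    #define MYLIB_API __declspec(dllimport)  // using the DLL: import the symbol
  #endif
#elif defined(__GNUC__)
  #define MYLIB_API __attribute__((visibility("default")))
#else
  #define MYLIB_API
#endif

// What the library's header would declare.
extern "C" MYLIB_API int my_function(int x);

// What the library's source would define.
extern "C" MYLIB_API int my_function(int x) { return x * 2; }

// What client code looks like: an ordinary call, with no LoadLibrary or
// GetProcAddress in sight. Resolving the address is left to the linker
// and the OS loader.
int client_code(int x) { return my_function(x) + 1; }
```

The point of the sketch is the last function: with load-time linking, the client's source is indistinguishable from code that calls a statically linked function.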

answered Oct 11 '22 01:10

ComicSansMS

How does dynamic linking work generally?

The dynamic link library (aka shared object) file contains machine code instructions and data, along with a table of metadata saying which offsets in that code/data relate to which "symbols", the type of the symbol (e.g. function vs data), the number of bytes or words in the data, and a few other things. Different OSes tend to have different shared object file formats, and indeed the same OS may support several, but that's the gist of it.

So, imagine the shared library's a big chunk of bytes with an index like this:

SYMBOL       ADDRESS        TYPE        SIZE
my_function  1000           function    2893
my_number    4800           variable    4
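
As a toy model only (real formats such as ELF or PE are considerably more involved), the index above could be pictured as an array of records searched by name. The type and function names here (SymbolEntry, SymbolType, kSymbolTable, lookup_symbol) are invented for the illustration.

```cpp
#include <cstddef>
#include <cstring>

enum class SymbolType { Function, Variable };

// One row of the index: where a named symbol lives inside the library image.
struct SymbolEntry {
    const char* name;
    std::size_t address;  // offset into the library's bytes
    SymbolType  type;
    std::size_t size;     // size in bytes
};

// The two rows from the table above.
static const SymbolEntry kSymbolTable[] = {
    {"my_function", 1000, SymbolType::Function, 2893},
    {"my_number",   4800, SymbolType::Variable, 4},
};

// Linear search by name; returns nullptr when the symbol is not exported.
const SymbolEntry* lookup_symbol(const char* name) {
    for (const SymbolEntry& entry : kSymbolTable) {
        if (std::strcmp(entry.name, name) == 0) return &entry;
    }
    return nullptr;
}
```

Note that the record says nothing about parameter or return types, which is exactly the gap the header files fill.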

In general, the exact type of the symbols need not be captured in the metadata table - it's expected that declarations in the library's header files contain all the missing information. C++ is a bit special - compared to say C - because overloading can mean there are several functions with the same name, and namespaces allow for further symbols that would otherwise be ambiguously named - for that reason name mangling is typically used to concatenate some representation of the namespace and function arguments to the function name, forming something that can be unique in the library object file.
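
To see mangling in action, the sketch below decodes a few mangled names back into readable form. It assumes a GCC- or Clang-style toolchain using the Itanium C++ ABI, where abi::__cxa_demangle from <cxxabi.h> is available; MSVC uses a different scheme and does not provide this header. The helper name demangle and the example functions named in the comments are made up.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Under the Itanium C++ ABI (an assumption about the toolchain):
//   int    add(int, int)            is emitted as  _Z3addii
//   double add(double, double)      is emitted as  _Z3adddd
//   namespace math { int add(int, int); }  as      _ZN4math3addEii
// The overloads and the namespace each get a distinct symbol name.

// Hypothetical helper: turn a mangled symbol name back into a declaration.
// Returns an empty string if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null buffer arguments, the result is malloc'ed and must be freed.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) return std::string();
    std::string result(text);
    std::free(text);
    return result;
}
```

This is also why functions meant to be found with dlsym or GetProcAddress are often declared extern "C": that suppresses the mangling, so the symbol name is simply the function name.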

A program wanting to use the shared object can generally do one of two things:

have the OS load both itself and the shared object around the same time (before executing main()), with the OS Loader responsible for finding the symbols and examining metadata in the program file image about the use of those symbols, then patching in symbol addresses in the memory the program uses, such that the program can then just run and work functionally as if it'd known about the symbol addresses when it was first compiled (but perhaps a little slower)
or, explicitly in its own source code call dlopen sometime after main runs, then use dlsym or similar to get the symbol addresses, save them into (function/data) pointers based on the programmer's knowledge of the expected data types, then call them explicitly using the pointers.
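
The first option, where the loader patches addresses in before the program runs, can be pictured with a toy model like the one below. This is only a simulation inside one program: the names (Export, ImportSlot, bind_imports, call_import, and the two lib_ functions) are invented, and a real loader works on file-format structures rather than C++ arrays.

```cpp
#include <cstddef>
#include <cstring>

using IntFn = int (*)(int);

// --- "Library" side: code plus a table of exported names -----------------
static int lib_double_it(int x) { return 2 * x; }
static int lib_negate(int x)    { return -x; }

struct Export { const char* name; IntFn address; };
static const Export kExports[] = {
    {"double_it", lib_double_it},
    {"negate",    lib_negate},
};

// --- "Program" side: slots that name what it needs, addresses unknown ----
struct ImportSlot { const char* name; IntFn address; };
static ImportSlot g_imports[] = {
    {"negate",    nullptr},
    {"double_it", nullptr},
};

// --- "Loader": match each import against the exports and patch it in -----
// Returns false if some import cannot be resolved, much as a real loader
// refuses to start a program whose required symbol is missing.
bool bind_imports() {
    for (ImportSlot& slot : g_imports) {
        slot.address = nullptr;
        for (const Export& exported : kExports) {
            if (std::strcmp(slot.name, exported.name) == 0) {
                slot.address = exported.address;
                break;
            }
        }
        if (slot.address == nullptr) return false;
    }
    return true;
}

// The program calls through the patched slot, paying one extra indirection.
int call_import(std::size_t index, int arg) {
    return g_imports[index].address(arg);
}
```

The extra indirection in call_import is one reason the text above says the result may be "a little slower" than a call whose address was fixed at link time.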

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link...

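That's true only for implicit (load-time) linking. If you call LoadLibrary and GetProcAddress yourself, the .dll at runtime is all you need; the .lib is required at link time only when you want the loader to resolve the symbols for you.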

What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

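A static .lib file is - at this level of description - pretty much the same as a shared object file... the main difference is that the linker's finding the symbol addresses before the program's shipped and run. An import .lib that accompanies a .dll holds no code to speak of; it just tells the linker which .dll provides each symbol.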

answered Oct 11 '22 02:10

Tony Delroy
