Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C/C++ How Does Dynamic Linking Work On Different Platforms?

How does dynamic linking work generally?

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link... What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

Relatedly, on *nix, you don't need a lib file... How how does the compiler know that the methods described in the header will be available at runtime?

As a newbie, when you think about either one of the two schemes, then the other, neither of them make sense...

like image 457
Charlie Avatar asked Apr 04 '14 10:04

Charlie


People also ask

How does the dynamic link works?

Dynamic Links are smart URLs that allow you to send existing and potential users to any location within your iOS or Android app. They survive the app install process, so even new users see the content they're looking for when they open the app for the first time. Dynamic Links are no-cost forever , for any scale.

How does dynamic linking work on Linux?

Dynamic linking is the most common method, especially on Linux systems. Dynamic linking keeps libraries modular, so just one library can be shared between any number of applications. Modularity also allows a shared library to be updated independently of the applications that rely upon it.

How dynamic linking is different from static linking in C?

The main difference between static and dynamic linking is that static linking copies all library modules used in the program into the final executable file at the final step of the compilation while, in dynamic linking, the linking occurs at run time when both executable files and libraries are placed in the memory.

What is dynamic linker in C?

The dynamic linker is the program that manages shared dynamic libraries on behalf of an executable. It works to load libraries into memory and modify the program at runtime to call the functions in the library.


3 Answers

To answer your questions one by one:

  • Dynamic linking defers part of the linking process to runtime. It can be used in two ways: implicitly and explicitly. Implicitly, the static linker will insert information into the executable which will cause the library to load and resolve the necessary symbols. Explicitly, you must call LoadLibrary or dlopen manually, and then GetProcAddress/dlsym for each symbol you need to use. Implicit loading is used for things like the system library, where the implementation will depend on the version of the system, but the interface is guaranteed. Explicit loading is used for things like plug-ins, where the library to be loaded will be determined at runtime.

  • The .lib file is only necessary for implicit loading. It contains the information that the library actually provides this symbol, so the linker won't complain that the symbol is undefined, and it tells the linker in what library the symbols are located, so it can insert the necessary information to cause this library to automatically be loaded. All the header files tell the compiler is that the symbols will exist, somewhere; the linker needs the .lib to know where.

  • Under Unix, all of the information is extracted from the .so. Why Windows requires two separate files, rather than putting all of the information in one file, I don't know; it's actually duplicating most of the information, since the information needed in the .lib is also needed in the .dll. (Perhaps licensing issues. You can distribute your program with the .dll, but no one can link against the libraries unless they have a .lib.)

The main thing to retain is that if you want implicit loading, you have to provide the linker with the appropriate information, either with a .lib or a .so file, so that it can insert that information into the executable. And that if you want explicit loading, you can't refer to any of the symbols in the library directly; you have to call GetProcAddress/dlsym to get their addresses yourself (and do some funny casting to use them).

like image 197
James Kanze Avatar answered Oct 11 '22 01:10

James Kanze


The .lib file on Windows is not required for loading a dynamic library, it merely offers a convenient way of doing so.

In principle, you can use LoadLibrary for loading the dll and then use GetProcAddress for accessing functions provided by that dll. The compilation of the enclosing program does not need to access the dll in that case, it is only needed at runtime (ie. when LoadLibrary actually executes). MSDN has a code example.

The disadvantage here is that you need to manually write code for loading the functions from the dll. In case you compiled the dll yourself in the first place, this code simply duplicates knowledge that the compiler could have extracted from the dll source code automatically (like the names and signatures of exported functions).

This is what the .lib file does: It contains the GetProcAddress calls for the Dlls exported functions, generated by the compiler so you don't have to worry about it. In Windows terms, this is called Load-Time Dynamic Linking, since the Dll is loaded automatically by the code from the .lib file when your enclosing program is loaded (as opposed to the manual approach, referred to as run-time dynamic linking).

like image 35
ComicSansMS Avatar answered Oct 11 '22 01:10

ComicSansMS


How does dynamic linking work generally?

The dynamic link library (aka shared object) file contains machine code instructions and data, along with a table of metadata saying which offsets in that code/data relate to which "symbols", the type of the symbol (e.g. function vs data), the number of bytes or words in the data, and a few other things. Different OS will tend to have different shared object file formats, and indeed the same OS may support several, but that's the gist of it.

So, imagine the shared library's a big chunk of bytes with an index like this:

SYMBOL       ADDRESS        TYPE        SIZE
my_function  1000           function    2893
my_number    4800           variable    4

In general, the exact type of the symbols need not be captured in the metadata table - it's expected that declarations in the library's header files contain all the missing information. C++ is a bit special - compared to say C - because overloading can mean there are several functions with the same name, and namespaces allow for further symbols that would otherwise be ambiguously named - for that reason name mangling is typically used to concatenate some representation of the namespace and function arguments to the function name, forming something that can be unique in the library object file.

A program wanting to use the shared object can generally do one of two things:

  • have the OS load both itself and the shared object around the same time (before executing main()), with the OS Loader responsible for finding the symbols and examining metadata in the program file image about the use of those symbols, then patching in symbol addresses in the memory the program uses, such that the program can then just run and work functionally as if it'd known about the symbol addresses when it was first compiled (but perhaps a little slower)

  • or, explicitly in its own source code call dlopen sometime after main runs, then use dlsym or similar to get the symbol addresses, save them into (function/data) pointers based on the programmer's knowledge of the expected data types, then call them explicitly using the pointers.

On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link...

That doesn't sound right. Should be one or the other I'd think.

Wtf does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?

A lib file is - at this level of description - pretty much the same as a shared object file... the main difference is that the compiler's finding the symbol addresses before the program's shipped and run.

like image 9
Tony Delroy Avatar answered Oct 11 '22 02:10

Tony Delroy