I have two headers and two cpp files:
//f1.h
int f1();
//f1.cpp
#include "f1.h"
int f1() {return 1;}
//f2.h
int f2();
//f2.cpp
#include "f2.h"
#include "f1.h"
int f2() {return f1() + 1;}
//main.cpp
#include "f2.h"
int main() {return f2();}
First I compile a shared object from f1 and f2 and create a binary from main.cpp depending on that shared object:
g++ -c -fPIC -shared f1.cpp f2.cpp
g++ -shared -fPIC -o libf.so f2.o f1.o
g++ -o dynamic main.cpp libf.so
Now I introduce some changes to f1.cpp (say f1 now returns 2):
//f1.cpp
#include "f1.h"
int f1() {return 2;}
And compile a binary as follows:
g++ -o semistatic main.cpp f1.cpp libf.so
The question is whether the 'semistatic' binary will use the definition of f1() from libf (in which f1 returns 1) or the statically linked symbol (in which f1 returns 2). Is this different across systems, and can I rely on it being consistent within a single system?
What is a symbol, and what is symbol visibility? Symbol is one of the basic terms when talking about object files, linking, and so on. In C/C++, a symbol is the entity that corresponds to most user-defined variables and function names, mangled with namespace, class/struct name, and so on.
--whole-archive. For each archive mentioned on the command line after the --whole-archive option, include every object file in the archive in the link, rather than searching the archive for the required object files.
In terms of both physical memory and disk-space usage, it is much more efficient to load the system libraries into memory only once. Dynamic linking allows this single loading to happen. Every dynamically linked program contains a small, statically linked function that is called when the program starts.
This static function only maps the link library (the dynamic linker) into memory and runs the code it contains. The link library then determines which dynamic libraries the program requires, along with the names of the variables and functions needed from those libraries, by reading the information contained in sections of the binary.
Library references are more efficient because the library procedures are statically linked into the program. Static linking increases the file size of your program, and it may increase the code size in memory if other applications, or other copies of your application, are running on the system.
As has been pointed out, you are violating the one-definition rule. This is not the end of the world, but in this case the C++ standard gives no guarantees about what will happen, and the behavior depends on implementation details of the linker and loader.
Tool-chains and operating systems differ considerably, so the above will not even link on Windows. But if you are speaking about Linux with the usual linker/loader pair, then the changed version will be used, and this will be the case on every Linux installation.
This follows from how the linker and loader work on Linux (the same behavior is widely used elsewhere, for example in the LD_PRELOAD trick):
Symbols defined in a *.so are treated as weak at link time, so the definition from the *.so is simply ignored if the linker finds another definition somewhere else (in your case in the updated version of f1.o).
When the shared object is loaded, the symbol f1 (because of name mangling it actually has a different name, but let's ignore that for the sake of simplicity) is already bound to the definition in the main program, which is therefore used whenever f1 is called inside the *.so.
However, this way of doing things is very brittle, and some minor changes can lead to a different result.
A: changing the visibility to hidden.
It's recommended to hide symbols which are not part of the public interface, i.e.
__attribute__ ((visibility ("hidden")))
int f1() {return 1;}
In this case the old definition is used, not the overriding one. The difference is that when the linker sees a call to a hidden symbol, it no longer leaves the address to be resolved by the loader but binds the call directly to the definition at hand. After that, there is no way to change which definition is called.
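As a minimal sketch of how the recommendation in case A is often applied across a whole library, the attribute can be wrapped in a pair of macros. The names LIBF_API and LIBF_LOCAL are hypothetical and not part of the original code; f1 and f2 are the functions from the question, placed in one file here for brevity.

```cpp
// Hypothetical helper macros (assumed names, GCC/Clang attribute syntax).
#if defined(__GNUC__) || defined(__clang__)
#  define LIBF_API   __attribute__((visibility("default")))
#  define LIBF_LOCAL __attribute__((visibility("hidden")))
#else
#  define LIBF_API
#  define LIBF_LOCAL
#endif

// Internal helper: not exported from libf.so. Calls to it from inside
// the library are bound when the library is linked, so a second f1 in
// the executable cannot take its place.
LIBF_LOCAL int f1() { return 1; }

// Public entry point: exported and callable from the executable.
LIBF_API int f2() { return f1() + 1; }
```

With this layout f2() returns 2 regardless of whether the executable defines its own f1. Building the library with -fvisibility=hidden makes hidden the default, so only the functions marked LIBF_API need an annotation.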
B: making f1 an inline function.
That would lead to really funny things, because some parts of the shared object would use the old version and other parts the new version.
Note that -fPIC prevents the inlining of functions that are not marked inline (at least with GCC), so the above holds only for functions that are explicitly marked inline.
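The lookup behavior described above, and the effect of hidden visibility from case A, can be sketched with a toy model. This is not how a real ELF loader is implemented; Module, resolve, and the two program-building helpers are made-up names that only mimic the rule "search the executable first, then the libraries in load order, and take the first exported definition".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Fn = int (*)();

int old_f1() { return 1; }  // stands for f1 as compiled into libf.so
int new_f1() { return 2; }  // stands for f1 as linked into the executable

// Toy stand-in for a loaded module and its exported symbols.
struct Module {
    std::string name;
    std::map<std::string, Fn> exported;
};

// Default lookup: walk the modules in load order (executable first)
// and return the first exported definition, or nullptr if none exists.
Fn resolve(const std::vector<Module>& load_order, const std::string& symbol) {
    for (const Module& m : load_order) {
        auto it = m.exported.find(symbol);
        if (it != m.exported.end()) return it->second;
    }
    return nullptr;
}

// f2 as it lives inside libf.so. If f1 is exported, the call goes through
// the lookup above; if f1 is hidden, the call was already bound to the
// library's own definition when libf.so was linked.
int f2(const std::vector<Module>& load_order, bool f1_hidden) {
    Fn f1 = f1_hidden ? &old_f1 : resolve(load_order, "f1");
    assert(f1 != nullptr);
    return f1() + 1;
}

// "dynamic": the executable defines no f1, only libf.so does.
std::vector<Module> dynamic_program() {
    Module exe{"dynamic", {}};
    Module lib{"libf.so", {}};
    lib.exported["f1"] = &old_f1;
    return {exe, lib};
}

// "semistatic": the executable carries the updated f1 as well.
std::vector<Module> semistatic_program() {
    Module exe{"semistatic", {}};
    exe.exported["f1"] = &new_f1;
    Module lib{"libf.so", {}};
    lib.exported["f1"] = &old_f1;
    return {exe, lib};
}
```

In this model f2(dynamic_program(), false) yields 2, f2(semistatic_program(), false) yields 3 because the executable's f1 is found first, and f2(semistatic_program(), true) yields 2 again because a hidden f1 never takes part in the lookup.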
In a nutshell: this trick can be used on Linux. However, in bigger projects you don't want the additional complexity, and you should try to stick to the simpler and more sustainable one-definition-rule framework.