
Does the linker give any preference to static symbols over dynamic symbols?

I have two headers and two cpp files:

//f1.h
int f1();

//f1.cpp
#include "f1.h"
int f1() {return 1;}

//f2.h
int f2();

//f2.cpp
#include "f2.h"
#include "f1.h"
int f2() {return f1() + 1;}

//main.cpp
#include "f2.h"
int main() {return f2();}

First I compile a shared object from f1 and f2 and create a binary from main.cpp depending on that shared object:

g++ -c -fPIC -shared f1.cpp f2.cpp
g++ -shared -fPIC -o libf.so f2.o f1.o
g++ -o dynamic main.cpp libf.so

Now I introduce some changes to f1.cpp (say f1 now returns 2):

//f1.cpp
#include "f1.h"
int f1() {return 2;}

And compile a binary as follows:

g++ -o semistatic main.cpp f1.cpp libf.so

The question is whether the 'semistatic' binary will use the definition of f1() from libf.so (in which f1 returns 1) or the statically linked one (in which f1 returns 2). Does this differ across systems, and can I rely on it being consistent within a single system?

asked Jun 28 '18 06:06 by senx



1 Answer

As has been pointed out, you are violating the one-definition rule. This is not the end of the world, but in this case the C++ standard gives no guarantees about what will happen, and the behavior depends on implementation details of the linker and loader.

Tool-chains and operating systems differ considerably, so the above will not even link on Windows. But if you are talking about Linux with the usual linker/loader pair, then the changed version will be used, and it will be the same on every Linux installation.

This is how the linker and loader work on Linux (and the behavior is widely relied on, for example in the LD_PRELOAD trick):

  • At link time, a definition in a *.so behaves like a weak one: the linker simply ignores it if it finds another definition somewhere else (in your case in the object file compiled from the updated f1.cpp).
  • At run time, the loader ignores a definition from the shared object if the symbol is already bound, i.e. another definition is already known. In your case the symbol f1 (because of name mangling it actually has a different name, but let's ignore that for the sake of simplicity) is already bound to the definition in the main program, and that definition is therefore also used when f1 is called from inside the *.so.

However, this way of doing things is very brittle and some minor changes can lead to a different result.

A: changing the visibility to hidden.

It's recommended to hide symbols which are not part of the public interface, i.e.

__attribute__ ((visibility ("hidden")))
int f1() {return 1;}

In this case the old version is used, not the overriding one. The difference is that when the linker sees a hidden symbol being used, it no longer leaves it to the loader to resolve the symbol's address but uses the address at hand directly. After that there is no way to change which definition is called.

B: making f1 an inline function.

That would lead to really funny things, because in some parts of the shared object the old version would be used and in other parts the new version.

-fPIC prevents the inlining of functions that are not marked inline, so the above applies only to functions that are explicitly marked inline.


In a nutshell: this trick can be used on Linux. However, in bigger projects you don't want the additional complexity, and you should try to stick to the simpler and more sustainable one-definition-rule framework.

answered Nov 15 '22 12:11 by ead