
Same symbols in different libraries and linking order

I have 2 libraries: test.1 and test.2. Both libraries contain a single global extern "C" void f(); function, with different implementations (just a cout for the test).

I did the following test:

Test 1 Dynamic linking:
If I add libtest.1.so and then libtest.2.so in the makefile of the executable and then call f(); in main, libtest.1.so->f() is called.
If I change the order in the makefile, libtest.2.so->f() is called

Test 2 Static linking:
Exactly the same happens with static libraries.

Test 3 Dynamic loading
As the library is manually loaded, everything works as expected.
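A minimal sketch of what Test 3 could look like, assuming a POSIX system with dlopen/dlsym. The helper name call_f_from and the error handling are illustrative and not taken from the original test. On older glibc versions this needs to be linked with -ldl.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <iostream>

// Both libraries export: extern "C" void f();
using f_type = void (*)();

// Hypothetical helper: loads the library at `path`, calls its f(), unloads it.
// Returns false if the library or the symbol cannot be found.
bool call_f_from(const char* path) {
    // RTLD_LOCAL keeps this library's symbols out of the global lookup scope.
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        std::cerr << "dlopen failed: " << (err ? err : "unknown error") << '\n';
        return false;
    }

    // dlsym searches starting from this handle, so the caller picks the library.
    auto fn = reinterpret_cast<f_type>(dlsym(handle, "f"));
    const bool found = (fn != nullptr);
    if (found) {
        fn();
    } else {
        const char* err = dlerror();
        std::cerr << "dlsym failed: " << (err ? err : "unknown error") << '\n';
    }

    dlclose(handle);
    return found;
}
```

Calling call_f_from("./libtest.1.so") and then call_f_from("./libtest.2.so") would run each implementation in turn, because the lookup is tied to an explicit handle rather than to link order.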


I expected an error for multiple definitions, which obviously didn't happen.

Also, this does not break the one-definition-rule, as the situation is different.

It's also not dependency hell (not that it's related to this at all), nor any linking fiasco.

So, then, what is this? Undefined behavior? Unspecified behavior? Or does it really depend on the linking order?

And is there a way to easily detect such situations?


Related questions:
dlopen vs linking overhead
What is the difference between dynamic linking and dynamic loading
Is there a downside to using -Bsymbolic-functions?
Why does the order in which libraries are linked sometimes cause errors in GCC?
linking two shared libraries with some of the same symbols


EDIT: I did two more tests, which confirm that this is UB:

I added a second function void g() in test.1 and NOT in test.2.

Using dynamic linking and .so libs, the same happens: f is called in the same manner, and g can also be called (as expected).

But using static linking now changes things: if test.1 is before test.2, there are no errors, and both functions from test.1 are called.
But when the order is changed, a "multiple definitions" error occurs.

It's clear that "no diagnostic required" applies (see @MarkB's answer), but it's "strange" that the error sometimes occurs and sometimes doesn't.

Anyway, the answer is pretty clear and explains everything above - UB.

Kiril Kirov, asked Mar 16 '15 14:03



2 Answers

A library is a collection of object files. The linker extracts objects from libraries as necessary to satisfy unresolved symbols. Importantly, the linker inspects libraries in the order they appear on the command line, looks into each library just once (unless the command line mentions the library more than once), and takes only objects that satisfy some reference.

In your first set of tests, everything is clear: the linker satisfies a reference to f() from the first available library, and that's pretty much it.

Now the second set of tests. In the success case, test.1 satisfies both the f and g references, so test.2 is irrelevant. In the failure case, test.2 satisfies the f reference, but g remains undefined. To satisfy g, the linker must pull an object from test.1, which also happens to supply f. That gives two definitions of f, hence the multiple-definition error.

Notice that in order to have an error you must have f and g in the same object. If test.1 is composed of 2 objects (one defining f and another defining g) the error disappears.
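The rule described above can be sketched as a toy model. This is a simplification under stated assumptions, not how any real linker is implemented: the types Object, Archive and LinkResult and the function link_in_order are hypothetical names, and weak symbols, shared libraries and repeated libraries on the command line are ignored.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One object file inside a static library.
struct Object {
    std::string name;
    std::set<std::string> defines;  // symbols this object defines
    std::set<std::string> needs;    // symbols this object references
};
using Archive = std::vector<Object>;

struct LinkResult {
    std::map<std::string, std::string> defined_by;  // symbol -> supplying object
    std::set<std::string> undefined;                // references never satisfied
    std::vector<std::string> duplicates;            // symbols defined more than once
};

// Visits the archives once, left to right, and pulls an object only when it
// defines a symbol that is still undefined at that point.
LinkResult link_in_order(const std::set<std::string>& main_needs,
                         const std::vector<Archive>& archives) {
    LinkResult result;
    result.undefined = main_needs;
    std::set<const Object*> linked;

    for (const Archive& archive : archives) {
        bool pulled_any = true;
        while (pulled_any) {  // rescan this archive until nothing new is pulled
            pulled_any = false;
            for (const Object& obj : archive) {
                if (linked.count(&obj)) continue;
                bool wanted = false;
                for (const std::string& sym : obj.defines)
                    if (result.undefined.count(sym)) wanted = true;
                if (!wanted) continue;

                linked.insert(&obj);
                pulled_any = true;
                // A pulled object brings in all of its definitions, wanted or not.
                for (const std::string& sym : obj.defines) {
                    bool is_new = result.defined_by.emplace(sym, obj.name).second;
                    if (!is_new) result.duplicates.push_back(sym);
                    result.undefined.erase(sym);
                }
                for (const std::string& sym : obj.needs)
                    if (!result.defined_by.count(sym)) result.undefined.insert(sym);
            }
        }
    }
    return result;
}
```

With test.1 modeled as a single object defining f and g and test.2 as a single object defining f, the order test.1, test.2 reports no duplicates, while test.2, test.1 reports f as a duplicate. Splitting test.1 into one object per function removes the duplicate again, because the object defining f is never pulled.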

user58697, answered Oct 04 '22 04:10


This absolutely violates the one definition rule in cases 1 and 2. In case 3, since you explicitly specify which version of the function to execute, it may or may not be a violation. Violating the ODR is undefined behavior, no diagnostic required.

C++ standard, 3.2/3:

Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program; no diagnostic required.

Mark B, answered Oct 04 '22 02:10