Consider a static variable in a compilation unit that ends up in a static library libA. Another compilation unit accesses this variable and ends up in a shared library libB.so (so libA must be linked into libB). Finally, a main function also accesses the static variable from A directly and depends on libB (so I link against both libA and libB).
I then observe that the static variable is initialized twice, i.e. its constructor runs twice! This doesn't seem right. Shouldn't the linker recognize both variables as the same and merge them into one?
To make my confusion perfect, I see it is run twice with the same address! So maybe the linker did recognize it, but did not remove the second call in the static_initialization_and_destruction code?
Here's a showcase:
ClassA.hpp:
#ifndef CLASSA_HPP
#define CLASSA_HPP
class ClassA
{
public:
    ClassA();
    ~ClassA();
    static ClassA staticA;
    void test();
};
#endif // CLASSA_HPP
ClassA.cpp:
#include <cstdio>
#include "ClassA.hpp"
ClassA ClassA::staticA;
ClassA::ClassA()
{
    printf("ClassA::ClassA() this=%p\n", this);
}
ClassA::~ClassA()
{
    printf("ClassA::~ClassA() this=%p\n", this);
}
void ClassA::test()
{
    printf("ClassA::test() this=%p\n", this);
}
ClassB.hpp:
#ifndef CLASSB_HPP
#define CLASSB_HPP
class ClassB
{
public:
    ClassB();
    ~ClassB();
    void test();
};
#endif // CLASSB_HPP
ClassB.cpp:
#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"
ClassB::ClassB()
{
    printf("ClassB::ClassB() this=%p\n", this);
}
ClassB::~ClassB()
{
    printf("ClassB::~ClassB() this=%p\n", this);
}
void ClassB::test()
{
    printf("ClassB::test() this=%p\n", this);
    printf("ClassB::test: call staticA.test()\n");
    ClassA::staticA.test();
}
Test.cpp:
#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"
int main(int argc, char * argv[])
{
    printf("main()\n");
    ClassA::staticA.test();
    ClassB b;
    b.test();
    printf("main: END\n");
    return 0;
}
I then compile and link as follows:
g++ -c ClassA.cpp
ar rvs libA.a ClassA.o
g++ -c ClassB.cpp
g++ -shared -o libB.so ClassB.o libA.a
g++ -c Test.cpp
g++ -o test Test.cpp libA.a libB.so
Output is:
ClassA::ClassA() this=0x804a040
ClassA::ClassA() this=0x804a040
main()
ClassA::test() this=0x804a040
ClassB::ClassB() this=0xbfcb064f
ClassB::test() this=0xbfcb064f
ClassB::test: call staticA.test()
ClassA::test() this=0x804a040
main: END
ClassB::~ClassB() this=0xbfcb064f
ClassA::~ClassA() this=0x804a040
ClassA::~ClassA() this=0x804a040
Can somebody please explain what is going on here? What is the linker doing? How can the same variable be initialized twice?
You are including libA.a into libB.so. By doing this, both libB.so and libA.a contain ClassA.o, which defines the static member.
In the link order you specified, the linker pulls ClassA.o from the static library libA.a into the executable, so ClassA.o's initialization code is run before main(). libB.so is a dependency of the executable, so the dynamic loader also runs all initializers for libB.so at startup, which matches both constructor calls appearing before main() in your output. Since libB.so includes ClassA.o, ClassA.o's static initializer is run a second time.
Possible fixes:
Don't put ClassA.o into both libA.a and libB.so.
g++ -shared -o libB.so ClassB.o
Don't use both libraries; libA.a is not needed.
g++ -o test Test.cpp libB.so
Applying either of the above fixes the problem:
ClassA::ClassA() this=0x600e58
main()
ClassA::test() this=0x600e58
ClassB::ClassB() this=0x7fff1a69f0cf
ClassB::test() this=0x7fff1a69f0cf
ClassB::test: call staticA.test()
ClassA::test() this=0x600e58
main: END
ClassB::~ClassB() this=0x7fff1a69f0cf
ClassA::~ClassA() this=0x600e58
Can somebody please explain what is going on here?
It's complicated.
First, the way that you linked your main executable and the shared library causes two instances of staticA (and all the other code from ClassA.cpp) to be present: one in the main executable, and another in libB.so.
You can confirm this by running
nm -AD ./test ./libB.so | grep staticA
It is then not very surprising that the ClassA constructor for the two instances runs two times, but it is still surprising that the this pointer is the same (and corresponds to staticA in the main executable).
That is happening because the runtime loader (unsuccessfully) tries to emulate the behavior of linking with archive libraries, and binds all references to staticA to the first globally-exported instance it observes (the one in test).
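The binding rule described above can be modelled in a few lines. This is only a toy sketch, not how the loader is implemented: Tracked, Module, SymbolTable and simulateStartup are hypothetical names, and the "symbol table" is just a map in which the first definition of a name wins. Each module keeps its own storage and its own initializer, but both initializers construct at the address the name resolved to.

```cpp
#include <cassert>
#include <map>
#include <new>
#include <string>
#include <vector>

// Toy stand-in for ClassA: logs the address of every construction.
struct Tracked {
    static std::vector<const void*>& log() {
        static std::vector<const void*> addresses;
        return addresses;
    }
    Tracked() { log().push_back(this); }
};

// Each "module" (executable or shared library) carries its own storage.
struct Module {
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];
};

// Toy global symbol table: the first definition of a name wins.
struct SymbolTable {
    std::map<std::string, void*> entries;
    void* bind(const std::string& name, void* definition) {
        // emplace leaves an already existing entry untouched
        return entries.emplace(name, definition).first->second;
    }
};

// Runs both modules' "initializers" and returns the logged addresses.
std::vector<const void*> simulateStartup() {
    static Module executable;
    static Module sharedLib;
    Tracked::log().clear();

    SymbolTable table;
    void* exeRef = table.bind("staticA", executable.storage);
    void* libRef = table.bind("staticA", sharedLib.storage);
    assert(exeRef == libRef);  // both references resolve to the executable's copy

    // Two initializers exist, so two constructions happen, at one address.
    new (libRef) Tracked;
    new (exeRef) Tracked;
    return Tracked::log();
}
```

Calling simulateStartup() yields two log entries holding the same address, which mirrors the two constructor lines with identical this values in the question's output. The storage in sharedLib is never used, just like the second copy of staticA inside libB.so.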
So what can you do to fix this? That depends on what staticA actually represents.
If it is some kind of singleton that should only exist once in any program, then the easy solution is to make it so that there is only a single instance of staticA. A way to do that is to require that any program that uses libB.so also links against libA.a, and to not link libB.so against libA.a. That will eliminate the instance of staticA inside libB.so. You've claimed that "libA must be linked into libB", but that claim is false.
Alternatively, if you build libA.so instead of libA.a, then you can link libB.so against libA.so (so libB.so is self-contained). If the main application also links against libA.so, that wouldn't be a problem: there will only be one instance of staticA inside libA.so, no matter how many times that library is used.
On the other hand, if staticA represents some kind of internal implementation detail, and you are ok with having two instances of it (so long as they don't interfere with each other), then the solution is to mark all of ClassA's symbols with hidden visibility, as this answer suggests.
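As a rough sketch of the hidden-visibility option, assuming GCC or Clang on an ELF platform: the class can carry a visibility attribute, which then also covers its members, including staticA. LOCAL_SYMBOL is a hypothetical macro name, and the constructed counter is added here only so the example can be checked; neither is part of the original code.

```cpp
#include <cstdio>

// LOCAL_SYMBOL is a hypothetical macro name; the attribute itself is
// GCC/Clang-specific, so other compilers get an empty definition.
#if defined(__GNUC__)
#  define LOCAL_SYMBOL __attribute__((visibility("hidden")))
#else
#  define LOCAL_SYMBOL
#endif

// Class-level visibility applies to the members, including staticA,
// so the symbols are not exported from a shared library built from this.
class LOCAL_SYMBOL ClassA
{
public:
    ClassA()
    {
        ++constructed;
        std::printf("ClassA::ClassA() this=%p\n", static_cast<void*>(this));
    }
    ~ClassA()
    {
        std::printf("ClassA::~ClassA() this=%p\n", static_cast<void*>(this));
    }
    static int constructed;   // illustration only: counts constructor runs
    static ClassA staticA;
};

int ClassA::constructed = 0;  // constant-initialized before staticA is built
ClassA ClassA::staticA;
```

With the symbols hidden, each module that contains ClassA.o initializes and uses its own private copy. GCC also accepts -fvisibility=hidden on the command line to make hidden the default for a whole compilation unit, which avoids annotating each class.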
Update:
why the linker does not eliminate the second instance of staticA from the executable.
Because the linker does what you told it to do. If you change your link command line to:
g++ -o test Test.cpp libB.so libA.a
then the linker should not link ClassA into the main executable. To understand why the order of libraries on the command line matters, read this.
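The effect of library order can be illustrated with a simplified model of a single left-to-right pass. This is a sketch under stated assumptions, not the real linker algorithm: Input, pulledArchiveMembers and linkOrder are hypothetical names, the symbol lists are abbreviated, and the only rule modelled is that an archive member is copied in only if it defines a symbol that is still undefined when the archive is reached.

```cpp
#include <set>
#include <string>
#include <vector>

// One command-line input in the toy model.
struct Input {
    std::string name;
    bool isArchiveMember;            // false: object file or shared library
    std::set<std::string> defines;   // symbols this input provides
    std::set<std::string> needs;     // symbols this input references
};

// Returns the archive members that end up copied into the output.
std::vector<std::string> pulledArchiveMembers(const std::vector<Input>& commandLine)
{
    std::set<std::string> defined, undefined;
    std::vector<std::string> pulled;
    for (const Input& in : commandLine) {
        if (in.isArchiveMember) {
            bool wanted = false;
            for (const std::string& s : in.defines)
                if (undefined.count(s)) wanted = true;
            if (!wanted) continue;   // nothing outstanding: member is skipped
            pulled.push_back(in.name);
        }
        for (const std::string& s : in.defines) {
            defined.insert(s);
            undefined.erase(s);
        }
        for (const std::string& s : in.needs)
            if (!defined.count(s)) undefined.insert(s);
    }
    return pulled;
}

// Builds the two command lines discussed above (symbol lists abbreviated).
std::vector<std::string> linkOrder(bool archiveFirst)
{
    Input testObj{"Test.o", false, {"main"}, {"ClassA::staticA", "ClassB::test"}};
    Input libA{"libA.a(ClassA.o)", true, {"ClassA::staticA"}, {}};
    Input libB{"libB.so", false, {"ClassA::staticA", "ClassB::test"}, {}};
    if (archiveFirst)
        return pulledArchiveMembers({testObj, libA, libB});
    return pulledArchiveMembers({testObj, libB, libA});
}
```

In this model linkOrder(true), which corresponds to listing libA.a before libB.so, pulls ClassA.o into the executable, while linkOrder(false) pulls nothing because libB.so has already satisfied every reference by the time libA.a is reached.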