
Static variable is initialized twice

Suppose I have a static variable in a compilation unit that ends up in a static library libA. Another compilation unit accesses this variable and ends up in a shared library libB.so (so libA must be linked into libB). Finally, I have a main function that also accesses the static variable from A directly and depends on libB (so I link against both libA and libB).

I then observe that the static variable is initialized twice, i.e. its constructor runs twice! This doesn't seem right. Shouldn't the linker recognize that both variables are the same and merge them into one?

To add to my confusion, the constructor runs twice on the same address! So maybe the linker did recognize it, but did not remove the second call in the static_initialization_and_destruction code?

Here's a showcase:

ClassA.hpp:

#ifndef CLASSA_HPP
#define CLASSA_HPP

class ClassA
{
public:
    ClassA();
    ~ClassA();
    static ClassA staticA;

    void test();
};

#endif // CLASSA_HPP

ClassA.cpp:

#include <cstdio>
#include "ClassA.hpp"

ClassA ClassA::staticA;

ClassA::ClassA()
{
    printf("ClassA::ClassA() this=%p\n", this);
}

ClassA::~ClassA()
{
    printf("ClassA::~ClassA() this=%p\n", this);
}

void ClassA::test()
{
    printf("ClassA::test() this=%p\n", this);
}

ClassB.hpp:

#ifndef CLASSB_HPP
#define CLASSB_HPP

class ClassB
{
public:
    ClassB();
    ~ClassB();

    void test();
};

#endif // CLASSB_HPP

ClassB.cpp:

#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"

ClassB::ClassB()
{
    printf("ClassB::ClassB() this=%p\n", this);
}

ClassB::~ClassB()
{
    printf("ClassB::~ClassB() this=%p\n", this);
}

void ClassB::test()
{
    printf("ClassB::test() this=%p\n", this);
    printf("ClassB::test: call staticA.test()\n");
    ClassA::staticA.test();
}

Test.cpp:

#include <cstdio>
#include "ClassA.hpp"
#include "ClassB.hpp"

int main(int argc, char * argv[])
{
    printf("main()\n");
    ClassA::staticA.test();
    ClassB b;
    b.test();
    printf("main: END\n");

    return 0;
}

I then compile and link as follows:

g++ -c ClassA.cpp
ar rvs libA.a ClassA.o
g++ -c ClassB.cpp
g++ -shared -o libB.so ClassB.o libA.a
g++ -c Test.cpp
g++ -o test Test.cpp libA.a libB.so

Output is:

ClassA::ClassA() this=0x804a040
ClassA::ClassA() this=0x804a040
main()
ClassA::test() this=0x804a040
ClassB::ClassB() this=0xbfcb064f
ClassB::test() this=0xbfcb064f
ClassB::test: call staticA.test()
ClassA::test() this=0x804a040
main: END
ClassB::~ClassB() this=0xbfcb064f
ClassA::~ClassA() this=0x804a040
ClassA::~ClassA() this=0x804a040

Can somebody please explain what is going on here? What is the linker doing? How can the same variable be initialized twice?

bselu asked Oct 24 '14 12:10


2 Answers

You are including libA.a into libB.so. By doing this, both libB.so and libA.a contain ClassA.o, which defines the static member.

In the link order you specified, the linker pulls ClassA.o from the static library libA.a into the executable, so ClassA.o's initialization code runs before main(). libB.so is also loaded at program startup, and loading it runs all of libB.so's initializers (the output in the question shows both constructor calls before main()). Since libB.so includes ClassA.o as well, ClassA.o's static initializer runs a second time.

Possible fixes:

  1. Don't put ClassA.o into both libA.a and libB.so.

    g++ -shared -o libB.so ClassB.o
    
  2. Don't use both libraries; libA.a is not needed.

    g++ -o test Test.cpp libB.so
    

Applying either of the above fixes the problem:

ClassA::ClassA() this=0x600e58
main()
ClassA::test() this=0x600e58
ClassB::ClassB() this=0x7fff1a69f0cf
ClassB::test() this=0x7fff1a69f0cf
ClassB::test: call staticA.test()
ClassA::test() this=0x600e58
main: END
ClassB::~ClassB() this=0x7fff1a69f0cf
ClassA::~ClassA() this=0x600e58

Jay West answered Oct 23 '22 15:10


Can somebody please explain what is going on here?

It's complicated.

First, the way that you linked your main executable and the shared library causes two instances of staticA (and all the other code from ClassA.cpp) to be present: one in the main executable, and another in libB.so.

You can confirm this by running

nm -AD ./test ./libB.so | grep staticA

It is then not very surprising that the ClassA constructor for the two instances runs two times, but it is still surprising that the this pointer is the same (and corresponds to staticA in the main executable).

That is happening because the runtime loader (unsuccessfully) tries to emulate the behavior of linking with archive libraries, and binds all references to staticA to the first globally-exported instance it observes (the one in test).
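
As a rough illustration of that effect, here is a single-file model (not the real loader mechanism): two initialization routines that both construct an object in the same storage. The names Tracked, storage, init_from_executable and init_from_shared_library are hypothetical stand-ins; in the real program the two routines are the static-initialization code in test and in libB.so, and the shared address is staticA in the executable.

```cpp
#include <new>
#include <vector>

// Hypothetical stand-in for ClassA: records the address of every construction.
struct Tracked {
    static std::vector<const void*>& constructed_at() {
        static std::vector<const void*> addresses;
        return addresses;
    }
    Tracked() { constructed_at().push_back(this); }
};

// The single piece of storage that every reference ends up bound to
// (the role played by staticA in the main executable).
alignas(Tracked) static unsigned char storage[sizeof(Tracked)];

// Models the static-initialization routine compiled into the executable.
void init_from_executable() { new (storage) Tracked; }

// Models the routine compiled into libB.so. Its reference to the object
// was bound to the same storage, so it constructs there as well.
void init_from_shared_library() { new (storage) Tracked; }
```

Calling both routines records two constructions at one address, which matches the two identical this values in the question's output.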

So what can you do to fix this? That depends on what staticA actually represents.

If it is some kind of singleton that should only exist once in any program, then the easy solution is to make sure that there is only a single instance of staticA. One way to do that is to require that any program that uses libB.so also links against libA.a, and to not link libB.so against libA.a. That will eliminate the instance of staticA inside libB.so. You've claimed that "libA must be linked into libB", but that claim is false.

Alternatively, if you build libA.so instead of libA.a, then you can link libB.so against libA.so (so libB.so is self-contained). If the main application also links against libA.so, that won't be a problem: there will only be one instance of staticA, inside libA.so, no matter how many times that library is used.

On the other hand, if staticA represents some kind of internal implementation detail, and you are OK with having two instances of it (so long as they don't interfere with each other), then the solution is to mark all of ClassA's symbols with hidden visibility, as this answer suggests.
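
A minimal sketch of what hidden visibility could look like for the class from the question, assuming GCC or Clang (the CLASSA_HIDDEN macro name is made up for this example, and the header and source are merged into one file for brevity):

```cpp
#include <cstdio>

// CLASSA_HIDDEN is a hypothetical macro name. The attribute is a GCC/Clang
// extension; on other compilers the macro expands to nothing.
#if defined(__GNUC__) || defined(__clang__)
#  define CLASSA_HIDDEN __attribute__((visibility("hidden")))
#else
#  define CLASSA_HIDDEN
#endif

// Hidden visibility on the class applies to its member functions and
// static data members, so none of them appear in a shared library's
// dynamic symbol table.
class CLASSA_HIDDEN ClassA
{
public:
    ClassA();
    ~ClassA();
    static ClassA staticA;

    void test();
};

ClassA ClassA::staticA;

ClassA::ClassA()
{
    std::printf("ClassA::ClassA() this=%p\n", static_cast<void*>(this));
}

ClassA::~ClassA()
{
    std::printf("ClassA::~ClassA() this=%p\n", static_cast<void*>(this));
}

void ClassA::test()
{
    std::printf("ClassA::test() this=%p\n", static_cast<void*>(this));
}
```

With this in place, the executable and libB.so would each keep a private copy of staticA, so the two constructor calls should print different addresses. The trade-off is that code outside libB.so can no longer get ClassA's symbols from libB.so and has to link libA.a itself.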

Update:

why the linker does not eliminate the second instance of staticA from the executable.

Because the linker does what you told it to do. If you change your link command line to:

g++ -o test Test.cpp libB.so libA.a

then the linker should not link ClassA into the main executable. To understand why the order of libraries on command line matters, read this.
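
The order dependence can be approximated with a toy model. This is a deliberate simplification rather than how any real linker is implemented: inputs are scanned left to right, and an archive member is copied into the output only if it defines a symbol that is still undefined at that point. The LinkInput type and archives_pulled_in function are hypothetical names, each archive is treated as a single member, and new undefined symbols introduced by a pulled-in member are ignored.

```cpp
#include <set>
#include <string>
#include <vector>

// One input on the link line, reduced to the symbols it defines.
struct LinkInput {
    std::string name;
    bool is_archive;                  // true for libA.a, false for libB.so
    std::set<std::string> defines;
};

// Returns the archives whose member would be copied into the executable.
// `undefined` starts as the set of symbols the object files still need.
std::vector<std::string> archives_pulled_in(std::set<std::string> undefined,
                                            const std::vector<LinkInput>& inputs)
{
    std::vector<std::string> pulled;
    for (const LinkInput& in : inputs) {
        bool satisfies_any = false;
        for (const std::string& sym : in.defines) {
            // A definition only matters if someone is still waiting for it.
            if (undefined.erase(sym) > 0) satisfies_any = true;
        }
        // Shared libraries resolve symbols without being copied in;
        // archive members are copied only when they resolve something.
        if (in.is_archive && satisfies_any) pulled.push_back(in.name);
    }
    return pulled;
}
```

In this model, with libA.a first, ClassA's symbols are still undefined when the archive is scanned, so its member is copied into the executable. With libB.so first, the shared library has already satisfied them, and libA.a contributes nothing.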

Employed Russian answered Oct 23 '22 15:10