
When should linkers generate multiply defined X warnings?

Tags: c++, linker

Never turn your back on C++. It'll getcha.

I'm in the habit of writing unit tests for everything I do. As part of this I frequently define classes with names like A and B in the .cxx of the test to exercise code, safe in the knowledge that i) because this code never becomes part of a library or is used outside of the test, name collisions are likely very rare, and ii) the worst that could happen is that the linker will complain about multiply defined A::A() or whatever, and I'll fix that error. How wrong I was.

Here are two compilation units:

#include <iostream>
using namespace std;

// Fwd decl.
void runSecondUnit();

class A {
public:
   A() : version( 1 ) {
      cerr << this << "   A::A()  --- 1\n";
   }    
   virtual ~A()   {
      cerr << this << "   A::~A()  --- 1\n";
   }

   int version;
};

void runFirstUnit()  {
   A a;
   // Reports 1, correctly.
   cerr << "   a.version = " << a.version << endl;
   // If you uncomment these, you will call
   // secondCompileUnit: A::getName() instead of A::~A !
   //A* a2 = new A;
   //delete a2;
}

int main( int argc, char** argv )  {
   cerr << "firstUnit BEGIN\n";
   runFirstUnit();
   cerr << "firstUnit END\n";

   cerr << "secondUnit BEGIN\n";
   runSecondUnit();
   cerr << "secondUnit END\n";
}

and

#include <iostream>
using namespace std;

void runSecondUnit();

// Uncomment to fix all the errors:
//#define USE_NAMESPACE
#if defined( USE_NAMESPACE )
   namespace mySpace
   {
#endif

class A  {
   public:
   A() :  version( 2 )  {
      cerr << this << "   A::A()  --- 2\n";
   }

   virtual const char* getName() const {
      cerr << this << "   A::getName()  --- 2\n"; return "A";
   }

   virtual ~A()  {
      cerr << this << "   A::~A()  --- 2\n";
   }

   int version;
};


#if defined(USE_NAMESPACE )
   } // mySpace
   using namespace mySpace;
#endif

void runSecondUnit() {   
   A a;   
   // Reports 1, not the 2 set by the constructor above!
   cerr << "   a.version = " << a.version << endl;
   cerr << "   a.getName()=='" << a.getName() << "'\n";    
}

Ok, ok. Obviously I shouldn't have declared two classes called A. My bad. But I bet you can't guess what happens next...

I compiled each unit, and linked the two object files (successfully) and ran. Hmm...

Here's the output (g++ 4.3.3):

firstUnit BEGIN
0x7fff0a318300   A::A()  --- 1
   a.version = 1
0x7fff0a318300   A::~A()  --- 1
firstUnit END
secondUnit BEGIN
0x7fff0a318300   A::A()  --- 1
   a.version = 1
0x7fff0a318300   A::getName()  --- 2
   a.getName()=='A'
0x7fff0a318300   A::~A()  --- 1
secondUnit END

So there are two separate A classes. In the second use, the constructor and destructor of the first one were used, even though only the second one was visible in its compilation unit. Even more bizarre, if I uncomment the lines in runFirstUnit, A::getName is called instead of either A::~A. Clearly, in the first use the object gets the vtable for the second definition (getName is the second virtual function in the second class, the destructor the second in the first). And it even correctly gets the constructor from the first.

So my question is: why didn't the linker complain about the multiply defined symbols? It appears to choose the first match. Reordering the objects in the link step confirms this.

The behavior is identical in Visual Studio, so I'm guessing that this is some standard-defined behavior. My question is, why? Clearly it would be easy for the linker to barf given the duplicate names. If I add,

 void f() {}

to both files it complains. Why not for my class constructors and destructors?

EDIT The problem isn't, "what should I have done to avoid this", or "how is the behavior explained". It is, "why don't linkers catch it?" Projects may have thousands of compile units. Sensible naming practices don't really solve this issue -- they only make the problem obscure and only then if you can train everyone to follow them.

The above example leads to ambiguous behavior that is easily and definitively solvable by compiler tools. So why do they not solve it? Is this simply a bug? (I suspect not.)

** EDIT ** See litb's answer below. I'm repeating it back to make sure my understanding is right:

Linkers only complain about duplicate strong symbols. Because we have shared headers, inline function definitions (i.e. member functions defined where they are declared, or template functions) are compiled into the object file of every TU that sees them. Because there's no easy way to restrict the generation of this code to a single object file, the linker has the job of choosing one of many definitions. So that the linker does not generate errors, the symbols for these compiled definitions are tagged as weak in the object file.

user48956 asked Jan 23 '23 05:01


1 Answer

The compiler and linker rely on both classes being exactly the same. In your case they are different, and so strange things happen. The one definition rule says that the result is undefined behavior, so behavior is not at all required to be consistent among compilers. I suspect that in runFirstUnit, in the delete line, it puts a call to the first virtual table entry (because in its translation unit, the destructor may occupy the first entry).

In the second translation unit, this entry happens to point to A::getName, but in the first translation unit (where you execute the delete), the entry points to A::~A. Since these two are differently named (A::~A vs A::getName), you don't get a name clash (you will have code emitted for both the destructor and getName). But their v-tables will clash by design: since both classes have the same name, the linker will think they are the same class and assume the same contents.

Notice that all member functions were defined in-class, which means they are all inline functions. These functions can be defined multiple times in a program. In the case of in-class definitions, the rationale is that you may include the same class definition into different translation units from their header files. Your test function f, however, isn't an inline function, and thus including it in different translation units will trigger a linker error.
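
To make the inline distinction concrete, here is a minimal sketch of what a shared header might contain. The names Counter, twice and thrice are hypothetical and do not come from the question; the point is only which definitions the linker is allowed to merge silently.

```cpp
// Hypothetical header contents, meant to be included from several .cxx files.

class Counter {
public:
   Counter() : count_(0) {}

   // Defined in-class, so implicitly inline: every translation unit that
   // includes this header may emit its own copy, and the linker keeps one.
   // All copies must be identical, otherwise the behavior is undefined.
   int next() { return ++count_; }

private:
   int count_;
};

// Explicitly inline free function: the same rule applies.
inline int twice(int x) { return 2 * x; }

// A non-inline definition like the one below would be emitted as an ordinary
// symbol in every including translation unit, and the link would typically
// fail with a duplicate definition error, as with void f() {} in the question:
// int thrice(int x) { return 3 * x; }
```

Because the linker does not compare the bodies of the copies it merges, two different in-class definitions under the same name go unreported, which is what the question ran into.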

If you enable namespaces, there will be no clash whatsoever, because ::A and ::mySpace::A are different classes, and of course will get different v-tables.
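
For helper classes that exist only inside one test file, an unnamed namespace should give the same protection without having to invent a namespace name. The following is a sketch under that assumption; runLocalTest is a hypothetical helper, not part of the question's code.

```cpp
#include <string>

// Names declared in an unnamed namespace are unique to this translation
// unit, so this A cannot be merged with an A defined in another .cxx file.
namespace {

class A {
public:
   A() : version(2) {}
   virtual ~A() {}
   virtual const char* getName() const { return "A"; }
   int version;
};

} // unnamed namespace

// Hypothetical test helper: builds the local A and reports what it sees.
std::string runLocalTest() {
   A a;
   return std::string(a.getName()) + ":" + std::to_string(a.version);
}
```

With this layout each test file can define its own A, and the constructor, destructor and v-table used are the ones from that file.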

Johannes Schaub - litb answered Feb 05 '23 17:02