
Shared vtables between classes of same name: call to virtual method crashes when casting to base type

Tags: c++

See the UPDATE below: I managed to reproduce the problem and need help understanding it.

I have a strange crash where a method works fine everywhere except in one place. Here's the code:

struct base
{
    virtual wchar_t* get() = 0; // can be { return NULL; } doesn't matter
};

struct derived: public base
{
    virtual wchar_t* get() { return SomeData(); }
};

struct container 
{
    derived data;
};

// this is approx. how it is used in real program
void output(base& data) // non-const reference, because get() is not a const method
{
     data.get();
}

smart_ptr<container> item = GetItSomehow();
derived &v1 = item->data;
v1.get(); // works OK
//base &v2 = (base&)derived; // the old line, to understand old comments in the question
base &v2 = v1; // or base* v2 doesn't matter
v2.get(); // segmentation fault without going into method at all

As I said, I call item->data.get() in many places on different objects and it always works, except in one place. Even there it fails only when the object is cast to the base class (output above is an example of why the cast is needed).

The question is: HOW and WHY can this happen? I'd suspect a pure virtual call, but I don't call any virtual method in a constructor, and I don't see how the two calls differ. I'd also suspect the base method being abstract, but the behavior is the same if I give it a body.

I cannot provide a small test example because, as I said, it always works except in that one place. If I knew why it fails there, I wouldn't need the sample, because that would already be the answer...

P.S. The environment is Ubuntu 11.10 x64, but the program is compiled for 32-bit using a custom build of gcc 4.5.2.

P.P.S. Another clue, not sure if related...

warning: can't find linker symbol for virtual table for `derived::get' value
warning:   found `SomeOtherDerivedFromBaseClass::SomeOtherCrazyFunction' instead

in the real program

UPDATE: Is there any chance this happens because gcc links the vtable to the wrong class, one with the same name but defined in a different shared library? The "derived" class in the real app is actually defined in several shared libraries and, worse, there's another similar class with the same name but a different interface. What's strange is that it works without the cast to the base class.

I am especially interested in gcc/linking/vtables details here.

Here's what seems to reproduce it:

// --------- mod1.h
class base
{
public:
   virtual void test(int i); // add method to make vtables different with mod2
   virtual const char* data();
};

class test: public base
{
public:
   virtual const char* data();
};


// --------- mod2.h
class base
{
public:
   virtual const char* data();
};

class test: public base
{
public:
   virtual const char* data();
};

// --------- mod2.cpp
#include "mod2.h"
const char* base::data() { return "base2"; }
const char* test::data() { return "test2"; }

// --------- modtest.cpp
#include <stdio.h>
// !!!!!!!!! notice that we include mod1
#include "mod1.h"

int main()
{
   test t;
   base& b = t;
   printf("%s\n", t.data());
   printf("%s\n", b.data());
   return 0;
}

// --------- how to compile and run
g++ -c mod2.cpp && g++ mod2.o modtest.cpp && ./a.out

// --------- output from the program
queen3@pro-home:~$ ./a.out 
test2
Segmentation fault

In the modtest above, if we include "mod2.h" instead of "mod1.h", we get normal "test2\ntest2" output without segfault.

The question is: what is the exact mechanism behind this, and how can it be detected and prevented? I knew that gcc links static data with the same name to a single memory location, but vtables...

asked Nov 30 '22 06:11 by queen3


1 Answer

Edit in response to update: In your updated code, where you use the mod1 and mod2 headers, you're violating the One Definition Rule for classes (it applies even when the conflicting definitions live in different shared libraries). The rule basically states that your entire program must contain only one definition of a class (base in this case), although that same definition may appear in multiple source files. If you have more than one distinct definition, all bets are off and you get undefined behavior. In this case, the undefined behavior happens to be a crash. The fix is, of course, not to have multiple versions of the same class in the same program. This is usually accomplished by defining each class in a single header (or in a single implementation file for non-API/impl classes) and including that header wherever the class definition is needed.

Original answer: If it works everywhere except one place, it sounds like the object isn't valid in that one place (working through a derived pointer but not through a base pointer sounds a lot like you've entered the realm of undefined behavior). It could be corrupted memory, a pointer to a deleted object, or something else. Your best bet is to run the program under valgrind if you can.

answered Dec 10 '22 08:12 by Mark B