Virtual tables and memory layout in multiple virtual inheritance

Consider the following hierarchy:

struct A {
   int a; 
   A() { f(0); }
   A(int i) { f(i); }
   virtual void f(int i) { cout << i; }
};
struct B1 : virtual A {
   int b1;
   B1(int i) : A(i) { f(i); }
   virtual void f(int i) { cout << i+10; }
};
struct B2 : virtual A {
   int b2;
   B2(int i) : A(i) { f(i); }
   virtual void f(int i) { cout << i+20; }
};
struct C : B1, virtual B2 {
   int c;
   C() : B1(6),B2(3),A(1){}
   virtual void f(int i) { cout << i+30; }
};
  1. What is the exact memory layout of a C instance? How many vptrs does it contain, and where exactly is each of them placed? Which virtual tables are shared with the virtual table of C? What exactly does each virtual table contain?

    Here is how I understand the layout:

    ----------------------------------------------------------------
    |vptr1 | AptrOfB1 | b1 | B2ptr | c | vptr2 | AptrOfB2 | b2 | a |
    ----------------------------------------------------------------
    

    where AptrOfBx is the pointer to the A instance that Bx contains (since the inheritance is virtual).
    Is that correct? Which functions does vptr1 point to? Which functions does vptr2 point to?

  2. Given the following code:

    C* c = new C();
    dynamic_cast<B1*>(c)->f(3);
    static_cast<B2*>(c)->f(3);
    reinterpret_cast<B2*>(c)->f(3);
    

    Why do all the calls to f print 33?

asked Jul 22 '12 19:07 by JeB


2 Answers

Virtual bases are very different from ordinary bases. Remember that "virtual" means "determined at runtime" -- thus the location of the entire base subobject must be determined at runtime.

Imagine that you are given a reference B & x (where B stands for either B1 or B2), and you are tasked with finding the A::a member. If the inheritance were non-virtual, then B would have a base class A, and thus the B-object which you are viewing through x would have an A-subobject at a fixed offset, in which you could locate your member A::a. If the most-derived object of x had multiple bases of type A, then you could only see the particular copy which is the subobject of B.

But if the inheritance is virtual, none of this makes sense. We don't know which A-subobject we need -- this information simply doesn't exist at compile time. We could be dealing with an actual B-object as in B y; B & x = y;, or with a C-object like C z; B & x = z;, or something entirely different that derives virtually from A many more times. The only way to know is to find the actual base A at runtime.

This can be implemented with one more level of runtime indirection. (Note how this is entirely parallel to how virtual functions are implemented with one extra level of runtime indirection compared to non-virtual functions.) Instead of having a pointer to a vtable or base subobject, one solution is to store a pointer to a pointer to the actual base subobject. This is sometimes called a "thunk" or "trampoline".

So the actual object C z; may look as follows. The actual ordering in memory is up to the compiler and unimportant, and I've suppressed vtables.

+-+------++-+------++-----++-----+
|T|  B1  ||T|  B2  ||  C  ||  A  |
+-+------++-+------++-----++-----+
 |         |                 |
 V         V                 ^
 |         |       +-Thunk-+ |
 +--->>----+-->>---|     ->>-+
                   +-------+

Thus, no matter whether you have a B1& or a B2&, you first look up the thunk, and that one in turn tells you where to find the actual base subobject. This also explains why you cannot perform a static cast from an A& to any of the derived types: this information simply doesn't exist at compile time.

For a more in-depth explanation, take a look at this fine article. (In that description, the thunk is part of the vtable of C, and virtual inheritance always necessitates the maintenance of vtables, even if there are no virtual functions anywhere.)

answered Oct 18 '22 20:10 by Kerrek SB


I have extended your code a bit as follows:

#include <stdio.h>
#include <stdint.h>

struct A {
   int a; 
   A() : a(32) { f(0); }
   A(int i) : a(32) { f(i); }
   virtual void f(int i) { printf("%d\n", i); }
};

struct B1 : virtual A {
   int b1;
   B1(int i) : A(i), b1(33) { f(i); }
   virtual void f(int i) { printf("%d\n", i+10); }
};

struct B2 : virtual A {
   int b2;
   B2(int i) : A(i), b2(34) { f(i); }
   virtual void f(int i) { printf("%d\n", i+20); }
};

struct C : B1, virtual B2 {
   int c;
   C() : B1(6),B2(3),A(1), c(35) {}
   virtual void f(int i) { printf("%d\n", i+30); }
};

int main() {
    C foo;
    intptr_t address = (intptr_t)&foo;
    // Offsets are cast to long and sizes are printed with %zu so that the
    // format specifiers match the argument types.
    printf("offset A = %ld, sizeof A = %zu\n", (long)((intptr_t)(A*)&foo - address), sizeof(A));
    printf("offset B1 = %ld, sizeof B1 = %zu\n", (long)((intptr_t)(B1*)&foo - address), sizeof(B1));
    printf("offset B2 = %ld, sizeof B2 = %zu\n", (long)((intptr_t)(B2*)&foo - address), sizeof(B2));
    printf("offset C = %ld, sizeof C = %zu\n", (long)((intptr_t)(C*)&foo - address), sizeof(C));
    // Dump the raw bytes of the object, eight per group.
    unsigned char* data = (unsigned char*)address;
    for(size_t offset = 0; offset < sizeof(C); offset++) {
        if(!(offset & 7)) printf("| ");
        printf("%02x ", (int)data[offset]);
    }
    printf("\n");
}

As you can see, this prints quite a bit of additional information that allows us to deduce the memory layout. The output on my machine (64-bit Linux, little-endian byte order) is this:

1
23
16
offset A = 16, sizeof A = 16
offset B1 = 0, sizeof B1 = 32
offset B2 = 32, sizeof B2 = 32
offset C = 0, sizeof C = 48
| 00 0d 40 00 00 00 00 00 | 21 00 00 00 23 00 00 00 | 20 0d 40 00 00 00 00 00 | 20 00 00 00 00 00 00 00 | 48 0d 40 00 00 00 00 00 | 22 00 00 00 00 00 00 00 

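The first three lines are printed by the constructors of A, B2 and B1, in that order, because the virtual bases are constructed first. From the rest, we can describe the layout as follows: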

+--------+----+----+--------+----+----+--------+----+----+
|  vptr  | b1 | c  |  vptr  | a  | xx |  vptr  | b2 | xx |
+--------+----+----+--------+----+----+--------+----+----+

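Here, xx denotes padding. The three groups are the B1 subobject (which starts at the same address as C) at offset 0, the virtual base A at offset 16, and the virtual base B2 at offset 32. Note how the compiler has placed the variable c into the padding of its non-virtual base. Note also that all three v-pointers are different; this allows the program to deduce the correct positions of all the virtual bases.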

answered Oct 18 '22 20:10 by cmaster - reinstate monica