I'm reading this article "Virtual method table"
Example in the above article:
class B1 {
public:
void f0() {}
virtual void f1() {}
int int_in_b1;
};
class B2 {
public:
virtual void f2() {}
int int_in_b2;
};
class D : public B1, public B2 {
public:
void d() {}
void f2() {} // override B2::f2()
int int_in_d;
};
B2 *b2 = new B2();
D *d = new D();
In the article, the author shows that the memory layout of object d looks like this:
d:
D* d--> +0: pointer to virtual method table of D (for B1)
+4: value of int_in_b1
B2* b2--> +8: pointer to virtual method table of D (for B2)
+12: value of int_in_b2
+16: value of int_in_d
Total size: 20 Bytes.
virtual method table of D (for B1):
+0: B1::f1() // B1::f1() is not overridden
virtual method table of D (for B2):
+0: D::f2() // B2::f2() is overridden by D::f2()
The question is about d->f2(). The call to d->f2() passes a B2 pointer as the this pointer, so we have to do something like:
(*(*(d[+8]/*pointer to virtual method table of D (for B2)*/)[0]))(d+8) /* Call d->f2() */
Why should we pass a B2 pointer as the this pointer rather than the original D pointer? We are actually calling D::f2(). Based on my understanding, we should pass a D pointer as this to the D::f2() function.
___update____
If we pass a B2 pointer as this to D::f2(), what happens if we want to access the members of class B1 in D::f2()? I believe the B2 pointer (this) points here:
d:
D* d--> +0: pointer to virtual method table of D (for B1)
+4: value of int_in_b1
B2* b2--> +8: pointer to virtual method table of D (for B2)
+12: value of int_in_b2
+16: value of int_in_d
It is already offset from the beginning address of this contiguous memory layout. For example, if we want to access a B1 member inside D::f2(), I guess that at runtime it will do something like *(this+4) (where this points to the same address as b2), which would point at int_in_b2 in B2 rather than at anything in B1?
Virtual inheritance is used when we are dealing with multiple inheritance but want to prevent multiple instances of the same class from appearing in the inheritance hierarchy. From the above example we can see that “A” is inherited twice in D, which means an object of class “D” will contain two copies of the attribute “a” (D::C::a and D::B::a).
Virtual inheritance is a C++ technique that ensures only one copy of a base class's member variables is inherited by grandchild derived classes.
Base classes can't inherit what the child has (such as a new function or variable). Virtual functions are simply functions that can be overridden by the child class: if the child class changes the implementation of a virtual function, the base virtual function isn't called. A is the base class for B, C, D.
You can imagine what happens when you perform inheritance and override some of the virtual functions. The compiler creates a new VTABLE for your new class, and it inserts your new function addresses using the base-class function addresses for any virtual functions you don't override.
We cannot pass the D pointer to a virtual function overriding B2::f2(), because all overrides of the same virtual function must accept an identical memory layout.
Since the B2::f2() function expects B2's memory layout for the object being passed to it as its this pointer, i.e.
b2:
+0: pointer to virtual method table of B2
+4: value of int_in_b2
the overriding function D::f2() must expect the same layout as well. Otherwise, the functions would no longer be interchangeable.
To see why interchangeability matters consider this scenario:
class B2 {
public:
void test() { f2(); }
virtual void f2() {}
int int_in_b2;
};
...
B2 b2;
b2.test(); // Scenario 1
D d;
d.test(); // Scenario 2
B2::test() needs to make a call to f2() in both scenarios. It has no additional information to tell it how the this pointer has to be adjusted when making these calls*. That is why the compiler passes the fixed-up pointer, so that test()'s call to f2() works with both D::f2() and B2::f2().
* Other implementations may very well pass this information; however, the multiple inheritance implementation discussed in the article does not do it.
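The two scenarios can be turned into a small sketch that can be compiled and checked. This is an illustration rather than the article's code: f2() is changed to return a label so the chosen override can be observed, and the names which_f2 and run_scenarios are hypothetical.

```cpp
#include <cassert>
#include <string>

struct B1 {
    virtual void f1() {}
    int int_in_b1 = 0;
};

struct B2 {
    // test() only knows it holds a B2*. It cannot tell whether that B2 is a
    // complete object or the B2 sub-object of some D.
    std::string test() { return f2(); }
    virtual std::string f2() { return "B2::f2"; }  // assumption: returns a label
    int int_in_b2 = 0;
};

struct D : B1, B2 {
    std::string f2() override { return "D::f2"; }
    int int_in_d = 0;
};

// Hypothetical helper: runs test() through a B2 reference, as in both scenarios.
std::string which_f2(B2& b) { return b.test(); }

// Exercises both scenarios from the text; aborts if dispatch picks the wrong override.
void run_scenarios() {
    B2 b2;
    D d;
    assert(which_f2(b2) == "B2::f2");  // Scenario 1
    assert(which_f2(d) == "D::f2");    // Scenario 2
}
```

Calling run_scenarios() from main() exercises both cases. The same compiled body of B2::test() serves both objects, which is the interchangeability the answer relies on.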
Given your class hierarchy, an object of type B2 will have the following memory footprint.
+------------------------+
| pointer for B2 vtable |
+------------------------+
| int_in_b2 |
+------------------------+
An object of type D will have the following memory footprint.
+------------------------+
| pointer for B1 vtable |
+------------------------+
| int_in_b1 |
+------------------------+
| pointer for B2 vtable |
+------------------------+
| int_in_b2 |
+------------------------+
| int_in_d |
+------------------------+
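The footprint above can be checked indirectly, without depending on exact sizes. A minimal sketch, assuming a typical implementation that places the B1 sub-object first (the standard does not mandate this): converting a D* to a B2* yields a different address, and static_cast back to D* undoes the adjustment. The names b2_offset and check_conversions are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct B1 { virtual void f1() {} int int_in_b1 = 0; };
struct B2 { virtual void f2() {} int int_in_b2 = 0; };
struct D : B1, B2 { void f2() override {} int int_in_d = 0; };

// Hypothetical helper: byte distance from the start of a D object to its
// B2 sub-object.
std::ptrdiff_t b2_offset(D& d) {
    B2* as_b2 = &d;  // implicit derived-to-base conversion adjusts the address
    return reinterpret_cast<char*>(as_b2) - reinterpret_cast<char*>(&d);
}

void check_conversions() {
    D d;
    B2* as_b2 = &d;
    // Assumption: B1 comes first, so the B2 sub-object sits at a positive offset.
    assert(b2_offset(d) > 0);
    // Guaranteed by the language: the round trip recovers the original D*.
    assert(static_cast<D*>(as_b2) == &d);
}
```

The exact offset depends on pointer size and padding, so the sketch only checks that it is non-zero rather than a specific number.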
When you use:
D* d = new D();
d->f2();
That call is the same as:
B2* b = new D();
b->f2();
f2() can be called using a pointer of type B2* or a pointer of type D*. Given that the runtime must be able to work correctly with a pointer of type B2*, it has to be able to dispatch the call to D::f2() using the appropriate function pointer in B2's vtable. However, when the call is dispatched to D::f2(), the original pointer of type B2* must somehow be offset properly so that in D::f2(), this points to a D, not a B2.
Here's your example code, altered a little bit to print useful pointer values and member data to help understand the changes to the value of this in various functions.
#include <iostream>
struct B1
{
void f0() {}
virtual void f1() {}
int int_in_b1;
};
struct B2
{
B2() : int_in_b2(20) {}
void test_f2()
{
std::cout << "In B::test_f2(), B*: " << (void*)this << std::endl;
this->f2();
}
virtual void f2()
{
std::cout
<< "In B::f2(), B*: " << (void*)this
<< ", int_in_b2: " << int_in_b2 << std::endl;
}
int int_in_b2;
};
struct D : B1, B2
{
D() : int_in_d(30) {}
void d() {}
void f2()
{
// ======================================================
// If "this" is not adjusted properly to point to the D
// object, accessing int_in_d will lead to undefined
// behavior.
// ======================================================
std::cout
<< "In D::f2(), D*: " << (void*)this
<< ", int_in_d: " << int_in_d << std::endl;
}
int int_in_d;
};
int main()
{
std::cout << "sizeof(void*) : " << sizeof(void*) << std::endl;
std::cout << "sizeof(int) : " << sizeof(int) << std::endl;
std::cout << "sizeof(B1) : " << sizeof(B1) << std::endl;
std::cout << "sizeof(B2) : " << sizeof(B2) << std::endl;
std::cout << "sizeof(D) : " << sizeof(D) << std::endl << std::endl;
B2 *b2 = new B2();
D *d = new D();
b2->test_f2();
d->test_f2();
// Release the objects through pointers of their own dynamic type.
delete d;
delete b2;
return 0;
}
Output of the program:
sizeof(void*) : 8
sizeof(int) : 4
sizeof(B1) : 16
sizeof(B2) : 16
sizeof(D) : 32
In B::test_f2(), B*: 0x1f50010
In B::f2(), B*: 0x1f50010, int_in_b2: 20
In B::test_f2(), B*: 0x1f50040
In D::f2(), D*: 0x1f50030, int_in_d: 30
When the actual object used to call test_f2() is a D, the value of this changes from 0x1f50040 in test_f2() to 0x1f50030 in D::f2(). That matches the sizes of B1, B2, and D: the offset of the B2 sub-object within a D object is 16 (0x10). The value of this in B2::test_f2(), a B2*, is changed by 0x10 before the call is dispatched to D::f2().
I am going to guess that the value of the offset from D to B2 is stored in B2's vtable. Otherwise, there is no way a generic function dispatch mechanism can change the value of this properly before dispatching the call to the right virtual function.
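That guess can be made concrete with a hand-rolled emulation. This is only a sketch of one possible scheme, not what any particular compiler emits: each table slot stores a function pointer together with a this-adjustment, and the caller applies the adjustment before making the call. Many real compilers instead point the slot at a small adjusting stub (a "thunk") that fixes up this and then jumps to the real function. All names here (Slot, B1Part, B2Part, DObj, call_f2, run_emulation) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// One entry of a hand-written "vtable": the target function plus the delta
// to add to the incoming pointer before calling it.
struct Slot {
    int (*fn)(void* self);
    std::ptrdiff_t this_delta;
};

struct B1Part { int int_in_b1; };
struct B2Part { const Slot* vtable; int int_in_b2; };
struct DObj {  // plain aggregate standing in for "D : B1, B2"
    B1Part b1;
    B2Part b2;
    int int_in_d;
};

int b2_f2(void* self) { return static_cast<B2Part*>(self)->int_in_b2; }
int d_f2(void* self)  { return static_cast<DObj*>(self)->int_in_d; }

// Table used by a stand-alone B2Part: no adjustment.
const Slot b2_table[] = { { b2_f2, 0 } };
// Table used by the B2 part of a DObj: a negative delta leads from the
// B2 part back to the start of the enclosing DObj.
const Slot d_b2_table[] = {
    { d_f2, -static_cast<std::ptrdiff_t>(offsetof(DObj, b2)) }
};

// Generic dispatch: the caller only has a B2Part* and the table it points to.
int call_f2(B2Part* p) {
    const Slot& s = p->vtable[0];
    return s.fn(reinterpret_cast<char*>(p) + s.this_delta);
}

void run_emulation() {
    B2Part plain{ b2_table, 20 };
    DObj d{ { 10 }, { d_b2_table, 20 }, 30 };
    assert(call_f2(&plain) == 20);  // no adjustment needed
    assert(call_f2(&d.b2) == 30);   // pointer adjusted back to the DObj
}
```

Calling run_emulation() from main() exercises both paths. Storing the delta in the slot costs an addition on every call, even when the delta is zero, which is one reason the stub approach is common.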