
Mismatch of 'this' address when base class is not polymorphic but derived is

There is this code:

#include <iostream>

class Base
{
public:
    Base() {
        std::cout << "Base: " << this << std::endl;
    }
    int x;
    int y;
    int z;
};

class Derived : Base
{
public:
    Derived() {
        std::cout << "Derived: " << this << std::endl;
    }

    void fun(){}
};

int main() {
   Derived d;
   return 0;
}

The output:

Base: 0xbfdb81d4
Derived: 0xbfdb81d4

However, when 'fun' is made virtual in the Derived class:

virtual void fun(){} // changed in Derived

then the address of 'this' is no longer the same in both constructors:

Base: 0xbf93d6a4
Derived: 0xbf93d6a0

On the other hand, if class Base is itself polymorphic, for example after adding some other virtual function to it:

virtual void funOther(){} // added to Base

then both 'this' addresses match again:

Base: 0xbfcceda0
Derived: 0xbfcceda0

The question is: why is the address of 'this' different in the Base and Derived constructors when the Base class is not polymorphic and the Derived class is?

asked Jul 21 '12 16:07 by scdmb


1 Answer

When you have a polymorphic single-inheritance hierarchy of classes, the convention followed by most (if not all) compilers is that each object in that hierarchy begins with a VMT pointer (a pointer to the Virtual Method Table). The VMT pointer is introduced into the object's memory layout early, by the root class of the polymorphic hierarchy, while all classes below it simply inherit the pointer and set it to point to their own VMT. In that case all nested subobjects within any derived object have the same this value, so by reading the memory location at *this the compiler gets immediate access to the VMT pointer regardless of the actual subobject type. This is exactly what happens in your last experiment: when you make the root class polymorphic, all this values match.
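The matching-addresses case can be checked directly. The following is only a minimal sketch: the names PolyBase, PolyDerived, same_address and report_same_address are made up for illustration, struct is used so that inheritance is public, and the result reflects what common compilers do rather than anything the C++ standard guarantees.

```cpp
#include <iostream>

// The root of the hierarchy is already polymorphic, so it introduces the VMT pointer.
struct PolyBase {
    virtual void funOther() {}
    int x = 0;
    int y = 0;
    int z = 0;
};

// The derived class adds another virtual function; on typical implementations
// it reuses the VMT pointer slot inherited from PolyBase.
struct PolyDerived : PolyBase {
    virtual void fun() {}
};

// Hypothetical helper: returns true if the PolyBase subobject starts at the
// same address as the complete PolyDerived object.
bool same_address() {
    PolyDerived d;
    PolyBase* pb = &d;  // derived-to-base conversion
    return static_cast<const void*>(pb) == static_cast<const void*>(&d);
}

// Prints the result of the comparison.
void report_same_address() {
    std::cout << std::boolalpha << "same address: " << same_address() << '\n';
}
```

Calling report_same_address() from main is enough to try it. With the layout described above, no pointer adjustment should be needed for this conversion.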

However, when the base class in the hierarchy is not polymorphic, it does not introduce a VMT pointer. The VMT pointer is introduced by the first polymorphic class somewhere lower in the hierarchy. In that case a popular implementation approach is to insert the VMT pointer before the data introduced by the non-polymorphic (upper) part of the hierarchy. This is what you see in your second experiment. The memory layout for Derived looks as follows:

+------------------------------------+ <---- `this` value for `Derived` and below
| VMT pointer introduced by Derived  |
+------------------------------------+ <---- `this` value for `Base` and above
| Base data                          |
+------------------------------------+
| Derived data                       |
+------------------------------------+
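One way to observe this layout is to measure where the Base subobject sits inside a Derived object. This is a rough sketch, not a definitive test: base_offset_in_derived and print_layout are hypothetical helpers, struct is used so that the inheritance is public (the question uses private inheritance), and the exact numbers depend on the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Non-polymorphic base with the same data members as in the question.
struct Base {
    int x;
    int y;
    int z;
};

// First polymorphic class in the hierarchy.
struct Derived : Base {
    virtual void fun() {}
};

// Hypothetical helper: byte offset of the Base subobject inside a Derived object.
std::ptrdiff_t base_offset_in_derived() {
    Derived d;
    Base* pb = &d;  // derived-to-base conversion; the compiler may adjust the address
    // The reverse conversion must undo whatever adjustment was applied.
    assert(static_cast<Derived*>(pb) == &d);
    return reinterpret_cast<const char*>(pb) - reinterpret_cast<const char*>(&d);
}

// Prints the sizes of both classes and the measured offset.
void print_layout() {
    std::cout << "sizeof(Base)    = " << sizeof(Base) << '\n'
              << "sizeof(Derived) = " << sizeof(Derived) << '\n'
              << "offset of Base in Derived = " << base_offset_in_derived() << '\n';
}
```

In the question's output the two addresses differ by 4, which is consistent with a single 4-byte VMT pointer placed in front of the Base data on a 32-bit build. On other platforms the offset and padding may differ.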

Meanwhile, all classes in the non-polymorphic (upper) part of the hierarchy should know nothing about any VMT pointers: objects of type Base must begin with the data field Base::x. At the same time, all classes in the polymorphic (lower) part of the hierarchy must begin with the VMT pointer. To satisfy both requirements, the compiler is forced to adjust the object pointer value as it is converted up and down the hierarchy from one nested base subobject to another. This means that a pointer conversion across the polymorphic/non-polymorphic boundary is no longer purely conceptual: the compiler has to add or subtract some offset.

The subobjects from the non-polymorphic part of the hierarchy will share one this value, while the subobjects from the polymorphic part of the hierarchy will share their own, different this value.
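The two groups of this values can be made visible with a deeper hierarchy. The sketch below is an illustration under assumptions: the classes Top, Middle, FirstPoly and Bottom, the Addresses struct and both functions are invented for this example, and the grouping it shows is typical compiler behaviour, not a guarantee of the language.

```cpp
#include <iostream>

struct Top { int a = 0; };                    // non-polymorphic
struct Middle : Top { int b = 0; };           // still non-polymorphic
struct FirstPoly : Middle {                   // first polymorphic class
    virtual void fun() {}
    int c = 0;
};
struct Bottom : FirstPoly { int d = 0; };     // polymorphic by inheritance

// Addresses of every nested subobject of one Bottom object.
struct Addresses {
    const void* top;
    const void* middle;
    const void* first_poly;
    const void* bottom;
};

// Hypothetical helper: converts the object pointer to each base type in turn.
Addresses subobject_addresses(Bottom& obj) {
    Addresses result;
    result.top = static_cast<Top*>(&obj);
    result.middle = static_cast<Middle*>(&obj);
    result.first_poly = static_cast<FirstPoly*>(&obj);
    result.bottom = &obj;
    return result;
}

// Prints all four addresses so the two groups can be compared.
void print_groups() {
    Bottom obj;
    const Addresses a = subobject_addresses(obj);
    std::cout << "Top:       " << a.top << '\n'
              << "Middle:    " << a.middle << '\n'
              << "FirstPoly: " << a.first_poly << '\n'
              << "Bottom:    " << a.bottom << '\n';
}
```

With the layout described above, one would expect Top and Middle to print one address, and FirstPoly and Bottom to print another.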

Having to add or subtract some offset when converting pointer values along the hierarchy is not unusual: the compiler has to do it all the time when dealing with multiple-inheritance hierarchies. However, your example shows how it can arise in a single-inheritance hierarchy as well.
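For comparison, here is a small sketch of the multiple-inheritance case mentioned above. The types First, Second and Both and the helper second_base_offset are hypothetical names chosen for this example, and the exact offset depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

struct First { int a = 0; };
struct Second { int b = 0; };

// Both base subobjects cannot start at the same address, so a conversion to
// at least one of them has to adjust the pointer.
struct Both : First, Second { int c = 0; };

// Hypothetical helper: byte offset of the Second subobject inside a Both object.
std::ptrdiff_t second_base_offset() {
    Both obj;
    Second* ps = &obj;  // the compiler adds the offset of the Second subobject
    // Converting back subtracts the same offset again.
    assert(static_cast<Both*>(ps) == &obj);
    return reinterpret_cast<const char*>(ps) - reinterpret_cast<const char*>(&obj);
}

// Prints the measured offset.
void print_second_base_offset() {
    std::cout << "offset of Second in Both = " << second_base_offset() << '\n';
}
```

The adjustment here has the same nature as the one across the polymorphic/non-polymorphic boundary: the conversion is an ordinary implicit one in the source, but it compiles to pointer arithmetic.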

The addition/subtraction effect will also be revealed in a pointer conversion:

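// Assumes `Derived` inherits from `Base` publicly; with the private inheritance
// used in the question, this conversion is not accessible outside `Derived`
Derived *pd = new Derived;
Base *pb = pd;
// Numerical values of `pb` and `pd` are different if `Base` is non-polymorphic
// and `Derived` is polymorphic

Derived *pd2 = static_cast<Derived *>(pb);
// Numerical values of `pd` and `pd2` are the same

delete pd; // release the object allocated above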
answered Sep 28 '22 09:09 by AnT