
How do upcasting and vtables work together to ensure correct dynamic binding?

So, vtable is a table maintained by the compiler which contains function pointers that point to the virtual functions in that class.

and

Assigning a derived class's object to an ancestor class's object is called up-casting.

Up-casting is handling a derived class instance/object through a base class pointer or reference; the objects are not "assigned to", which would imply overwriting a value à la an operator= invocation.
(Thanks to: Tony D)
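A minimal sketch of the difference between up-casting and assignment by value. The names `Animal`, `Dog`, `via_reference` and `via_copy` are hypothetical and only serve the illustration.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string sound() const { return "generic"; }
};

struct Dog : Animal {
    std::string sound() const override { return "woof"; }
};

// Up-casting: the Dog object is only viewed through a base reference,
// so the call is still dispatched to Dog::sound.
std::string via_reference(const Animal& a) { return a.sound(); }

// Passing by value copy-constructs a new Animal from the Dog's base part
// (object slicing), so the copy really is an Animal.
std::string via_copy(Animal a) { return a.sound(); }
```

Called with a `Dog`, `via_reference` yields "woof" while `via_copy` yields "generic", which is why "assigned to" is the wrong mental model for up-casting.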

Now, how it is known at run time "which" class's virtual function is supposed to be called?

Which entry in vtable refers to the function of "particular" derived classes which is supposed to be called at run time?

Aquarius_Girl asked Nov 05 '14 06:11



2 Answers

You can imagine (although the C++ specification doesn't say this) that the vtable consists of an identifier (or some other metadata that can be used to "find more information" about the class itself) and a list of function pointers.

So, if we have a class like this:

    class Base {
    public:
        virtual void func1();
        virtual void func2(int x);
        virtual std::string func3();
        virtual ~Base();
        // ... some other stuff we don't care about ...
    };

The compiler will then produce a VTable something like this:

    struct VTable_Base {
        int identifier;
        void (*func1)(Base* this);
        void (*func2)(Base* this, int x);
        std::string (*func3)(Base* this);
        ~Base(Base* this);
    };

The compiler will then create an internal structure, something like this (this is not possible to compile as C++; it just shows what the compiler actually does, and I call it SBase to differentiate it from the actual class Base):

    struct SBase {
        VTable_Base* vtable;
        inline void func1(Base* this) { vtable->func1(this); }
        inline void func2(Base* this, int x) { vtable->func2(this, x); }
        inline std::string func3(Base* this) { return vtable->func3(this); }
        inline ~Base(Base* this) { vtable->~Base(this); }
    };

It also builds the real vtable:

    VTable_Base vtable_base = {
        1234567, &Base::func1, &Base::func2, &Base::func3, &Base::~Base
    };

And in the constructor for Base, it will set vtable = &vtable_base;.

When we then add a derived class in which we override one function (and, implicitly, the destructor, even if we don't declare one):

    class Derived : public Base {
    public:
        virtual void func2(int x) override;
    };

The compiler will now make this structure:

    struct VTable_Derived {
        int identifier;
        void (*func1)(Base* this);
        void (*func2)(Base* this, int x);
        std::string (*func3)(Base* this);
        ~Derived(Derived* this);   // same slot position as ~Base in VTable_Base
    };

and then does the same "structure" building:

    struct SDerived {
        VTable_Derived* vtable;
        inline void func1(Base* this) { vtable->func1(this); }
        inline void func2(Base* this, int x) { vtable->func2(this, x); }
        inline std::string func3(Base* this) { return vtable->func3(this); }
        inline ~Derived(Derived* this) { vtable->~Derived(this); }
    };

We need this structure for when we are using Derived directly rather than through the Base class.

(We rely on the compiler chaining ~Derived to call ~Base too, just like normal destructors in an inheritance hierarchy.)

And finally, we build an actual vtable:

    VTable_Derived vtable_derived = {
        7654339, &Base::func1, &Derived::func2, &Base::func3, &Derived::~Derived
    };

And again, the Derived constructor will set Derived::vtable = &vtable_derived for all instances.
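One observable consequence of the constructors installing the table one level at a time: a virtual call made while the Base constructor is running resolves to Base's version, because the Derived part has not been constructed yet. The following is a small stand-alone sketch; its `Base` and `Derived` are simplified stand-ins for the classes above, and `seen_in_ctor` and `name` are hypothetical.

```cpp
#include <string>

struct Base {
    std::string seen_in_ctor;
    // While this constructor runs, the object is still "only a Base";
    // on typical implementations the vptr refers to Base's table here.
    Base() { seen_in_ctor = name(); }
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    // After Base() has finished, the Derived constructor makes the
    // object dispatch to Derived's overriders.
    std::string name() const override { return "Derived"; }
};
```

For a `Derived d;`, `d.seen_in_ctor` holds "Base", while `d.name()` called afterwards returns "Derived".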

Edit to answer question in comments: The compiler has to carefully lay out the components of both VTable_Derived and SDerived so that they match VTable_Base and SBase; then, when we have a pointer to Base, Base::vtable and Base::funcN() line up with Derived::vtable and Derived::funcN(). If they don't match up, the inheritance won't work.

If new virtual functions are added to Derived, they must then be placed after the ones inherited from Base.

End Edit.
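The layout rule from the edit can be imitated with hand-written structs. This is only a sketch under the assumption of single inheritance; real compilers follow their ABI, and every name here (`BaseSlots`, `DerivedSlots`, `func4`, the `impl_` functions) is hypothetical.

```cpp
#include <string>

// Slots every "Base" table has, in a fixed order.
struct BaseSlots {
    std::string (*func1)();
    std::string (*func2)();
};

// The "Derived" table starts with exactly the same slots and appends
// its new virtual function after them.
struct DerivedSlots {
    BaseSlots inherited;
    std::string (*func4)();
};

std::string impl_base_func1()    { return "Base::func1"; }
std::string impl_derived_func2() { return "Derived::func2"; }
std::string impl_derived_func4() { return "Derived::func4"; }

const DerivedSlots derived_slots = {
    { &impl_base_func1, &impl_derived_func2 },
    &impl_derived_func4
};

// Code that only knows the base layout can still use the leading slots.
std::string call_second_slot(const BaseSlots* t) { return t->func2(); }
```

Because the inherited slots come first, `call_second_slot(&derived_slots.inherited)` reaches `Derived::func2` without knowing that `func4` exists.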

So, when we do:

    Base* p = new Derived;
    p->func2(42);

the code will look up SBase::func2, which will call the correct Derived::func2, because the actual vtable behind p->vtable is vtable_derived (as set by the Derived constructor, which is called as part of new Derived).
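The pseudocode above cannot be compiled, so here is a compilable approximation of the same idea using plain structs and function pointers. It is a sketch of the mechanism, not what any particular compiler emits; `Object`, `VTable`, `class_name`, `make_base`, `make_derived` and `dispatch_func2` are all hypothetical names.

```cpp
#include <string>

struct Object;  // forward declaration for the function pointer types

struct VTable {
    const char* class_name;  // stands in for the "identifier"
    std::string (*func1)(const Object* self);
    std::string (*func2)(const Object* self, int x);
};

struct Object {
    const VTable* vtable;    // hand-written "vptr"
};

std::string base_func1(const Object*) { return "Base::func1"; }
std::string base_func2(const Object*, int x) {
    return "Base::func2(" + std::to_string(x) + ")";
}
std::string derived_func2(const Object*, int x) {
    return "Derived::func2(" + std::to_string(x) + ")";
}

// One table per "class"; the derived table reuses the inherited func1.
const VTable vtable_base    = { "Base",    &base_func1, &base_func2 };
const VTable vtable_derived = { "Derived", &base_func1, &derived_func2 };

// The "constructors" only have to store the right table pointer.
Object make_base()    { return Object{ &vtable_base }; }
Object make_derived() { return Object{ &vtable_derived }; }

// Roughly what p->func2(x) boils down to: load the table from the
// object, then call through the slot for func2.
std::string dispatch_func2(const Object* p, int x) {
    return p->vtable->func2(p, x);
}
```

`dispatch_func2` never checks which class it was given; the table pointer stored in the object decides which function runs.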

Mats Petersson answered Jan 03 '23 10:01


I'll take a different route from the other answers and try to fill just the specific gaps in your knowledge, without going very much into the details. I'll address the mechanics just enough to help your understanding.

So, vtable is a table maintained by the compiler which contains function pointers that point to the virtual functions in that class.

The more precise way to say this is as follows:

Every class with virtual methods, including every class that inherits from a class with virtual methods, has its own virtual table. The virtual table of a class points to the virtual methods specific to that class, i.e. either inherited methods, overridden methods or newly added methods. Every instance of such a class contains a pointer to the virtual table that matches the class.
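The C++ standard does not mention vtables, but it does offer a portable way to ask whether a class declares or inherits virtual functions: `std::is_polymorphic`. A small sketch, with hypothetical class names `Plain`, `Shape` and `Circle`:

```cpp
#include <type_traits>

struct Plain { void f() {} };  // no virtual functions

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const { return 0.0; }
};

struct Circle : Shape {        // declares no virtual function itself
    double r = 1.0;
};

static_assert(!std::is_polymorphic<Plain>::value,
              "no virtual functions, so no virtual table is needed");
static_assert(std::is_polymorphic<Shape>::value,
              "declares virtual functions");
static_assert(std::is_polymorphic<Circle>::value,
              "inherits virtual functions from Shape");
```

On implementations that use vtables, the polymorphic classes are the ones that get a table and whose instances carry a pointer to it.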

Up-casting is handling a derived class instance/object using a base class pointer or reference; (...)

Perhaps more enlightening:

Up-casting means that a pointer or reference to an instance of class Derived is treated as if it were a pointer or reference to an instance of class Base. The instance itself, however, is still purely an instance of Derived.

(When a pointer is "treated as a pointer to Base", that means that the compiler generates code for dealing with a pointer to Base. In other words, the compiler and the generated code know no better than that they are dealing with a pointer to Base. Hence, a pointer that is "treated as" a pointer to Base has to point to an object that offers at least the same interface as instances of Base. This happens to be the case for Derived because of inheritance. We'll see how this works out below.)
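That the object itself stays an instance of the derived class can be checked with `typeid`, which reports the dynamic type when applied to a reference to a polymorphic object. A minimal sketch; `Vehicle`, `Car` and `is_really_a_car` are hypothetical names.

```cpp
#include <typeinfo>

struct Vehicle { virtual ~Vehicle() = default; };
struct Car : Vehicle {};

// The static type of the parameter is Vehicle, but typeid looks at the
// object the reference is bound to and reports its dynamic type.
bool is_really_a_car(const Vehicle& v) { return typeid(v) == typeid(Car); }
```

Passing a `Car` through a `Vehicle*` or `Vehicle&` still yields `true` here; only a genuine `Vehicle` object yields `false`.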

At this point we can answer the first version of your question.

Now, how it is known at run time "which" class's virtual function is supposed to be called?

Suppose we have a pointer to an instance of Derived. First we upcast it, so it is treated as a pointer to an instance of Base. Then we call a virtual method upon our upcasted pointer. Since the compiler knows that the method is virtual, it knows to look for the virtual table pointer in the instance. While we are treating the pointer as if it points to an instance of Base, the actual object has not changed value and the virtual table pointer within it is still pointing to the virtual table of Derived. So at runtime, the address of the method is taken from the virtual table of Derived.

Now, the particular method may be inherited from Base or it might be overridden in Derived. It does not matter; if inherited, the method pointer in the virtual table of Derived simply contains the same address as the corresponding method pointer in the virtual table of Base. In other words, both tables are pointing to the same method implementation for that particular method. If overridden, the method pointer in the virtual table of Derived differs from the corresponding method pointer in the virtual table of Base, so method lookups on instances of Derived will find the overridden method while lookups on instances of Base will find the original version of the method — regardless of whether a pointer to the instance is treated as a pointer to Base or a pointer to Derived.
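A short sketch of the two cases, one inherited method and one overridden method, called through a base reference. `Logger`, `FileLogger`, `prefix`, `target` and `describe` are hypothetical names chosen for the example.

```cpp
#include <string>

struct Logger {
    virtual ~Logger() = default;
    virtual std::string prefix() const { return "[log]"; }   // inherited as-is
    virtual std::string target() const { return "stderr"; }  // overridden below
};

struct FileLogger : Logger {
    std::string target() const override { return "file"; }
};

// Both calls go through whatever table the object itself points to.
std::string describe(const Logger& l) { return l.prefix() + " -> " + l.target(); }
```

For a `FileLogger`, `describe` returns "[log] -> file": the inherited `prefix` resolves to the same implementation as in `Logger`, while `target` resolves to the override. For a plain `Logger` it returns "[log] -> stderr".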

Finally, it should now be straightforward to explain why the second version of your question is a bit misguided:

Which entry in vtable refers to the function of "particular" derived classes which is supposed to be called at run time?

This question presupposes that vtable lookups are first by method and then by class. It is the other way round: first, the vtable pointer in the instance is used to find the vtable for the right class. Then, the vtable for that class is used to find the right method.
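The "class first, then method" order shows up when one call site handles objects of several classes. A minimal sketch with hypothetical names (`Instrument`, `Piano`, `Drum`, `play_all`):

```cpp
#include <memory>
#include <string>
#include <vector>

struct Instrument {
    virtual ~Instrument() = default;
    virtual std::string play() const = 0;
};

struct Piano : Instrument {
    std::string play() const override { return "plink"; }
};

struct Drum : Instrument {
    std::string play() const override { return "boom"; }
};

// The call site is identical for every element. Each object's own table
// pointer selects the class; the slot for play() is then read from it.
std::string play_all(const std::vector<std::unique_ptr<Instrument>>& band) {
    std::string out;
    for (const auto& i : band) {
        out += i->play() + " ";
    }
    return out;
}
```

There is no single table with one entry per derived class; each element brings its own table along, so `play_all` needs no knowledge of `Piano` or `Drum`.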

Julian answered Jan 03 '23 12:01