It is well known that "Virtuals are resolved at run time only if the call is made through a reference or pointer." So I was surprised to find that dereferencing a pointer also preserves dynamic binding.
#include <iostream>
using namespace std;
struct B {
virtual void say() { cout << "Hello B" << endl; }
};
struct D : B {
void say() override { cout << "Hello D" << endl; }
};
int main() {
D *ptr = new D();
B *p = ptr;
(*p).say();
return 0;
}
The output is
Hello D
Question: What does the compiler do with the dereference operator *?
I thought dereferencing was resolved at compile time. So when the compiler dereferences the pointer p, it should assume that p points to an object of type B. For example, the following code
D temp = (*p);
complains
error: no viable conversion from 'B' to 'D'
In computer programming, the dereference operator, also known as the indirection operator, operates on a pointer variable. It yields an l-value that designates the object at the address stored in the pointer. In C and C++, the dereference operator is denoted with an asterisk (*).
Dereferencing is used to access or manipulate the data stored at the memory location a pointer points to. Applying * to a pointer variable refers to the object being pointed to, which is why this is called dereferencing the pointer.
On the surface of it, this is an interesting question, because absent an overload of unary *, dereferencing results in an lvalue B, not a reference type. However, even starting to go down this line of reasoning is a red herring: expressions never have reference types, as the reference is immediately dropped and determines the value category. In that sense, the unary * operator is very much like a function returning a reference.
In fact, the answer is that your initial assertion is incorrect: dynamic dispatch does not at all rely on references or pointers. It is references and pointers that enable you to prevent slicing, but once you have some expression referring to your polymorphic object, any old function call will do.
Also consider:
#include <iostream>
struct Base
{
virtual void foo() { std::cout << "Base::foo()\n"; }
void bar() { foo(); }
};
struct Derived : Base
{
virtual void foo() { std::cout << "Derived::foo()\n"; }
};
int main()
{
Derived d;
d.bar(); // output: "Derived::foo()"
}
The dereferencing/indirection operator * doesn't itself actually do anything.
For example, when you write just *p; the compiler may ignore this line if p is just a pointer.
What the * does is change the semantics of read and write:
int i = 42;
int* p = &i;
*p = 0;
p = 0;
The *p = 0 means: write to the object p points to.
Note that in C++, an object is a region of storage.
Similarly,
auto x = p; // copies the address
auto y = *p; // copies the value
Here, the read from *p means: read the value of the object p points to.
The value category of *p only determines which operations the C++ language allows on expressions of the form *p.
References are really just pointers with syntactic sugar.
So trying to explain what *p does by using references is circular reasoning.
Let's consider slightly changed classes:
class Base
{
private:
int b = 21;
public:
virtual void say() { std::cout << "Hello B(" <<b<< ")\n"; }
};
class Derived : public Base
{
private:
int d = 1729;
public:
virtual void say() { std::cout << "Hello D(" <<d<< ")\n"; }
};
Derived d;
Derived *pd = &d;
Base* pb = pd;
One weird, but I think allowed memory layout looks like this:
$$2d graphics mode$$

+-Derived-------------+
|    +-Base---+----+  |
| d  | vtable | b  |  |
|    +--------+----+  |
+----^----------------+
^    |
|    pb
pd

$$1d graphics mode$$

name    # /../ |d       |vtable          |b       |
address # /../ |0 1 2 3 |4 5 6 7 8 9 1011|12131415|16
                ^        ^
                |        |
                pd       pb

pd == some address
pb == pd + 4 byte
When we convert from Derived* to Base*, the compiler knows the offset of the Base subobject inside a Derived object, and can compute the address value for this subobject.
The vtable pointer is stored, for single nonvirtual inheritance, in the least derived type that has a virtual function. It is changed by derived classes roughly as seen in this implementation/simulation.
When we now call
pb->say()
which is defined in the C++ Standard as
(*pb).say()
the compiler knows from the type of pb (which is Base*) that we call a virtual function.
Therefore, (*pb).say() means: look up the entry for say in the vtable of the object pb points to, and call it.
The part of the object that pb points to contains the vtable pointer, and that is what allows polymorphism.
On the other hand, when we copy
Base b = *pb;
What happens is that the vtable pointer is not copied.
Copying it would be dangerous, because Derived::say might try to access Derived::d.
But this data member isn't available in an object of type Base, which we're currently creating (in the copy ctor of Base).
After doing some research, I think I have a reasonable (at least to me) answer for this question to share.
Assumptions (excerpted or paraphrased from the book "C++ Primer 5th"):
(*p) returns the object to which p points.
D: public B logically has two parts: one is a sub-object of class B, and the other holds the members of class D. (This explains "slicing".)
The virtual mechanism of C++ that I use to support this answer comes from the article "12.5 The Virtual Table". It convinces me at least. Below is a figure that conceptually shows the *__vptr and the VTables of the code in our question.
D obj_d;
D* ptr = &obj_d; // ptr is a pointer to type D,
// and points to obj_d, an object of type D
B* p = ptr; // p is a pointer to type B and p points to the B subobject of obj_d.
(*p).say();
Since p is a pointer to type B, (*p) returns an object of type B, i.e., the B sub-object of (*ptr). Name this object of type B obj_b.
However, the *__vptr of obj_b points to the VTable of D. Thus, when say() is called on it, the entry for say() in the VTable of D points to the method that prints "Hello D".
(&(*p))->say(); // outputs "Hello D"
When a method is called on an object x, whether polymorphism (dynamic binding of class members) happens depends on which VTable the *__vptr of that object points to.
If we write B obj_x(*p); (&obj_x)->say(); the output is "Hello B". This is because obj_x is a newly constructed object of type B, created by the synthesized copy constructor of struct B. Thus, the *__vptr of obj_x points to the VTable of B.
Thanks to the help from dyp, we have a simulation of the virtual dispatch in this question. In case the webpage is removed by Coliru, I stored the code here.