
Why do I have to use a dynamic_cast here

I noticed that if I use a C-style cast (or reinterpret_cast) in the code below, I get a segmentation fault, but if I use a dynamic_cast, it works. Why is this? I'm sure that the pointer a is of type B, because the Add method already makes sure the input is of type B.

Do I have to use dynamic_cast here even though I already guarantee that pointer a is of type B through my implementation?

Edit:

I do realize it's bad practice in general to use C-style casts (or reinterpret_cast). But for THIS particular case, why do they not work?

This matters in practice: suppose class B is an interface and class D is forced, for some reason, to store a pointer of type A. A dynamic_cast then seems to be required even though the implementation already guarantees the type safety of the interface type.

#include <iostream>
#include <string>

using namespace std;

class A
{
    public:
    virtual ~A() = default;
};

class B
{
    public:
    virtual string F() = 0;
};

class C : public A, public B
{
    public:
    virtual ~C() = default;
    virtual string F() { return "C";}
};

class D
{
    public:

    D() : a(nullptr) {}

    void Add(B* b)
    {
        A* obj = dynamic_cast<A*>(b);
        if(obj != nullptr)
            a = obj;
    }

    B* Get()
    {
        return (B*)(a); // IF I USE DYNAMIC CAST HERE, IT'D BE OK
    }

    private:
    A* a;
};

int main()
{
    D d;
    d.Add(new C());

    B* b = d.Get();
    if(b != nullptr)
        cout << b->F();
}
jahithber asked Feb 07 '20 19:02


3 Answers

tl;dr: C-style casts are sneaky and can easily introduce bugs.

So what's happening in this expression?

class A
{
    public:
    virtual ~A() = default;
};

class B
{
    public:
    virtual string F() = 0;
};

B* Get()
{
    return (B*)(a);
}

Notice that A and B are unrelated.

What if you used a proper static_cast instead?

B* Get()
{
    return static_cast<B*>(a);
}

You'll then see a proper diagnostic:

error: invalid 'static_cast' from type 'A*' to type 'B*'
            return static_cast<B*>(a);
                   ^~~~~~~~~~~~~~~~~~

Oh no.

Indeed, C-style casts fall back to reinterpret_cast when a static one can't be done. So your code is equivalent to:

B* Get()
{
    return reinterpret_cast<B*>(a);
}

Which is not what you want. This is not the cast you're looking for.

The A subobject has a different address than the B subobject, mainly because each one needs room for its own vtable pointer.

What exactly is reinterpret_cast doing here?

Not a lot, really. It just tells the compiler to interpret the memory address it is given as a pointer to another type. That only works if an object of the type you ask for is actually alive at that address. In your case this is not true: there's an A object at that place, and the B part of your object is elsewhere in memory.

A static cast will adjust the pointer to make sure it points to the right offset in memory for that type, and fail to compile if it can't compute the offset.

C* c = new C();
cout << c;
cout << "\n";

A* a = dynamic_cast<A*>(c);
cout << a;
cout << "\n";

B* b = dynamic_cast<B*>(c);
cout << b;
cout << "\n";

This will print something similar to:

0xbe3c20
0xbe3c20
0xbe3c28
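To put the reinterpret_cast next to the casts that do adjust the pointer, here is a minimal sketch that reuses the question's A, B and C. The struct CastComparison and the function compare_casts are names I made up for illustration, and the reinterpreted pointer is only compared, never dereferenced.

```cpp
#include <cassert>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical result type: one flag per observation about the casts.
struct CastComparison {
    bool reinterpret_keeps_address;  // reinterpret_cast leaves the address untouched
    bool static_matches_dynamic;     // going through C agrees with dynamic_cast
    bool sidecast_moves_pointer;     // the real B subobject lives somewhere else
};

CastComparison compare_casts() {
    C object;
    A* a = &object;

    B* reinterpreted = reinterpret_cast<B*>(a);           // same bits, wrong type
    B* via_static = static_cast<B*>(static_cast<C*>(a));  // offset known at compile time
    B* via_dynamic = dynamic_cast<B*>(a);                 // offset found at run time
    assert(via_dynamic != nullptr);  // the object really is a C, so the sidecast succeeds

    const void* a_address = a;
    const void* reinterpreted_address = reinterpreted;
    const void* b_address = via_dynamic;

    return CastComparison{
        reinterpreted_address == a_address,
        via_static == via_dynamic,
        b_address != a_address
    };
}
```

The first two flags follow from how the casts are specified. The third depends on the object layout, but it should come out true on any mainstream implementation, since the A and B subobjects each need their own storage.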

What can you do then?

If you want to use static casts, you'll have to go through C since it's the only place the compiler can see the relationship between A and B:

B* Get()
{
    return static_cast<B*>(static_cast<C*>(a));
}

Or if you don't know if C is the runtime type of the object a is pointing to, then you must use dynamic_cast.
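Both options can be dropped into the question's class D. The following is only a sketch: GetAssumingC and call_through_d are hypothetical names that do not appear in the question, and the object is kept on the stack instead of being allocated with new.

```cpp
#include <cassert>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

class D {
public:
    void Add(B* b) {
        // Sidecast B* -> A*; yields nullptr if the object has no A base.
        if (A* obj = dynamic_cast<A*>(b))
            a = obj;
    }

    // Checked at run time; works for any stored object that also has a B base.
    B* Get() const { return dynamic_cast<B*>(a); }

    // Hypothetical variant: only valid if every stored object is known to be a C.
    B* GetAssumingC() const { return static_cast<B*>(static_cast<C*>(a)); }

private:
    A* a = nullptr;  // non-owning
};

// Hypothetical driver: stores a C in D and calls F() through the returned B*.
std::string call_through_d() {
    C object;
    D d;
    d.Add(&object);

    B* checked = d.Get();
    B* unchecked = d.GetAssumingC();
    assert(checked == unchecked);  // both casts land on the same B subobject

    return checked != nullptr ? checked->F() : std::string();
}
```

The static_cast version skips the run-time lookup, but it silently produces a bad pointer the moment something other than a C is stored, which is why the dynamic_cast version is the safer default.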

Guillaume Racicot answered Oct 11 '22 00:10


Please allow me to start by quoting a few lines of the question's code to establish context.

A* a;
return (B*)(a);

Why does a C-style cast fail?

When casting between pointers to unrelated object types, as here, a C-style cast has the same functionality as a reinterpret_cast plus the ability to cast away const and volatile. See below for why reinterpret_cast fails.

Why does a reinterpret_cast fail?

A reinterpret_cast tells the compiler to treat the expression as if it had the new type. The same bit pattern is used, just interpreted differently. This is a problem when dealing with your compound object.

The object in question is of type C, which is derived from both A and B. Neither A nor B has objects of zero size, which is a key factor. (The classes may look empty, but since they have virtual functions, each object of those classes contains a pointer to a virtual function table.) Here is one possible layout, where we assume the size of a pointer is 8:

----------------------------------
| C : | A : pointer to A's table |  <-- Offset 0
|     | B : pointer to B's table |  <-- Offset 8
----------------------------------

Your code starts with a pointer to C, which eventually gets stored as a pointer to A. With the above picture, these addresses happen to be numerically equal. So far, so good. Then you take this address and tell the compiler to believe it is a pointer to B, even though the B sub-object is offset by 8 bytes. So when you go to call b->F(), the program looks up the address of F in the virtual function table of A! Even if that happens to yield a valid function pointer, you are looking at a segmentation fault if the signature of that function does not match that of B::F. (In other words, expect a crash.)

On a more pedantic note, since A and B are unrelated types, using the pointer produced by your cast results in undefined behavior. The above merely explains what typically happens in this case, but technically the standard would allow the outcome "my computer exploded".

Why does a dynamic_cast work?

In short, dynamic_cast will add 8 to the pointer at the key time. What you are attempting is known as a "sidecast", which is one of the things dynamic_cast is designed to do. (It's 5b in cppreference's explanation of dynamic_cast.) The dynamic_cast will recognize that what a points to is really of type C (the most-derived type) and that C has an unambiguous base of type B. So the cast calculates the difference between the offsets of A and B within objects of C, and adjusts the pointer. The offset of B is 8, while the offset of A is 0, so the pointer is adjusted by 8-0, resulting in a valid pointer to B.

Once the pointer to B actually points to an object of type B, calling a virtual function of B works.
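The run-time check is what separates a sidecast from a blind reinterpretation, so here is a small sketch of both outcomes. OnlyA, side_cast and call_f are hypothetical names added for illustration; the question's classes are otherwise unchanged.

```cpp
#include <cassert>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical class, not in the question: has an A base but no B base.
class OnlyA : public A {};

// Sidecast: succeeds only if the most-derived object also has an
// unambiguous, public B base; otherwise the result is nullptr.
B* side_cast(A* a) {
    return dynamic_cast<B*>(a);
}

// Hypothetical helper: calls F() when the sidecast succeeds and
// reports the failure otherwise.
std::string call_f(A* a) {
    assert(a != nullptr);
    B* b = side_cast(a);
    return b != nullptr ? b->F() : std::string("not a B");
}
```

Passing a C yields a usable B*, while passing an OnlyA yields nullptr instead of a pointer into the wrong virtual function table.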

Using static_cast to go from C* to B* works similarly, but of course, you don't have a C* to work with in this case.

JaMiT answered Oct 10 '22 23:10


C-style casts are very dangerous, which is why C++ has static_cast and reinterpret_cast (as well as dynamic_cast, which has no C-style equivalent).

reinterpret_cast is just as dangerous as a C-style cast: it will simply take the address in your B* and give you the same address as an A*, which is NOT what you want.

static_cast requires that the source and destination types are related. You can't simply static_cast a B* to an A* because they are unrelated. Beyond that it doesn't do any checking; it just applies a simple fixed mathematical rule to the address.

You could static_cast to C* and then to A*, and that would be legal and safe so long as you are certain that your object is a C. Otherwise it will go horribly wrong: even if the other object has both an A and a B element, they may sit at different offsets if it has other elements as well, and the fixed math will give a wrong answer.
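The "fixed math" depends on the class you go through, and the following sketch tries to make that visible without invoking any undefined behavior. E is a hypothetical class that is not in the question (same bases as C, declared in the opposite order), and b_offset_in is a made-up helper that measures where the B subobject sits inside a complete object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical class, not in the question: same bases as C, opposite order.
class E : public B, public A {
public:
    std::string F() override { return "E"; }
};

// Byte offset of the B subobject inside a complete object of type Derived.
// This is the constant a static_cast<B*>(Derived*) adds to the pointer.
template <class Derived>
std::ptrdiff_t b_offset_in() {
    Derived object;
    const char* whole = reinterpret_cast<const char*>(&object);
    const char* b_part = reinterpret_cast<const char*>(static_cast<B*>(&object));
    assert(b_part >= whole);  // a subobject cannot start before its enclosing object
    return b_part - whole;
}
```

On a typical 64-bit implementation b_offset_in<C>() would be 8 and b_offset_in<E>() would be 0, though the exact values are up to the compiler. So a static_cast through C applied to an object that is really an E would shift the pointer by the wrong amount.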

dynamic_cast effectively asks the object itself to help. The cast is resolved using the run-time type information of the most-derived object, here a C, which knows about both its A and B bases. If the object were of a different most-derived class, that class's type information would resolve the appropriate answer.

Gem Taylor answered Oct 11 '22 00:10