If Derived adds no new members to Base (and is POD), then what kind of pointer casts, and dereferencing, can be safely done?

(This is another question about undefined behaviour (UB). If this code 'works' on some compiler, then that means nothing in the land of UB. That is understood. But exactly at what line below do we cross into UB?)

(There are a number of very similar questions on SO already, e.g. (1) but I'm curious what can be safely done with the pointers before dereferencing them.)

Start off with a very simple Base class. No virtual methods. No inheritance. (Maybe this can be extended to anything that's POD?)

struct Base {
        int first;
        double second;
};

And then a simple extension that adds (non-virtual) methods and doesn't add any members. No virtual inheritance.

struct Derived : public Base {
        int foo() { return first; }
        int bar() { return second; }
};

Then, consider the following lines. If there is some deviation from defined behaviour, I'd be curious to know which lines exactly. My guess is that we can safely perform many of the calculations on the pointers. Is it possible that some of these pointer calculations, if not fully defined, at least give us some sort of 'indeterminate/unspecified/implementation-defined' value that isn't entirely useless?

#include <iostream>
using namespace std;

void foo () {
    Base b;
    void * vp = &b;     // (1) Defined behaviour?
    cout << vp << endl; // (2) I hope this isn't a 'trap value'
    cout << &b << endl; // (3a) Prints the same as the last line?
                        // (3b) It has the 'same value' in some sense?
    Derived *dp = (Derived*)(vp);
                        // (4) Maybe this is an 'indeterminate value',
                        // but not fully UB?
    cout << dp << endl; // (5)  Defined behaviour also?  Should print the same value as &b

Edit: If the program ended here, would it be UB? Note that, at this stage, I have not attempted to do anything with dp, other than print the pointer itself to the output. If simply casting is UB, then I guess the question ends here.

                        // I hope the dp pointer still has a value,
                        // even if we can't dereference it
    if(dp == &b) {      // (6) True?
            cout << "They have the same value. (Whatever that means!)" << endl;
    }

    cout << &(b.second) << endl; // (7) this is definitely OK
    cout << &(dp->second) << endl; // (8)  Just taking the address. Is this OK?
    if( &(dp->second) == &(b.second) ) {      // (9) True?
            cout << "The members are stored in the same place?" << endl;
    }
}

I'm slightly nervous about (4) above. But I assume that it's always safe to cast to and from void pointers. Maybe the value of such a pointer can be discussed. But, is it defined to do the cast, and to print the pointer to cout?

(6) is important also. Will this evaluate to true?

In (8), we have the first time this pointer is being dereferenced (correct term?). But note that this line doesn't read from dp->second. It's still just an lvalue and we take its address. This calculation of the address is, I assume, defined by simple pointer arithmetic rules that we have from the C language?

If all of the above is OK, maybe we can prove that static_cast<Derived&>(b) is OK, and will lead to a perfectly usable object.

Aaron McDaid asked Nov 02 '13 12:11


1 Answer

  1. Casting from a data pointer to void * is always guaranteed to work, and the pointer is guaranteed to survive the roundtrip Base * -> void * -> Base * (C++11 §5.2.9 ¶13);
  2. vp is a valid pointer, so there shouldn't be any problem.
  3. a. Although printing pointers is implementation-defined (note 1), the printed values should be the same: in fact, operator<< is by default overloaded only for const void *, so when you write cout << &b you are converting to const void * anyway; i.e. what operator<< sees is in both cases &b cast to const void *.

    b. Yes, if we take the only sensible definition of "has the same value", i.e. it compares equal with the == operator; in fact, if you compare vp and &b with ==, the result is true, both if you convert vp to Base * (due to what we said in 1), and if you convert &b to void *.

    Both these conclusions follow from §4.10 ¶2, where it's specified that any pointer can be converted to void * (modulo the usual cv-qualified stuff), and the result «points to the start of the storage location where the object [...] resides» (note 2).

  4. This is tricky; that C-style cast is equivalent to a static_cast, which will happily allow casting a «"pointer to cv1 B" [...] to [...] "pointer to cv2 D", where D is a class derived from B» (§5.2.9 ¶11; there are some additional constraints, but they are satisfied here); but:

    If the prvalue of type “pointer to cv1 B” points to a B that is actually a subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.

    (emphasis added)

    So, here your cast is allowed, but the result is undefined...

  5. ... which leads us to printing its value; since the result of the cast is undefined, you may get anything. Pointers are probably allowed to have trap representations (at least in C99; I could find only sparse references to "traps" in the C++11 standard, but I think this behavior is probably inherited from C89), so you may even get a crash just by reading this pointer to print it via operator<<.

It follows that 6, 8 and 9 aren't meaningful either, because you are using an undefined result.

Also, even if the cast were valid, strict aliasing (§3.10 ¶10) would prevent you from doing anything meaningful with the pointed-to object, since aliasing a Base object via a Derived pointer is only allowed when the dynamic type of the Base object is actually Derived; anything that deviates from the exceptions specified in §3.10 ¶10 results in undefined behavior.
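
For contrast, here is a minimal sketch of the two cases that are well-defined: the Base * -> void * -> Base * round trip from point 1, and a static_cast down to Derived * when the Base really is a subobject of a Derived object. Base and Derived mirror the question's structs; the function names (round_trip_preserves_value, downcast_of_real_derived, run_checks) are made up for illustration.

```cpp
#include <cassert>

struct Base {
    int first;
    double second;
};

struct Derived : Base {
    int foo() { return first; }
};

// Base* -> void* -> Base*: the round trip restores the original pointer value.
bool round_trip_preserves_value() {
    Base b{1, 2.0};
    void* vp = &b;                        // implicit conversion to void*
    Base* back = static_cast<Base*>(vp);  // cast back to the ORIGINAL type
    return back == &b && back->first == 1;
}

// The downcast is fine when the Base is a subobject of a real Derived.
int downcast_of_real_derived() {
    Derived d;
    d.first = 42;
    d.second = 0.0;
    Base* bp = &d;                            // implicit derived-to-base conversion
    Derived* dp = static_cast<Derived*>(bp);  // OK: *bp lives inside d
    return dp->foo();
}

// Hypothetical helper that exercises both cases.
void run_checks() {
    assert(round_trip_preserves_value());
    assert(downcast_of_real_derived() == 42);
}
```

The only difference from the question's code is the dynamic type of the object: here the pointer handed to static_cast<Derived*> points into an object that was created as a Derived.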


Notes:

  1. operator<< delegates to num_put, which conceptually delegates to printf with %p, whose description boils down to "implementation defined".

  2. This rules out my fear that an evil implementation could in theory return different but equivalent values when casting to void *.
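
Point 3a can be checked with a small sketch that formats both pointers into strings instead of writing them to cout. The helper names (format_pointer, prints_identically, run_checks) are assumptions for illustration. The exact text produced is implementation-defined; the sketch only relies on both expressions going through the same const void * overload.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct Base {
    int first;
    double second;
};

// Formats a pointer the way `cout << p` would, but into a string.
std::string format_pointer(const void* p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Streams &b directly and via a void*, then compares the two strings.
bool prints_identically() {
    Base b{1, 2.0};
    void* vp = &b;
    std::ostringstream direct;
    direct << &b;  // Base* is implicitly converted to const void* here
    return direct.str() == format_pointer(vp);
}

// Hypothetical helper that runs the comparison.
void run_checks() {
    assert(prints_identically());
}
```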

Matteo Italia answered Sep 26 '22 01:09