Creating an invalid reference via reinterpret cast

I am trying to determine whether the following code invokes undefined behavior:

#include <iostream>

class A;

void f(A& f)
{
  char* x = reinterpret_cast<char*>(&f);
  for (int i = 0; i < 5; ++i)
    std::cout << x[i];
}

int main(int argc, char** argv)
{
  A* a = reinterpret_cast<A*>(new char[5]);
  f(*a);
}

My understanding is that reinterpret_casts to and from char* are compliant because the standard permits aliasing with char and unsigned char pointers (emphasis mine):

If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.
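As a minimal sketch of what the last bullet permits, the helper below reads the bytes of an existing object through an unsigned char lvalue. The name object_bytes is hypothetical and is not part of the question's code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies the object representation of `value` by reading
// it one byte at a time through an unsigned char lvalue. The aliasing rule
// quoted above allows this for an object of any type T.
template <typename T>
std::vector<unsigned char> object_bytes(const T& value)
{
  const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
  return std::vector<unsigned char>(p, p + sizeof(T));
}
```

Calling object_bytes(some_int) yields sizeof(int) bytes. Their values depend on the platform's representation of int, so only the count is portable.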

However, I am not sure whether f(*a) invokes undefined behavior by binding an A& reference to the result of dereferencing the invalid pointer. The deciding factor seems to be what the phrase "attempts to access" means in the context of the C++ standard.

My intuition is that this does not constitute an access, since an access would require A to be defined (it is declared, but not defined, in this example). Unfortunately, I cannot find a concrete definition of "access" in the C++ standard.

Does f(*a) invoke undefined behavior? What constitutes "access" in the C++ standard?

I understand that, regardless of the answer, it is likely a bad idea to rely on this behavior in production code. I am asking this question primarily out of a desire to improve my understanding of the language.

[Edit] @SergeyA cited this section of the standard. I've included it here for easy reference (emphasis mine):

5.3.1/1 [expr.unary.op]

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [Note: indirection through a pointer to an incomplete type (other than cv void) is valid. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to a prvalue, see 4.1. — end note ]
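The note can be illustrated with a short sketch. The names Widget, read_value, and pass_along are hypothetical. Unlike the code in the question, a real Widget object is assumed to exist at the pointed-to address, so this only shows that indirection on an incomplete type compiles and can initialize a reference. It does not settle whether the question's f(*a) is defined.

```cpp
struct Widget;                // declared but not defined: incomplete here

int read_value(Widget& w);    // defined below, once Widget is complete

// *p is applied to a pointer to an incomplete type. The resulting lvalue is
// used only to initialize the reference parameter, so no lvalue-to-rvalue
// conversion is needed at this point.
int pass_along(Widget* p)
{
  return read_value(*p);
}

struct Widget { int value; }; // Widget is complete from here on

int read_value(Widget& w)
{
  return w.value;             // the stored value is read here
}
```

Writing something like `Widget copy = *p;` inside pass_along would be rejected by the compiler, because Widget is still incomplete at that point.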

Tracing the reference to 4.1, we find:

4.1/1 [conv.lval]

A glvalue (3.10) of a non-function, non-array type T can be converted to a prvalue. If T is an incomplete type, a program that necessitates this conversion is ill-formed. If T is a non-class type, the type of the prvalue is the cv-unqualified version of T. Otherwise, the type of the prvalue is T.

When an lvalue-to-rvalue conversion is applied to an expression e, and either:

  • e is not potentially evaluated, or
  • the evaluation of e results in the evaluation of a member ex of the set of potential results of e, and ex names a variable x that is not odr-used by ex (3.2)

the value contained in the referenced object is not accessed.

I think our answer lies in whether *a satisfies the second bullet point. I am having trouble parsing that condition, so I am not sure.

asked Apr 06 '16 17:04 by Michael Koval


1 Answer

char* x = reinterpret_cast<char*>(&f); is valid. More specifically, access through x is allowed; the cast itself is always valid.

A* a = reinterpret_cast<A*>(new char[5]); is not valid. To be precise, the cast itself is fine, but access through a will trigger undefined behaviour.

The reason is that while it is OK to access an object through a char*, it is not OK to access an array of chars through a pointer to some unrelated object type. The Standard allows the first, but not the second.

Or, in layman's terms, you can alias a type* through a char*, but you can't alias a char* through a type*.
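When the goal is to read a value out of a char buffer, a commonly used alternative that avoids the disallowed direction is to copy the bytes into a real object with std::memcpy. The sketch below assumes a trivially copyable T and a buffer holding at least sizeof(T) bytes. The name read_from_bytes is hypothetical.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical helper: builds a T from raw bytes by copying them into an
// actual T object, instead of dereferencing a T* that points into a char array.
// Assumes `bytes` points to at least sizeof(T) readable bytes.
template <typename T>
T read_from_bytes(const char* bytes)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy is only appropriate for trivially copyable types");
  T value;
  std::memcpy(&value, bytes, sizeof(T));
  return value;
}
```

Here the only lvalue of type T that is ever read is value, which really is a T, so the aliasing rule is not involved.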

EDIT

I just noticed I didn't answer the direct question ("What constitutes 'access' in the C++ standard?"). As far as I can tell, the Standard does not define "access" (at least, I was not able to find a formal definition), but dereferencing the pointer is commonly understood to qualify as access.

answered Sep 19 '22 09:09 by SergeyA