Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can you access the object representation of any object through a char*?

I have stumbled upon a reddit thread in which a user has found an interesting detail of the C++ standard. The thread has not spawned much constructive discussion, therefore I will retell my understanding of the problem here:

  • OP wants to reimplement memcpy in a standard-compliant way
  • They attempt to do so by using reinterpret_cast<char*>(&foo), which is an allowed exception to the strict aliasing restrictions, in which reinterpreting as char is allowed to access the "object representation" of an object.
  • [expr.reinterpret.cast] says that doing so results in static_­cast<cv T*>(static_­cast<cv void*>(v)), so reinterpret_cast in this case is equivalent to static_cast'ing first to void * and then to char *.
  • [expr.static.cast] in combination with [basic.compound]

A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. [...] if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. [...] [emphasis mine]

Consider now the following union class:

union Foo{
    char c;
    int i;
};
// the OP has used union, but iiuc,
// it can also be a struct for the problem to arise.

OP has thus come to the conclusion that reinterpreting a Foo* as char* in this case yields a pointer pointing to the first char member of the union (or its object representation), rather than to the object representation of the union itself, i.e. it points only to the member. While this appears superficially to be the same, and corresponds to the same memory address, the standard seems to differentiate between the "value" of a pointer and its corresponding address, in that on the abstract C++ machine, a pointer belongs to a certain object only. Incrementing it beyond that object (compare with end() of an array) is undefined behavior.

OP thus argues that if the standard forces the char* to be associated with the objects's first member instead of the object representation of the whole union object, dereferencing it after one incrementation is UB, which allows a compiler to optimize as if it were impossible for the resultant char* to ever access the following bytes of the int member. This implies that it is not possible to legally access the complete object representation of a class object which is pointer-interconvertible with a char member.

The same would, if I understand correctly apply if "union" was simply replaced with "struct", but I have taken this example from the original thread.

What do you think? Is this a standard defect? Is it a misinterpretation?

like image 721
JMC Avatar asked Sep 03 '20 12:09

JMC


1 Answers

This video, linked in the comments (now chat) by @KonradRudolph is likely the answer to the problem.

At around the 40min mark, Timur Doumler, who is a member of the ISO C++ commitee, discusses the possibility of accessing byte representations. The summary is that any attempt of accessing byte representation except memcpy is UB. The situation in the OP does not even arise without making use of UB because the very act of using a pointer to an object like an array, or doing any pointer arithmetic on it is UB, as these operations are only well-defined when dealing with actual array objects, as far as the abstract machine is concerned.

Also, while reinterpreting a pointer as a char* does not on its own violate aliasing rules, there is technically no guarantee that the resulting char* will point to the first byte of the object.

The only legal way of accessing byte representations is to memcpy the object into a char array. This means that reimplementing memcpy is impossible.

Timur Doumler additionally describes this as a wording defect that will hopefully be fixed in C++23 and presents a paper that proposes a fix to this.

like image 151
JMC Avatar answered Nov 01 '22 19:11

JMC