
C aliasing rules and memcpy

While answering another question, I thought of the following example:

void *p;
unsigned x = 17;

assert(sizeof(void*) >= sizeof(unsigned));
*(unsigned*)&p = 17;        // (1)
memcpy(&p, &x, sizeof(x));  // (2)

Line (1) breaks the aliasing rules. Line (2), however, is OK with respect to the aliasing rules. The question is: why? Does the compiler have special built-in knowledge about functions such as memcpy, or are there other rules that make memcpy OK? Is there a way of implementing memcpy-like functions in standard C without breaking the aliasing rules?

zvrba asked Jul 18 '10 11:07


1 Answer

The C Standard is quite clear on this. The effective type of the object named by p is void*, because that is its declared type (see 6.5/6). The aliasing rules in C99 apply to both reads and writes, so the write to a void* object through an lvalue of type unsigned in (1) is undefined behavior according to 6.5/7.

In contrast, the memcpy in (2) is fine, because an lvalue of character type, such as one reached through an unsigned char*, may access any object (6.5/7). The Standard specifies memcpy in 7.21.1/3 and 7.21.2.1/2 as

For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

However, if p is used afterwards, that use might cause undefined behavior, depending on the bit pattern that was stored. If no such use happens, the code is fine in C.
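As a minimal sketch of the character-type rule in 6.5/7, a memcpy-like function can be written in standard C by touching every byte through unsigned char. The names byte_copy and round_trip are hypothetical and not part of the question or the Standard. The helper assumes, as the question's assert does, that a void* is at least as large as an unsigned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name: a memcpy-like routine written in portable C.
   Every access goes through an lvalue of type unsigned char, which
   6.5/7 allows to access any object. Like memcpy, it does not
   support overlapping ranges. */
void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* Hypothetical helper mirroring line (2) of the question: store the
   bytes of x in the void* object, then copy the same bytes back out
   into an unsigned. p itself is never read as a void*. */
unsigned round_trip(unsigned x)
{
    void *p;
    unsigned y;

    assert(sizeof p >= sizeof x);   /* same assumption as the question */
    byte_copy(&p, &x, sizeof x);
    byte_copy(&y, &p, sizeof y);
    return y;
}
```

Calling round_trip(17u) returns 17, because only the bytes that were written are read back and p is never interpreted as a pointer. As far as the aliasing rules are concerned, this suggests that no special compiler knowledge about memcpy is needed.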


As for the C++ Standard, which in my opinion is far from clear on the issue, I think the following holds. Please don't take this interpretation as the only possible one: the vague and incomplete specification leaves a lot of room for speculation.

Line (1) is problematic because the alignment of &p might not be suitable for the unsigned type. The write changes the type of the object stored in p to unsigned int. As long as you don't access that object later through p, the aliasing rules are not broken, but the alignment requirements might still be.

Line (2), however, has no alignment problems and is thus valid, as long as you don't access p afterwards as a void*, which might cause undefined behavior depending on how the void* type interprets the stored bit pattern. I don't think the type of the object is changed by the copy.

There is a long GCC bug report that also discusses the implications of a write through a pointer that resulted from such a cast, and how that differs from placement new (people on that list do not agree on what the difference is).

Johannes Schaub - litb answered Oct 27 '22 23:10