
Does this code violate the strict aliasing rule?

Questions:

  1. Does this code below violate strict aliasing rules? That is, would a smart compiler be allowed to print 00000 (or some other nasty effect), because a buffer first accessed as other type is then accessed via int*?

  2. If not, would moving just the definition and initialization of ptr2 before the braces (so ptr2 would already be defined when ptr1 comes into scope) break it?

  3. If not, would removing the braces (so ptr1 and ptr2 were in the same scope) break it?

  4. If yes, how could the code be fixed?

Bonus question: If the code is OK, and neither 2. nor 3. breaks it, how could it be changed so that it would break strict aliasing rules (for example, by converting the braced loop to use int16_t)?


int i;
void *buf = calloc(5, sizeof(int)); // buf initialized to 0

{
    char *ptr1 = buf;    
    for(i = 0; i < 5*sizeof(int); ++i)
        ptr1[i] = i;
}

int *ptr2 = buf;
for(i = 0; i < 5; ++i)
    printf("%d", ptr2[i]);

I am looking for confirmation, so a short(ish), expert answer about this particular code, ideally with minimal standard quotes, is what I am after. I am not after long explanations of strict aliasing rules, only the parts that pertain to this code. It would also be great if an answer explicitly addressed each of the numbered questions above.

Also assume a general-purpose CPU with no integer trap values, and let's also say int is 32 bits and two's complement.

hyde asked Jul 13 '16 09:07



2 Answers

No, it doesn't, but only because the memory was dynamically allocated and was written to using a character type.

The memory is allocated using calloc. That object has no declared type [1] because it was allocated dynamically. Thus the object doesn't have any effective type.

Then the code accesses and modifies the object using the type char. Because the lvalue type is a character type [2] and no object that has an effective type is being copied [5], the write does not set the effective type to char for this and subsequent accesses. The effective type is char only for the duration of each access [3]; afterwards, the object again has no effective type.

Then the object is accessed, and only read, using the type int. As the object has no effective type, its effective type becomes int [3] for the duration of each read; afterwards, the object again has no effective type. Since int is trivially compatible with the effective type int, the behavior is defined.

(Assuming the values read are not trap representations for int.)
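The reasoning above can be sketched as a small, self-contained function. This is only an illustration under stated assumptions: the function name char_write_then_int_read is hypothetical, unsigned char is used instead of the question's plain char to avoid sign conversions, and int is assumed to have no trap representations. Because the int values depend on byte order, the sketch compares each int read against a memcpy of the same bytes rather than against fixed numbers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fills an allocated buffer byte by byte through a
 * character-type pointer, then reads it back through int *.
 * Returns 1 if every int read matches the value obtained by memcpy-ing the
 * same bytes into a declared int, 0 otherwise (including allocation failure). */
int char_write_then_int_read(void)
{
    enum { COUNT = 5 };
    void *buf = calloc(COUNT, sizeof(int));   /* allocated: no declared type */
    if (buf == NULL)
        return 0;

    unsigned char *bytes = buf;
    for (size_t i = 0; i < COUNT * sizeof(int); ++i)
        bytes[i] = (unsigned char)i;          /* store through a character type */

    int *ints = buf;
    int ok = 1;
    for (size_t i = 0; i < COUNT; ++i) {
        int expected;
        memcpy(&expected, bytes + i * sizeof(int), sizeof expected);
        if (ints[i] != expected)              /* read through an int lvalue */
            ok = 0;
    }

    free(buf);
    return ok;
}
```

Calling char_write_then_int_read() is expected to return 1 whenever the allocation succeeds, since both reads see the same bytes.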


Had you accessed and modified the object using a non-character type that is also not compatible with int, the behavior would be undefined.

Let's say your example was (assuming sizeof(float)==sizeof(int)):

int i;
void *buf = calloc(5, sizeof(float)); // buf initialized to 0

{
    float *ptr1 = buf;    
    for(i = 0; i < 5; ++i) // buf holds 5 floats, so index 0..4 only
        ptr1[i] = (float)i;
}

int *ptr2 = buf;
for(i = 0; i < 5; ++i)
    printf("%d", ptr2[i]);

When floats are written into the object, its effective type becomes float for that write and for all subsequent accesses that don't modify it [2]. When the object is then accessed through int, the effective type remains float, because the values are only read, not modified: the earlier float write fixed the effective type as float until the next write into the object (which doesn't happen here). The types int and float are not compatible [4], so the behavior is undefined.
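One common way to inspect the bits of a float without reading it through an int lvalue is to copy the bytes into a declared int with memcpy. The following is a minimal sketch, not the answer's own code: the name float_bits_as_int is hypothetical, and it keeps the assumption stated above that sizeof(float) == sizeof(int) (checked at compile time).

```c
#include <string.h>

/* Hypothetical helper: returns the object representation of a float,
 * reinterpreted as an int, by copying bytes instead of aliasing. */
int float_bits_as_int(float value)
{
    /* Assumption carried over from the example above. */
    _Static_assert(sizeof(float) == sizeof(int),
                   "this sketch assumes float and int have the same size");

    int bits;                              /* declared type: int */
    memcpy(&bits, &value, sizeof bits);    /* byte copy, no int lvalue access to a float */
    return bits;
}
```

In the loop from the example, printing float_bits_as_int(ptr1[i]) while ptr1 is still in scope, rather than ptr2[i], would avoid the undefined read. The exact numbers printed depend on the platform's float format.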


(All text below is quoted from: ISO/IEC 9899:201x)

1 (6.5 Expressions 6)
The effective type of an object for an access to its stored value is the declared type of the object, if any. 87)
Footnote 87) Allocated objects have no declared type.

2 (6.5 Expressions 6)
If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

3 (6.5 Expressions 6)
For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

4 (6.5 Expressions 7)
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)
  - a type compatible with the effective type of the object,
  - a qualified version of a type compatible with the effective type of the object,
  - a type that is the signed or unsigned type corresponding to the effective type of the object,
  - a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  - an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  - a character type.

5 (6.5 Expressions 6)
If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
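As an illustration of the memcpy case in the last quote, here is a minimal sketch under stated assumptions: the function name copy_ints_then_read and the sample values are hypothetical. The source array has declared type int, so after the copy the allocated storage has effective type int for subsequent reads, and reading it through int * is permitted.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies an int array into allocated storage and
 * reads it back through int *. Returns the sum of the copied values,
 * or -1 if the allocation fails. */
int copy_ints_then_read(void)
{
    const int source[3] = { 10, 20, 30 };   /* declared type: int */
    void *buf = malloc(sizeof source);      /* allocated: no declared type */
    if (buf == NULL)
        return -1;

    /* The copied-from object has effective type int, so the destination
     * takes effective type int for reads that follow. */
    memcpy(buf, source, sizeof source);

    const int *ints = buf;
    int sum = ints[0] + ints[1] + ints[2];  /* int read of int-typed storage */

    free(buf);
    return sum;
}
```

Reading the same storage through float * at that point would fall under the undefined case described earlier, since the effective type would still be int.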

2501 answered Nov 16 '22 02:11


No. This does not violate strict aliasing.

From the C Standard, 6.2.5 Types, paragraph 28:

A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. 48

Note the 48. That refers to footnote 48:

48) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

So you can access the calloc()'d memory via a char * pointer (assuming your ptr is meant to be ptr1) with no problems.

Although that's really extra, since 7.22.3 Memory management functions, paragraph 1 states:

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated

So you can safely access the calloc()'d memory via an int pointer as well as a char pointer, and via a pointer to double to boot (assuming you stay within the bounds of the allocated memory).
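The alignment guarantee quoted above can be spot-checked with a small sketch. The function name allocation_is_aligned is hypothetical, and the check assumes the usual implementation-defined mapping where converting a pointer to uintptr_t yields its address as an integer. Note that this only demonstrates alignment; which lvalue types may read the stored value is governed separately by the effective-type rules in 6.5.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if a calloc'd block is suitably aligned
 * for both int and double, 0 otherwise (including allocation failure). */
int allocation_is_aligned(void)
{
    void *buf = calloc(5, sizeof(double));
    if (buf == NULL)
        return 0;

    /* Assumption: the integer value of the pointer reflects its address. */
    uintptr_t address = (uintptr_t)buf;
    int ok = address % _Alignof(int) == 0 &&
             address % _Alignof(double) == 0;

    free(buf);
    return ok;
}
```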

Andrew Henle answered Nov 16 '22 02:11