Questions:
1. Does the code below violate strict aliasing rules? That is, would a smart compiler be allowed to print 00000 (or cause some other nasty effect), because a buffer first accessed as another type is then accessed via int*?
2. If not, would moving just the definition and initialization of ptr2 before the braces (so ptr2 would already be defined when ptr1 comes into scope) break it?
3. If not, would removing the braces (so ptr1 and ptr2 were in the same scope) break it?
4. If yes, how could the code be fixed?
Bonus question: If the code is OK, and 2. or 3. don't break it either, how could it be changed so that it would break strict aliasing rules (for example, by converting the braced loop to use int16_t)?
    int i;
    void *buf = calloc(5, sizeof(int)); // buf initialized to 0
    {
        char *ptr1 = buf;
        for(i = 0; i < 5*sizeof(int); ++i)
            ptr1[i] = i;
    }
    int *ptr2 = buf;
    for(i = 0; i < 5; ++i)
        printf("%d", ptr2[i]);
Looking for confirmation, so short(ish), expert answer about this particular code, ideally with minimal standard quotes, is what I am after. I am not after long explanations of strict aliasing rules, only the parts that pertain to this code. And it would be great if an answer would explicitly enumerate the numbered questions above.
Also assume a general-purpose CPU with no integer trap values, and let's also say int is 32 bits and two's complement.
The compiler and optimizer are allowed to assume that we follow the aliasing rules strictly, hence the term strict aliasing rule. If we attempt to access a value through a type that is not allowed, the result is undefined behavior (UB).
In C, C++, and some other programming languages, the term aliasing refers to a situation where two different expressions or symbols refer to the same object.
Thus, modifying the data through one name implicitly modifies the values associated with all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs.
Aliasing refers to the situation where the same memory location can be accessed using different names. For example, if a function takes two pointers A and B which have the same value, then the name A[0] aliases the name B[0], i.e., we say the pointers A and B alias each other.
No, it doesn't, but only because the memory was allocated and then written to using a character type.
The memory is allocated using calloc. The object has no declared type [1] because it was allocated dynamically, so it has no effective type either.
Then the code accesses and modifies the object using the type char. As char is a character type [2] and no object having an effective type is copied [5], the write doesn't set the effective type to char for this and subsequent accesses; it sets the effective type to char only for the duration of the access [3]. After the access, the object no longer has an effective type.
Then the type int is used to access the object, and only to read it. As the object doesn't have an effective type, its effective type becomes int [3] for the duration of the read. After the access, the object again has no effective type. Since int is trivially compatible with the effective type int, the behavior is defined.
(Assuming the values read are not trap representations for int.)
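As a rough illustration of the reasoning above, here is a minimal sketch (not the asker's exact code). The helper name `char_writes_then_int_reads` is made up for this example. It writes the bytes through `char *`, reads them back through `int *`, and compares each value with what `memcpy` produces from the same bytes, so the check does not depend on endianness. Both pointers deliberately live in the same scope, since the rules quoted in this answer are stated in terms of accesses rather than pointer scope. It assumes, as the question does, that `int` has no trap representations.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the question.
   Returns 1 if every int read through int* equals the value obtained by
   memcpy-ing the same bytes into a declared int, 0 if any differs,
   and -1 if the allocation fails. */
int char_writes_then_int_reads(void)
{
    enum { COUNT = 5 };
    void *buf = calloc(COUNT, sizeof(int));
    if (buf == NULL)
        return -1;

    char *bytes = buf;   /* character-type view of the buffer */
    int *ints = buf;     /* int view, defined in the same scope */

    /* Stores through a character type do not give the object a
       lasting effective type. */
    for (size_t i = 0; i < COUNT * sizeof(int); ++i)
        bytes[i] = (char)i;

    int ok = 1;
    for (size_t i = 0; i < COUNT; ++i) {
        int expected;
        /* Byte-wise copy into a declared int: the reference value. */
        memcpy(&expected, bytes + i * sizeof(int), sizeof expected);
        /* Read through int*: effective type is int for this access. */
        if (ints[i] != expected)
            ok = 0;
    }

    free(buf);
    return ok;
}
```

Calling `char_writes_then_int_reads()` from `main` and checking for a return value of 1 is enough to exercise it.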
Had you accessed and modified the object using a non-character type that is also not compatible with int, the behavior would be undefined.
Let's say your example was (assuming sizeof(float)==sizeof(int)):
    int i;
    void *buf = calloc(5, sizeof(float)); // buf initialized to 0
    {
        float *ptr1 = buf;
        for(i = 0; i < 5; ++i) // 5 floats, not 5*sizeof(float), which would overrun buf
            ptr1[i] = (float)i;
    }
    int *ptr2 = buf;
    for(i = 0; i < 5; ++i)
        printf("%d", ptr2[i]);
The effective type of the object becomes float when floats are written into it, both for the write and for all subsequent accesses that don't modify it [2]. When the object is then accessed through int, the effective type remains float, since the values are only read, not modified. The previous write through float set the effective type to float until the next write into the object (which doesn't happen here). The types int and float are not compatible [4], so the behavior is undefined.
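If the goal is to inspect the bits of the stored floats, one commonly used way around the problem is to copy the bytes into a declared `int` with `memcpy` instead of reading through `int *`. The following is only a sketch under this answer's assumption that `sizeof(float) == sizeof(int)`. The helper name `float_bits` is invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Assumption carried over from the answer's example. */
_Static_assert(sizeof(float) == sizeof(int),
               "sketch assumes float and int have the same size");

/* Hypothetical helper: stores 0.0f..4.0f in allocated memory, then
   copies each float's object representation into out[i].
   Returns 0 on success, -1 if the allocation fails. */
int float_bits(int out[5])
{
    float *f = calloc(5, sizeof *f);
    if (f == NULL)
        return -1;

    for (int i = 0; i < 5; ++i)
        f[i] = (float)i;   /* effective type of each element becomes float */

    for (int i = 0; i < 5; ++i)
        /* memcpy copies bytes into a declared int object, so no float
           object is ever read through an int lvalue. */
        memcpy(&out[i], &f[i], sizeof out[i]);

    free(f);
    return 0;
}
```

The values in `out` are whatever bit patterns the platform uses for those floats, so printing them with `%d` will generally not show 0 1 2 3 4.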
(All text below is quoted from: ISO/IEC 9899:201x)
1 (6.5 Expressions 6)
The effective type of an object for an access to its stored value is the declared type of the object, if any. 87) Allocated objects have no declared type.
2 (6.5 Expressions 6)
If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.
3 (6.5 Expressions 6)
For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
4 (6.5 Expressions 8)
An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type of the object,
- a type that is the signed or unsigned type corresponding to the effective type of the object,
- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
- a character type.
5 (6.5 Expressions 6)
If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
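To make the last quoted rule concrete, here is a small sketch in which the source of the copy has a declared type. The helper name `sum_after_memcpy` is hypothetical. Because the bytes are copied with `memcpy` from an array declared as `int`, the allocated storage takes on the effective type `int`, and reading it through `int *` afterwards is fine.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies a declared int array into allocated
   memory and sums it through an int pointer.
   Returns the sum, or -1 if the allocation fails. */
int sum_after_memcpy(void)
{
    int src[5] = {1, 2, 3, 4, 5};   /* declared type: int */
    int *dst = malloc(sizeof src);
    if (dst == NULL)
        return -1;

    /* The copied-into storage gets the effective type of the source
       object, which is int here. */
    memcpy(dst, src, sizeof src);

    int sum = 0;
    for (size_t i = 0; i < 5; ++i)
        sum += dst[i];              /* int lvalue, effective type int */

    free(dst);
    return sum;
}
```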
No. This does not violate strict aliasing.
From the C Standard, 6.2.5 Types, paragraph 28:
A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. 48)
Note the 48. That refers to footnote 48:
48) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
So you can access the calloc()'d memory via a char * pointer (assuming your ptr is meant to be ptr1) with no problems.
Although that's really extra, since 7.22.3 Memory management functions, paragraph 1 states:
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated
So you can safely access the calloc()'d memory via an int pointer too, as well as a char pointer. And a double pointer to boot (assuming you stay within the bounds of the allocated memory).