
Strict aliasing within an expression

Suppose we have the following code:

#include <stdio.h>
#include <stdint.h>

int main()
{
    uint16_t a[] = { 1, 2, 3, 4 };
    const size_t n = sizeof(a) / sizeof(uint16_t);
    
    for (size_t i = 0; i < n; i++) {
        uint16_t *b = (uint16_t *) ((uint8_t *) a + i * sizeof(uint16_t));
        printf("%u\n", *b);
    }
    
    return 0;
}

Clearly, casting a to a uint8_t pointer is not a violation, so this question is about casting the resulting pointer back to a uint16_t pointer. As I understand the standard, this does violate the strict aliasing rule. From a practical point of view I am less sure, since the types of a and b are compatible. The only potential violation is b aliasing the uint8_t pointer, which exists only within this one expression. So even if the rule is violated, I doubt that this can cause undefined behavior. Can it?

Note that I am not saying that this code is meaningful. The question is meant for purely educational purposes regarding the understanding of strict aliasing.

Pedro asked Mar 08 '21 15:03



1 Answer

This is not a strict aliasing violation.

Converting a to uint8_t * and then doing pointer arithmetic on the result is safe because of the special allowance for conversions to a pointer to a character type (in practice, uint8_t is a typedef for unsigned char).

Section 6.3.2.3p7 of the C standard states:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

The conversion back and the subsequent dereference are safe because b points to an object of type uint16_t (specifically, an element of the array a), which matches the pointed-to type of b.

Section 6.5p7 states:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.
dbush answered Oct 12 '22 10:10