How undefined behavior of strict aliasing could happen if data is only read?

I'm testing strict aliasing in C with the following code:

#include <stdio.h>

int main() {
    int x = 42;
    float *pf = (float *)&x;  // strict aliasing violation
    
    //*pf = 3.14;

    printf("x = %d\n", x);    // undefined behaviour
    return 0;
}

Testing it on x64 on many devices, I have never seen a wrong value printed by printf unless I uncomment the line //*pf = 3.14;. Obviously that is what undefined behavior (UB) means, but I can't explain how an error could occur if the data is only read (I'm not saying it will definitely happen), even if there are two pointers to it.

Nicola Sergio asked Oct 29 '25 18:10


2 Answers

Consider this function:

void foo(int *i, float *f)
{
    printf("%d %f\n", *i, *f);
}

The compiler is allowed by the C standard to generate code as if this function were:

void foo(int *i, float *f)
{
    if ((void *) i == (void *) f)
    {
        fprintf(stderr, "Error, lvalues will alias.\n");
        abort();
    }
    printf("%d %f\n", *i, *f);
}

It would also be allowed to generate code as if the function had any other source code inside the if.

I do not know of any compiler that does so other than for debugging features to find aliasing violations.

Other than that, if we consider this situation:

  • A function receives pointers that might or might not alias with disallowed types.
  • The function does not alter the aliased memory by any means.
  • Only that function, and declarations necessary for it, are visible to the compiler, not any other part of the program’s source code.
  • The compiler does not generate code deliberately designed to behave differently for aliased memory.

and ask this question: in this situation, is there any reason the compiler would behave differently under the C standard as it stands than it would if the C standard allowed such aliasing? I do not know of such a reason.

In contrast, when memory is modified, there are reasons the behavior is altered with aliasing. If the function contained:

    printf("%f\n", *f);
    *i = 3;
    printf("%d %f\n", *i, *f);

then, with the C standard as it stands, the compiler can load *f once and pass the loaded value to printf twice, because the compiler may assume *i = 3; does not change *f. This is good for optimization; it allows the compiler to eliminate some operations and to order operations in efficient ways. If the C standard were changed so that this were not an aliasing violation, the compiler could no longer make that assumption and would have to load *f a second time after *i = 3;.
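
As a rough sketch of that transformation (the function names as_written and as_if_optimized are hypothetical, and returning a sum stands in for the two printf calls), the second function shows what a compiler may effectively produce from the first when it assumes the store through i cannot change *f. Called with distinct int and float objects, the two are indistinguishable:

```c
/* As written in the source: *f is read before and after the store through i. */
float as_written(int *i, float *f)
{
    float before = *f;   /* first load of *f */
    *i = 3;              /* store through an int lvalue */
    float after = *f;    /* second load of *f */
    return before + after;
}

/* What the compiler may emit under the aliasing rule:
   a single load of *f whose value is reused after the store. */
float as_if_optimized(int *i, float *f)
{
    float cached = *f;   /* the only load of *f */
    *i = 3;
    return cached + cached;
}
```

If i and f pointed at the same memory, the two versions could give different results, which is exactly the case the standard leaves undefined.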

Eric Postpischil answered Nov 01 '25 10:11


How undefined behavior of strict aliasing could happen if data is only read?

Strict aliasing violations happen when data is accessed using a different type than the so-called "effective type", that is, the type that the compiler believes is stored at that memory location. A strict aliasing violation can be a read access or a write access. With the pointer pf from your example:

  • *pf = 3.14; This is a strict aliasing violation. (Write)
  • float f = *pf; This is also a strict aliasing violation. (Read)

float *pf = (float *)&x; // strict aliasing violation

No, that line in itself is not a strict aliasing violation. It could be UB for other reasons, though. For example, on a 16-bit system with alignment requirements and a 16-bit int, the 32-bit float could be misaligned, in which case the pointer conversion on this very line is UB.


//*pf = 3.14;

If this line is commented out, your code has no UB. If it is there, then the code invokes UB on that line.
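
If the goal is to look at the bytes of x as a float, one commonly used way that avoids the aliasing violation is to copy the bytes with memcpy instead of dereferencing a converted pointer. A minimal sketch, assuming float and int have the same size (the function name float_from_int_bytes is made up for this example):

```c
#include <string.h>

/* Assumption for this sketch: float and int have the same size. */
_Static_assert(sizeof(float) == sizeof(int), "sketch assumes equal sizes");

/* Copies the bytes of x into a separate float object.
   The int object is never accessed through a float lvalue. */
float float_from_int_bytes(int x)
{
    float f;
    memcpy(&f, &x, sizeof f);
    return f;
}
```

On a platform with IEEE 754 binary32 floats and 32-bit int, float_from_int_bytes(0x3f800000) would be expected to give 1.0f; on other platforms the resulting value depends on the float representation.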


More info about strict aliasing and pointer conversions: How does the strict aliasing rule enable or prevent compiler optimizations?

Lundin answered Nov 01 '25 08:11


