Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Strict aliasing and overlay inheritance

Consider this code example:

#include <stdio.h>

typedef struct A A;

struct A {
   int x;
   int y;
};

typedef struct B B;

struct B {
   int x;
   int y;
   int z;
};

int main()
{
    B b = {1,2,3};
    A *ap = (A*)&b;

    *ap = (A){100,200};      //a clear http://port70.net/~nsz/c/c11/n1570.html#6.5p7 violation

    ap->x = 10;  ap->y = 20; //lvalues of types int and int at the right addrresses, ergo correct ?

    printf("%d %d %d\n", b.x, b.y, b.z);
}

I used to think that something like casting B* to A* and using A* to manipulate the B* object was a strict aliasing violation. But then I realized the standard really only requires that:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 1) a type compatible with the effective type of the object, (...)

and expressions such as ap->x do have the correct type and address, and the type of ap shouldn't really matter there (or does it?). This would, in my mind, imply that this type of overlay inheritance is correct as long as the substructure isn't manipulated as a whole.

Is this interpretation flawed or ostensibly at odds with what the authors of the standard intended?

like image 713
PSkocik Avatar asked Feb 20 '17 19:02

PSkocik


People also ask

What is the strict aliasing rule?

The strict aliasing rule dictates that pointers are assumed not to alias if they point to fundamentally different types, except for char* and void* which can alias to any other data type.

Does c++ have strict aliasing?

In both C and C++ the standard specifies which expression types are allowed to alias which types. The compiler and optimizer are allowed to assume we follow the aliasing rules strictly, hence the term strict aliasing rule.


1 Answers

The line with *ap = is a strict aliasing violation: an object of type B is written using an lvalue expression of type A.

Supposing that line was not present, and we moved onto ap->x = 10; ap->y = 20;. In this case an lvalue of type int is used to write objects of type int.

There is disagreement about whether this is a strict aliasing violation or not. I think that the letter of the Standard says that it is not, but others (including gcc and clang developers) consider ap->x as implying that *ap was accessed. Most agree that the standard's definition of strict aliasing is too vague and needs improvement.

Sample code using your struct definitions:

void f(A* ap, B* bp)
{
  ap->x = 213;
  ++bp->x;
  ap->x = 213;
  ++bp->x;
}

int main()
{
   B b = { 0 };
   f( (A *)&b, &b );
   printf("%d\n", b.x);
}

For me this outputs 214 at -O2, and 2 at -O3 , with gcc. The generated assembly on godbolt for gcc 6.3 was:

f:
    movl    (%rsi), %eax
    movl    $213, (%rdi)
    addl    $2, %eax
    movl    %eax, (%rsi)
    ret

which shows that the compiler has rearranged the function to:

int temp = bp->x + 2;
ap->x = 213;
bp->x = temp;

and therefore the compiler must be considering that ap->x may not alias bp->x.

like image 140
M.M Avatar answered Oct 15 '22 09:10

M.M