
Type punning a struct in C and C++ via a union

I've compiled this with gcc and g++ using -pedantic, and I don't get a warning from either one:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct a {
    struct a *next;
    int i;
};

struct b {
    struct b *next;
    int i;
};

struct c {
    int x, x2, x3;
    union {
        struct a a;
        struct b b;
    } u;
};

void foo(struct b *bar) {
    bar->next->i = 9;
    return;
}

int main(int argc, char *argv[]) {
    struct c c;
    memset(&c, 0, sizeof c);
    c.u.a.next = (struct a *)calloc(1, sizeof(struct a));
    foo(&c.u.b);
    printf("%d\n", c.u.a.next->i);
    return 0;
}

Is this legal in C and C++? I've read about type punning, but I don't fully understand it. Is foo(&c.u.b) any different from foo((struct b *)&c.u.a)? Wouldn't they be exactly the same? The exception for structs in a union (C89, section 3.3.2.3) says:

If a union contains several structures that share a common initial sequence, and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them. Two structures share a common initial sequence if corresponding members have compatible types for a sequence of one or more initial members.
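For contrast, here is a minimal sketch of a case that this rule clearly covers. The types `circle`, `rect`, and `shape` and the function names are hypothetical and are not part of the code above. The sketch assumes only that both structs begin with a member of the same type (`int kind`) and that the union declaration is visible where the member is read.

```c
#include <assert.h>

/* Hypothetical types for illustration; they do not appear in the question. */
struct circle { int kind; double radius; };
struct rect   { int kind; float width, height; };

union shape {
    struct circle circle;
    struct rect   rect;
};

/* Inspects the shared leading member through `rect`, whichever member was
   stored last. Both structs start with an `int`, so `kind` is their common
   initial sequence, and the union type is visible at this point. */
int shape_kind(const union shape *s)
{
    return s->rect.kind;
}

/* Stores a circle, then reads `kind` through the other union member. */
int stored_circle_kind(void)
{
    union shape s;
    s.circle.kind = 1;
    s.circle.radius = 2.5;
    assert(shape_kind(&s) == s.circle.kind);
    return shape_kind(&s);
}
```

Here `stored_circle_kind()` returns 1. The difference from my code is that the first members here have the very same type, whereas `struct a *` and `struct b *` are two different pointer types.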

In the union, the first member of struct a is struct a *next, and the first member of struct b is struct b *next. As you can see, the pointer is written through struct a's next member, and then in foo it is read through struct b's next member. Are those compatible types? They're both pointers to a struct, and pointers to any struct should be the same size, so they should be compatible and the layout should be the same, right? Is it OK to write i through one struct and read it through the other? Am I committing any kind of aliasing or type-punning violation?

asked Feb 14 '15 22:02 by loop


2 Answers

In C:

struct a and struct b are not compatible types. Even in

typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;

struct s1 and struct s2 are not compatible types. (See the example in 6.7.8/p5.) An easy test: if two struct types are compatible, then an object of one type can be assigned to an object of the other. If you would expect the compiler to complain when you try that, they are not compatible types.
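One way to have the compiler confirm this, sketched here under the assumption of a C11 compiler: a `_Generic` selection may not list two compatible types among its associations, so the macros below compile only because the two struct types (and the pointers to them) are incompatible. The macro and function names are made up for this illustration.

```c
#include <assert.h>

typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;

/* C11 forbids two associations with compatible types in one _Generic
   selection, so these macros are accepted only because struct s1 and
   struct s2, and the pointers to them, are distinct incompatible types. */
#define STRUCT_ID(v) _Generic((v), struct s1: 1, struct s2: 2)
#define PTR_ID(p)    _Generic((p), struct s1 *: 1, struct s2 *: 2)

/* t1 is a typedef for struct s1, so the first association is selected. */
int which_struct_t1(void) { t1 v = { 0 }; return STRUCT_ID(v); }

/* t2 is a typedef for struct s2, so the second association is selected. */
int which_struct_t2(void) { t2 v = { 0 }; return STRUCT_ID(v); }

/* tp2 is struct s2 *, which is distinct from struct s1 *. */
int which_pointer_tp2(void) { tp2 p = 0; return PTR_ID(p); }

/* Runs the three selections and checks which association was chosen. */
void check_selections(void)
{
    assert(which_struct_t1() == 1);
    assert(which_struct_t2() == 2);
    assert(which_pointer_tp2() == 2);
}
```

Replacing `struct s2` in `STRUCT_ID` with `t1` should make the compiler reject the macro, because `t1` and `struct s1` are the same type.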

Therefore, struct a * and struct b * are also not compatible types, and so struct a and struct b do not share a common initial sequence. Your union punning is instead governed by the same rule that applies to union punning in other cases (6.5.2.3 footnote 95):

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
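A minimal sketch of what that footnote means for the structs in the question. The union name `ab` and the function name are assumptions made for this illustration. It stores through member `a`, reads the pointer back through member `b`, and compares the two object representations byte by byte. C requires all pointers to structure types to share the same representation and alignment, so the comparison is expected to succeed. The sketch deliberately stops short of dereferencing the punned pointer.

```c
#include <assert.h>
#include <string.h>

struct a { struct a *next; int i; };
struct b { struct b *next; int i; };

/* Hypothetical named union holding both structs from the question. */
union ab { struct a a; struct b b; };

/* Stores a pointer through member `a`, reads it back through member `b`,
   and returns 1 if both pointers have identical object representations. */
int next_bytes_match(void)
{
    struct a node = { 0, 0 };
    union ab u;

    memset(&u, 0, sizeof u);
    u.a.next = &node;

    /* Type-punned read: the bytes of a.next reinterpreted as struct b *. */
    struct b *as_b = u.b.next;

    assert(sizeof as_b == sizeof u.a.next);
    return memcmp(&as_b, &u.a.next, sizeof as_b) == 0;
}
```

This only shows that the stored pointer survives the punned read unchanged. Whether the object it points to may then be accessed as a struct b is a separate question.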


In C++, struct a and struct b also do not share a common initial sequence. [class.mem]/p18 (quoting N4140):

Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.

[basic.types]/p9:

If two types T1 and T2 are the same type, then T1 and T2 are layout-compatible types. [ Note: Layout-compatible enumerations are described in 7.2. Layout-compatible standard-layout structs and standard-layout unions are described in 9.2. —end note ]

struct a * and struct b * are neither structs nor unions nor enumerations; therefore they are only layout-compatible if they are the same type, which they are not.

It is true that ([basic.compound]/p3)

Pointers to cv-qualified and cv-unqualified versions (3.9.3) of layout-compatible types shall have the same value representation and alignment requirements (3.11).

But that does not mean those pointer types are layout-compatible types, as that term is defined in the standard.

answered Oct 27 '22 21:10 by T.C.


What you could do (and I've been bitten by this before) is declare the initial pointer in both structs as void * and cast where needed. Since void * is convertible to and from any object pointer type, you only pay an ugliness tax and avoid the risk of gcc reordering your operations, which I've seen happen even with a union as a result of compiler bugs in some versions. As @T.C. correctly points out, layout compatibility is defined at the language level: types that incidentally have the same size are not necessarily layout-compatible, and an aggressive compiler may base optimization assumptions on that distinction.
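A minimal sketch of that suggestion, reusing the struct names from the question. The union name `ab` and the function names are assumptions made for this illustration. With `void *next` in both structs, the corresponding members have identical types, so the two structs do share a common initial sequence.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Both structs now start with the same member type. */
struct a { void *next; int i; };
struct b { void *next; int i; };

/* Hypothetical named union, declared before the functions that use it. */
union ab { struct a a; struct b b; };

/* Equivalent of foo() from the question, with the conversion made explicit. */
void set_next_i(struct b *bar, int value)
{
    struct b *next = bar->next;   /* void * converts implicitly in C */
    next->i = value;
}

/* Writes through member `b`, reads back through member `a`.
   Returns the value read, or -1 if the allocation fails. */
int demo_void_next(void)
{
    union ab u;
    int result;

    memset(&u, 0, sizeof u);
    u.a.next = calloc(1, sizeof(struct a));
    if (u.a.next == NULL)
        return -1;

    set_next_i(&u.b, 9);
    result = ((struct a *)u.a.next)->i;
    assert(result == 9);

    free(u.a.next);
    return result;
}
```

On a typical implementation `demo_void_next()` returns 9. The casts are the "ugliness tax" mentioned above. The sketch shows the shape of the workaround only, and it does not settle every aliasing question about the heap object itself.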

answered Oct 27 '22 21:10 by Mark Nunberg