
Is it legal to cast a struct to a union containing it?

This is a follow-up to a question about printing a common member from different structs.

I thought that unions permit examining the common initial sequence of two of their members. So I ended up with the following code:

#include <stdio.h>

struct foo {
    const char *desc;
    float foo;
};
struct bar {
    const char *desc;
    int bar;
};
union foobar {
    struct foo foo;
    struct bar bar;
};

void printdesc(const union foobar * fb) {
    printf("%s\n", fb->foo.desc);          // allowed per 6.5.2.3 Structure and union members
}

int main() {

    struct bar bb = {"desc bar", 2};

    union foobar fb = { .bar=bb};

    printdesc((union foobar *) &(fb.bar)); // allowed per 6.7.2.1 Structure and union specifiers
    printdesc((union foobar *) &bb);       // legal?

    return 0;
}

It compiles without even a warning and gives the expected result:

desc bar
desc bar

The point here is the line with the // legal? comment. I have converted a struct bar * into a union foobar *. When the bar is a member of a foobar union, this is permitted per 6.7.2.1 Structure and union specifiers. But here, where bb is a standalone object, I do not know.

Is it permitted to convert a pointer to a bar object to a pointer to a foobar object if the bar was not declared as a member of a foobar?

The question is not about whether it can work in a specific compiler. I am pretty sure that it does with all the common compilers in their current versions. The question is about whether it is legal C code.


Here is my current research.

References from draft n1570 for C11:

6.5.2.3 Structure and union members § 6

... if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them...

6.7.2.1 Structure and union specifiers § 16

... A pointer to a union object, suitably converted, points to each of its members ..., and vice versa...

asked Mar 11 '21 10:03 by Serge Ballesta



1 Answer

One serious landmine off the top of my head:

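#include <assert.h>   // static_assert
#include <string.h>   // memcpy

void copydesc(const union foobar * fb, union foobar * fb_copy) {
  // Always true: *fb has type union foobar, whatever object fb really points to.
  static_assert(sizeof(*fb) == sizeof(union foobar), "broken compiler");
  // NOTE: reads past the end of the source object when fb was obtained by
  // casting a pointer to a struct that is smaller than `union foobar`.
  memcpy(fb_copy, fb, sizeof(*fb));
}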

Most programmers won't catch this, and the few who do are too valuable to spend their time on it. Compilers and static-analysis tools may not warn about your technique either.

A safer approach, though still not necessarily bullet-proof, is to add a struct that holds the same shared leading fields, add constructor tests, and list that common struct first, just as you neatly laid out the shared member first in your example. This also makes the code more readable.

It's a bad sign whenever you find yourself wondering whether some not-so-common construct works correctly. It can be argued that such constructs are useful if you really know what you're doing (that's one feature of C), but they introduce a significant risk that an unexpected chain reaction of independently correct parts (like the memcpy above) causes serious issues.

answered Nov 13 '22 06:11 by Abdullah