Can a type which is a union member alias that union?

Prompted by this question:

The C11 standard states that a pointer to a union can be converted to a pointer to each of its members. From Section 6.7.2.1p17:

The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

This implies you can do the following:

union u {
    int a;
    double b;
};

union u myunion;
int *i = (int *)&myunion;
double *d = (double *)&myunion;

myunion.a = 2;
printf("*i=%d\n", *i);
myunion.b = 3.5;
printf("*d=%f\n", *d);

But what about the reverse: in the case of the above union, can an int * or double * be safely converted to a union u *? Consider the following code:

#include <stdio.h>

union u {
    int a;
    double b;
};

void f(int isint, union u *p)
{
    if (isint) {
        printf("int value=%d\n", p->a);
    } else {
        printf("double value=%f\n", p->b);
    }
}

int main(void)
{
    int a = 3;
    double b = 8.25;
    f(1, (union u *)&a);
    f(0, (union u *)&b);
    return 0;
}

In this example, pointers to int and double, both of which are members of union u, are passed to a function where a union u * is expected. A flag is passed to the function to tell it which "member" to access.

Assuming, as in this case, that the member accessed matches the type of the object that was actually passed in, is the above code legal?

I compiled this on gcc 6.3.0 with both -O0 and -O3 and both gave the expected output:

int value=3
double value=8.250000
dbush asked Feb 04 '19 14:02



2 Answers

In this example, pointers to int and double, both of which are members of union u, are passed to a function where a union u * is expected. A flag is passed to the function to tell it which "member" to access.

Assuming, as in this case, that the member accessed matches the type of the object that was actually passed in, is the above code legal?

You seem to be focusing your analysis with respect to the strict aliasing rule on the types of the union members. However, given

union a_union {
    int member;
    // ...
} my_union, *my_union_pointer;

I would be inclined to argue that expressions of the form my_union.member and my_union_pointer->member express accessing the stored value of an object of type union a_union in addition to accessing an object of the member's type. Thus, if my_union_pointer does not actually point to an object whose effective type is union a_union, then there is indeed a violation of the strict aliasing rule -- with respect to type union a_union -- and the behavior is therefore undefined.
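If the problem is that the pointer does not refer to a genuine union object, one way to sidestep it is to copy the value into a real union u first and pass that object's address. The following is only a minimal sketch under that assumption: union u is taken from the question, while u_from_int, u_from_double, and read_member are hypothetical helper names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

union u {
    int a;
    double b;
};

/* Hypothetical helper: returns a genuine union u object whose member a holds v. */
union u u_from_int(int v)
{
    union u tmp = { .a = v };
    return tmp;
}

/* Hypothetical helper: returns a genuine union u object whose member b holds v. */
union u u_from_double(double v)
{
    union u tmp = { .b = v };
    return tmp;
}

/* Reads the member selected by isint and returns it as a double.
 * p is expected to point to an object whose effective type is union u. */
double read_member(int isint, const union u *p)
{
    assert(p != NULL);
    return isint ? (double)p->a : p->b;
}
```

With helpers like these, a call such as f(1, (union u *)&a) could become union u ua = u_from_int(a); f(1, &ua); so that p->a reads a member of an object that really is a union u.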

John Bollinger answered Sep 30 '22 02:09


The Standard gives no general permission to access a struct or union object using an lvalue of member type, nor--so far as I can tell--does it give any specific permission to perform such access unless the member happens to be of character type. Nor does it define any means by which the act of casting an int* into a union u* can create a union u object where none already existed. Instead, the creation of any storage that will ever be accessed as a union u implies the simultaneous creation of a union u object within that storage.

Instead, the Standard (references quoted from the C11 draft N1570) relies upon implementations to apply footnote 88 ("The intent of this list is to specify those circumstances in which an object may or may not be aliased.") and to recognize that the "strict aliasing rule" (6.5p7) should only be applied when an object is referenced both via an lvalue of its own type and via a seemingly unrelated lvalue of another type during some particular execution of a function or loop (i.e. when the object aliases some other lvalue).

The question of when two lvalues may be viewed as "seemingly unrelated", and when an implementation should be expected to recognize a relationship between them, is a Quality of Implementation issue. Clang and gcc seem to recognize that lvalues with forms unionPtr->value and unionPtr->value[index] are related to *unionPtr, but seem unable to recognize that pointers to such lvalues have any relationship to unionPtr. They will thus recognize that both unionPtr->array1[i] and unionPtr->array2[j] access *unionPtr (since array subscripting via [] seems to be treated differently from array-to-pointer decay), but will not recognize that *(unionPtr->array1+i) and *(unionPtr->array2+j) do likewise.
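To make the two spellings concrete, here is a small sketch of a pair of functions that differ only in using [] versus explicit pointer arithmetic. The union halves_words and both function names are hypothetical, chosen for illustration. N1570 6.5.2.1p2 defines E1[E2] as identical to (*((E1)+(E2))), so on paper the two functions mean the same thing; whether a particular compiler version optimizes them differently is the claim made above and is not verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union with two overlapping array members. */
union halves_words {
    uint16_t h[4];
    uint32_t w[2];
};

/* Bracket form: both accesses are spelled p->member[index]. */
uint32_t store_indexed(union halves_words *p, int i, int j)
{
    assert(i >= 0 && i < 4 && j >= 0 && j < 2);
    if (p->w[j])
        p->h[i] = 2;
    return p->w[j];   /* should observe the store to h[i] if they overlap */
}

/* Pointer-arithmetic form: the same accesses spelled *(p->member + index). */
uint32_t store_decayed(union halves_words *p, int i, int j)
{
    assert(i >= 0 && i < 4 && j >= 0 && j < 2);
    if (*(p->w + j))
        *(p->h + i) = 2;
    return *(p->w + j);
}
```

Calling both with i and j chosen so that h[i] overlaps w[j], and comparing the results at different optimization levels, is one way to check the claim on a given compiler.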

Addendum--standard reference:

Given

union foo {int x;} foo,bar;
void test(void)
{
  foo=bar;   // 1
  foo.x = 2; // 2
  bar=foo;   // 3
}

The Standard would describe the type of foo.x as int. If the second statement didn't access the stored value of foo, then the third statement would have no effect. Thus, the second statement accesses the stored value of an object of type union foo using an lvalue of type int. Looking at N1570 6.5p7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:(footnote 88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

Footnote 88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

Note that there is no permission given above to access an object of type union foo using an lvalue of type int. Because the above is a "shall" requirement stated outside a Constraints section, any violation thereof invokes UB (N1570 4p2) even if the behavior of the construct would otherwise be defined by the Standard.
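By contrast, the last bullet of the list does cover access to any object, including a union foo, through an lvalue of character type. A minimal sketch of that permitted kind of access follows; same_bytes is a hypothetical helper name, and union foo mirrors the declaration in the example above.

```c
#include <assert.h>
#include <stddef.h>

union foo { int x; };

/* Hypothetical helper: compares the first n bytes of two objects.
 * Every read goes through an unsigned char lvalue, which 6.5p7
 * permits whatever the effective type of the objects is. */
int same_bytes(const void *lhs, const void *rhs, size_t n)
{
    const unsigned char *a = lhs;
    const unsigned char *b = rhs;
    assert(a != NULL && b != NULL);
    for (size_t k = 0; k < n; k++) {
        if (a[k] != b[k])
            return 0;
    }
    return 1;
}
```

For instance, same_bytes(&some_foo, &some_int, sizeof(int)) inspects a union foo and an int without ever reading the union through an lvalue of type int.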

supercat answered Sep 30 '22 01:09