
Can we use va_arg with unions?

6.7.2.1 paragraph 14 of my draft of the C99 standard has this to say about unions and pointers (emphasis, as always, added):

The size of a union is sufficient to contain the largest of its members. The value of at most one of the members can be stored in a union object at any time. A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

All well and good. As I read it, that means it is legal to do something like the following to copy either a signed or unsigned int into a union, assuming we only want to copy it out into data of the same type:

union ints { int i; unsigned u; };

int i = 4;
union ints is = *(union ints *)&i;
int j = is.i; // legal
unsigned k = is.u; // not so much

7.15.1.1 paragraph 2 has this to say:

The va_arg macro expands to an expression that has the specified type and the value of the next argument in the call. The parameter ap shall have been initialized by the va_start or va_copy macro (without an intervening invocation of the va_end macro for the same ap). Each invocation of the va_arg macro modifies ap so that the values of successive arguments are returned in turn. The parameter type shall be a type name specified such that the type of a pointer to an object that has the specified type can be obtained simply by postfixing a * to type. If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:

—one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types;

—one type is pointer to void and the other is a pointer to a character type.

I'm not going to go and cite the part about default argument promotions. My question is: is this defined behavior:

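#include <stdarg.h>

/* union ints is declared as in the first snippet. */
void func(int i, ...)
{
    va_list arg;
    va_start(arg, i);
    union ints is = va_arg(arg, union ints); /* caller passed an int */
    va_end(arg);
}

int main(void)
{
    func(0, 1);
    return 0;
}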

If so, it would appear to be a neat trick to overcome the "and the value is representable in both types" requirement of signed/unsigned integer conversion (albeit in a way that's rather difficult to do anything with legally). If not, it would appear to be safe to just use unsigned in this case, but what if there were more elements in the union with more incompatible types? If we can guarantee that we won't access the union by element (i.e. we just copy it into another union or storage space that we're treating like a union) and that all elements of the union are the same size, is this allowed with varargs? Or would it only be allowed with pointers?

In practice I expect this code will almost never fail, but I want to know if it's defined behavior. My current guess is that it appears not to be defined, but that seems incredibly dumb.

asked Sep 14 '11 04:09 by Chris Lutz



1 Answer

You have a couple things off.

A pointer to a union object, suitably converted, points to each of its members (or if a member is a bit-field, then to the unit in which it resides), and vice versa.

This does not mean that union ints is compatible with int or unsigned. In fact, the types are not compatible, so va_arg(arg, union ints) does not match an int argument. The following call is therefore wrong:

func(0, 1); // undefined behavior

If you want to pass a union,

func(0, (union ints){ .u = BLAH });

You can check that the types are incompatible by writing:

union ints x;
x = 1;

GCC gives an "error: incompatible types in assignment" message when compiling.
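As a rough sketch of the compound-literal call shown earlier, the following passes a real union ints through the variable arguments and reads it back with the same type. It also shows the signed/unsigned exception from 7.15.1.1 for a plain int argument. The helper names read_union_u, read_int_as_unsigned, and check_varargs are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stdarg.h>

union ints { int i; unsigned u; };

/* Hypothetical helper: expects exactly one `union ints` after `tag`
   and returns its unsigned member. */
unsigned read_union_u(int tag, ...)
{
    va_list ap;
    va_start(ap, tag);
    /* The type named here matches the type the caller actually passed. */
    union ints v = va_arg(ap, union ints);
    va_end(ap);
    return v.u;
}

/* Hypothetical helper: expects exactly one `int` after `tag` but reads it
   as `unsigned`. The signed/unsigned exception covers this only while the
   value is representable in both types, i.e. non-negative. */
unsigned read_int_as_unsigned(int tag, ...)
{
    va_list ap;
    va_start(ap, tag);
    unsigned v = va_arg(ap, unsigned);
    va_end(ap);
    return v;
}

/* Exercises both helpers with arguments that keep the behavior defined. */
void check_varargs(void)
{
    assert(read_union_u(0, (union ints){ .u = 7u }) == 7u);
    assert(read_int_as_unsigned(0, 1) == 1u);
}
```

Calling check_varargs() from main should pass silently. Passing a negative int to read_int_as_unsigned would fall outside the exception, so the sketch deliberately avoids it.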

However, most implementations will "probably" do the right thing in both cases. There are some other problems...

union ints {
    int i;
    unsigned u;
};

int i = 4;
union ints is = *(union ints *)&i; // Invalid
int j = is.i; // legal
unsigned k = is.u; // also legal (see note)

Dereferencing a pointer to an object through a type other than the object's actual type, as in *(union ints *)&i, is sometimes undefined behavior (I'm still looking up the reference, but I'm pretty sure about this). However, C99 permits accessing a union member other than the most recently stored one (or is it C1x?), although the value is implementation-defined and may be a trap representation.
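If the goal is only to get the value of i into a union ints, the pointer cast can be sidestepped entirely. The sketch below assumes that is the intent; from_int, from_int_bytes, and check_copies are hypothetical names, and memcpy is shown only as one possible byte-wise alternative.

```c
#include <assert.h>
#include <string.h>

union ints { int i; unsigned u; };

/* Stores the value through the int member; no pointer conversion needed. */
union ints from_int(int value)
{
    union ints result = { .i = value };
    return result;
}

/* Copies the bytes of the int into a zero-initialized union object.
   sizeof value never exceeds sizeof(union ints), because the union
   has an int member. */
union ints from_int_bytes(int value)
{
    union ints result = { 0 };
    memcpy(&result, &value, sizeof value);
    return result;
}

/* Reads back through the same member that was stored. */
void check_copies(void)
{
    assert(from_int(4).i == 4);
    assert(from_int_bytes(-3).i == -3);
}
```

Both helpers read the value back through the int member, so neither depends on how the unsigned member would interpret the same bytes.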

About type punning through unions: As Pascal Cuoq notes, it's actually TC3 that defines the behavior of accessing a union element other than the most recently stored one. TC3 is the third update to C99. The good news is that this part of TC3 is really codifying existing practice — so think of it as a de facto part of C prior to TC3.
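A minimal sketch of that kind of punning, using the same union: store through the int member, then read through the unsigned member. The function names pun_int_to_unsigned and check_pun are invented for the example. For non-negative values the standard says the signed and unsigned representations of the same value agree (6.2.5 paragraph 9), so the result should equal the stored value. For negative values the result depends on the implementation's representation, so the sketch does not check those.

```c
#include <assert.h>
#include <limits.h>

union ints { int i; unsigned u; };

/* Stores through the int member and reads the same bytes back through
   the unsigned member. */
unsigned pun_int_to_unsigned(int value)
{
    union ints x;
    x.i = value;
    return x.u;
}

/* Only non-negative inputs are checked; negative inputs give an
   implementation-dependent result. */
void check_pun(void)
{
    assert(pun_int_to_unsigned(4) == 4u);
    assert(pun_int_to_unsigned(INT_MAX) == (unsigned)INT_MAX);
}
```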

answered Sep 22 '22 12:09 by Dietrich Epp