
C11 related language correctness

The following code snippet is an example from the C11 standard §6.5.2.3:

struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}
int g()
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    /* ... */
    return f(&u.s1, &u.s2);
}

As per C11, the last line inside g() is invalid. Why so?

Vikas Yadav asked Jul 26 '17 22:07


2 Answers

The code is Example 3 in §6.5.2.3 "Structure and union members" of ISO/IEC 9899:2011. One of the preceding paragraphs reads (note the condition about where the union type is visible):

¶6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

The code quoted in the question is preceded by the comment:

The following is not a valid fragment (because the union type is not visible within function f).

This now makes sense in light of the requirement in ¶6 that a declaration of the completed union type be visible. The call in g() relies on the common initial sequence guarantee, but that guarantee applies only where the union type is visible, and it isn't visible in f().
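For contrast, here is a minimal sketch of a variant in which the union type is visible inside the function, so the common initial sequence guarantee applies. The names union u12 and f_union are hypothetical and do not appear in the standard; the union is given a tag and declared at file scope purely for illustration.

```c
struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical named union, declared at file scope so that its
   completed type is visible inside f_union. */
union u12 {
    struct t1 s1;
    struct t2 s2;
};

/* Same logic as f, but every access goes through the union,
   so the compiler can see that s1.m and s2.m may overlap. */
int f_union(union u12 *pu)
{
    if (pu->s1.m < 0)
        pu->s2.m = -pu->s2.m;
    return pu->s1.m;
}
```

With u.s1.m set to -5 before the call, f_union(&u) should return 5, because the store through s2 and the final read through s1 refer to the same int inside the union.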

The issue is also one of strict aliasing, which is a complex topic. See the question "What is the strict aliasing rule?" for the details.

For whatever it is worth, GCC 7.1.0 doesn't report the problem even under stringent warning options. Neither does Clang, even with the -Weverything option:

clang -O3 -g -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes \
    -Wstrict-prototypes -Weverything -pedantic …

Jonathan Leffler answered Sep 17 '22 12:09


This is because of the "effective type" rule. If you look at f in isolation, its two arguments have different types, so the compiler is allowed to assume they point to different objects and to optimize accordingly.

Here, p1->m is read twice. If p1 and p2 are assumed to point to different objects, the compiler need not reload p1->m for the return, since the store through p2 cannot have changed it.

f is valid code, and the optimization is valid.
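As a rough illustration of that optimization, the sketch below hand-writes what a compiler is allowed to turn f into when it assumes p1 and p2 do not alias. The name f_as_optimized and the local variable cached are hypothetical; real compilers do this at the machine-code level and may or may not apply it in any given build.

```c
struct t1 { int m; };
struct t2 { int m; };

/* The function from the question, unchanged. */
int f(struct t1 *p1, struct t2 *p2)
{
    if (p1->m < 0)
        p2->m = -p2->m;
    return p1->m;
}

/* Hand-written equivalent under the no-alias assumption:
   p1->m is loaded once and that value is reused for the return. */
int f_as_optimized(struct t1 *p1, struct t2 *p2)
{
    int cached = p1->m;      /* single load of p1->m */
    if (cached < 0)
        p2->m = -p2->m;      /* assumed not to touch p1->m */
    return cached;           /* no reload after the store */
}
```

When the two pointers refer to distinct objects, both functions behave identically. They can differ only when both pointers refer to the same storage, which is exactly the call that the standard declares invalid.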

Calling it with pointers into the same object, as g does, is not valid: without seeing that both pointers may come from the same union, the compiler has no reason to suppress the optimization.

This is one of the cases where the whole burden of proving that a call is valid lies with the caller of the function. In general, no compiler can warn you about this if f and g happen to be in different translation units.

Jens Gustedt answered Sep 16 '22 12:09