I've been writing embedded C code for many years now, and the newer generations of compilers and optimizations have certainly gotten a lot better with respect to their ability to warn about questionable code.
However, there is at least one use case (very common, in my experience) that continues to cause grief: a common base type shared between multiple structs. Consider this contrived example:
#include <stdio.h>

struct Base
{
    unsigned short t; /* identifies the actual structure type */
};

struct Derived1
{
    struct Base b; /* identified by t=1 */
    int i;
};

struct Derived2
{
    struct Base b; /* identified by t=2 */
    double d;
};

struct Derived1 s1 = { .b = { .t = 1 }, .i = 42 };
struct Derived2 s2 = { .b = { .t = 2 }, .d = 42.0 };

void print_val(struct Base *bp)
{
    switch(bp->t)
    {
    case 1:
    {
        struct Derived1 *dp = (struct Derived1 *)bp;
        printf("Derived1 value=%d\n", dp->i);
        break;
    }
    case 2:
    {
        struct Derived2 *dp = (struct Derived2 *)bp;
        printf("Derived2 value=%.1lf\n", dp->d);
        break;
    }
    }
}

int main(int argc, char *argv[])
{
    struct Base *bp1, *bp2;
    bp1 = (struct Base*) &s1;
    bp2 = (struct Base*) &s2;
    print_val(bp1);
    print_val(bp2);
    return 0;
}
Per ISO/IEC 9899, the casts in the code above should be OK, as they rely on the first member of a structure sharing the same address as the containing structure. Clause 6.7.2.1-13 says so:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
The casts from derived to base work fine, but the cast back to the derived type within print_val() generates an alignment warning. However, this is known to be safe, as it is specifically the "vice versa" part of the clause above. The problem is that the compiler simply doesn't know that we've already guaranteed, via other means, that the structure is in fact an instance of the other type.
When compiled with gcc version 9.3.0 (Ubuntu 20.04) using the flags -std=c99 -pedantic -fstrict-aliasing -Wstrict-aliasing -Wcast-align=strict -O3, I get:
alignment-1.c: In function ‘print_val’:
alignment-1.c:30:31: warning: cast increases required alignment of target type [-Wcast-align]
30 | struct Derived1 *dp = (struct Derived1 *)bp;
| ^
alignment-1.c:36:31: warning: cast increases required alignment of target type [-Wcast-align]
36 | struct Derived2 *dp = (struct Derived2 *)bp;
| ^
A similar warning occurs in clang 10.
Rework 1: pointer to pointer
A method used in some circumstances to avoid the alignment warning (when the pointer is known to be aligned, as is the case here) is to use an intermediate pointer-to-pointer. For instance:
struct Derived1 *dp = *((struct Derived1 **)&bp);
However this just trades the alignment warning for a strict aliasing warning, at least on gcc:
alignment-1a.c: In function ‘print_val’:
alignment-1a.c:30:33: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
30 | struct Derived1 *dp = *((struct Derived1 **)&bp);
| ~^~~~~~~~~~~~~~~~~~~~~~~~
The same is true if the cast is done as an lvalue; that is, *((struct Base **)&dp) = bp; also warns in gcc.
Notably, only gcc complains about this one - clang 10 seems to accept this either way without warning, but I'm not sure if that's intentional or not.
Rework 2: union of structures
Another way to rework this code is to use a union. The print_val() function can be rewritten something like:
void print_val(struct Base *bp)
{
    union Ptr
    {
        struct Base b;
        struct Derived1 d1;
        struct Derived2 d2;
    } *u;
    u = (union Ptr *)bp;
    ...
The various structures can be accessed using the union. While this works fine, the cast to a union is still flagged as violating alignment rules, just like the original example.
alignment-2.c:33:9: warning: cast from 'struct Base *' to 'union Ptr *' increases required alignment from 2 to 8 [-Wcast-align]
u = (union Ptr *)bp;
^~~~~~~~~~~~~~~
1 warning generated.
Rework 3: union of pointers
Rewriting the function as follows compiles cleanly in both gcc and clang:
void print_val(struct Base *bp)
{
    union Ptr
    {
        struct Base *bp;
        struct Derived1 *d1p;
        struct Derived2 *d2p;
    } u;
    u.bp = bp;
    switch(u.bp->t)
    {
    case 1:
    {
        printf("Derived1 value=%d\n", u.d1p->i);
        break;
    }
    case 2:
    {
        printf("Derived2 value=%.1lf\n", u.d2p->d);
        break;
    }
    }
}
There seems to be conflicting information out there as to whether this is truly valid. In particular, an older aliasing write-up at https://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html specifically calls out a similar construct as being invalid (see Casting through a union (3) in that link).
In my understanding, because the pointer members of the union all share a common base type, this doesn't actually violate any aliasing rules, because all accesses to struct Base will in fact be done via an object of type struct Base, whether by dereferencing the bp union member or by accessing the b member object of d1p or d2p. Either way it is accessing the member correctly via an object of type struct Base, so as far as I can tell, there is no alias.
Specific Questions:
It seems to me that since this pattern is fairly common in C code (in the absence of true OO constructs like in C++) that it should be more straightforward to do this in a portable way without getting warnings in one form or another.
Thanks in advance!
Update:
Using an intermediate void* may be the "right" way to do this:
struct Derived1 *dp = (void*)bp;
This certainly works, but it really allows any conversion at all, regardless of type compatibility. (I suppose the weaker type system of C is fundamentally to blame for this; what I really want is an approximation of C++ and its static_cast<> operator.)
However, my fundamental question (misunderstanding?) about strict aliasing rules remains:
Why does using a union type and/or pointer-to-pointer violate strict aliasing rules? In other words, what is fundamentally different between what is done in main (taking the address of the b member) and what is done in print_val(), other than the direction of the conversion? Both yield the same situation: two pointers that point to the same memory and have different struct types, a struct Base* and a struct Derived1*.
It would seem to me that if this were violating strict aliasing rules in any way, the introduction of an intermediate void* cast would not change the fundamental problem.
You can avoid the compiler warning by casting to void * first:
struct Derived1 *dp = (struct Derived1 *) (void *) bp;
(After the cast to void *, the conversion to struct Derived1 * is automatic in the above declaration, so you could remove the outer cast.)
The methods of using a pointer-to-a-pointer or a union to reinterpret a pointer are not correct; they violate the aliasing rule, as a struct Derived1 * and a struct Base * are not suitable types for aliasing each other. Do not use those methods.
(Due to C 2018 6.2.5 28, which says “… All pointers to structure types shall have the same representation and alignment requirements as each other…,” an argument can be made that reinterpreting one pointer-to-a-structure as another through a union is supported by the C standard. Footnote 49 says “The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.” At best, however, this is a kludge in the C standard and should be avoided when possible.)
Why does using a union type and/or pointer-to-pointer violate strict aliasing rules? In other words what is fundamentally different between what is done in main (taking address of b member) and what is done in print_val() other than the direction of the conversion? Both yield the same situation - two pointers that point to the same memory, which are different struct types - a struct Base* and a struct Derived1*.
It would seem to me that if this were violating strict aliasing rules in any way, the introduction of an intermediate void* cast would not change the fundamental problem.
The strict aliasing violation occurs in aliasing the pointer, not in aliasing the structure.
If you have a struct Derived1 *dp or a struct Base *bp and you use it to access a place in memory where there actually is a struct Derived1 or, respectively, a struct Base, then there is no aliasing violation because you are accessing an object through an lvalue of its type, which is allowed by the aliasing rule.
However, this question suggested aliasing a pointer. In *((struct Derived1 **)&bp), &bp is the location where there is a struct Base *. This address of a struct Base * is converted to a struct Derived1 ** (the address of a struct Derived1 *), and then * forms an lvalue of type struct Derived1 *. The expression is then used to access a struct Base * using a type of struct Derived1 *. There is no match for that in the aliasing rule; none of the types it lists for accessing a struct Base * is struct Derived1 *.