Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Comparing struct pointers, casting away members, and UB

Consider the following code:

int main()
{
    typedef struct { int first; float second; } type;

    type whole = { 1, 2.0 };
    void * vp = &whole;

    struct { int first; } * shorn = vp;
    printf("values: %d, %d\n", ((type *)vp)->first, shorn->first);
    if (vp == shorn)
        printf("ptrs compare the same\n");

    return 0;
}

Two questions:

  1. Is the pointer equality comparison UB?
  2. Regarding the "shearing" away of the second member on the line that initializes shorn: is it valid C to cast away struct members like this and then dereference the manipulated pointer to access the remaining member?
like image 407
textral Avatar asked Dec 18 '22 12:12

textral


2 Answers

Comparing two pointers with == when one is a void * is well defined.

Section 6.5.9 of the C standard regarding the equality operator == says the following:

2 One of the following shall hold:

  • both operands have arithmetic type;
  • both operands are pointers to qualified or unqualified versions of compatible types;
  • one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void; or
  • one operand is a pointer and the other is a null pointer constant

...

5 Otherwise, at least one operand is a pointer. If one operand is a pointer and the other is a null pointer constant, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.

The usage of shorn->first works because a pointer to a struct can be converted to a pointer to its first member. For both type and the unnamed struct type their first member is an int so it works out.

like image 57
dbush Avatar answered Dec 31 '22 02:12

dbush


Section 6.2.5 Types paragraph 28 of the C standard says:

[...] All pointers to structure types shall have the same representation and alignment requirements as each other. [...]

Section 6.3.2.3 Pointers paragraph 1 says:

A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.

And paragraph 7 says:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned68) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. [...]

And footnote 68 says:

In general, the concept "correctly aligned" is transitive: if a pointer to type A is correctly aligned for a pointer to type B, which in turn is correctly aligned for a pointer to type C, then a pointer to type A is correctly aligned for a pointer to type C.

Because all pointers to structure types have the same representation, the conversions between pointers to void and pointers to structure types must be the same for all pointers to structure types. So it seems that a pointer to structure type A could be converted by a cast operator directly to a pointer to structure type B without an intermediate conversion to a pointer to void as long as the pointer is "correctly aligned" for structure type B. (This may be a weak argument.)

The question remains when, in the case of two structure types A and B where the initial sequence of structure type A consists of all the members of structure type B, a pointer to structure type A is guaranteed to be correctly aligned for structure type B (the reverse is obviously not guaranteed). As far as I can tell, the C standard makes no such guarantee. So strictly speaking, a pointer to the larger structure type A might not be correctly aligned for the smaller structure type B, and if it is not, the behavior is undefined. For a "sane" compiler, the larger structure type A would not have weaker alignment than the smaller structure type B, but for an "insane" compiler, that might not be the case.


Regarding the second question about accessing members of the truncated (shorter) structure using the pointer derived from the full (longer) structure, then as long as the pointer is correctly aligned for the shorter structure (see above for why that might not be true for an "insane" compiler), and as long as strict aliasing rules are avoided (for example, by going through an intermediate pointer to void in an intermediate external function call across compilation unit boundaries), then accessing the members through the pointer to the shorter structure type should be perfectly fine. There is a special guarantee for that when objects of both structure types appear as members of the same union type. Section 6.3.2.3 Structure and union members paragraph 6 says:

One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

However, since the offsets of members within a structure type does not depend on whether an object of the structure type appears in a union type or not, the above implies that any structures with a common initial sequence of members will have those common members at the same offsets within their respective structure types.

like image 41
Ian Abbott Avatar answered Dec 31 '22 00:12

Ian Abbott