
Is full followed by partial initialization of a subobject undefined behavior?

Consider the following struct initialization:

#include <stdio.h>

struct bar {
    int b;
    int a;
    int r;
};

struct foo {
    struct bar bar;
};

int main(int argc, char **argv) {
    struct bar b = {1, 2, 3};
    struct foo f = {.bar = b, .bar.a = 5 };
    // should this print "1, 5, 3", "1, 5, 0", or "0, 5, 0"?
    // clang on Mac prints "1, 5, 3", while gcc on Ubuntu prints "0, 5, 0" 
    printf("%d, %d, %d\n", f.bar.b, f.bar.a, f.bar.r);

    return 0;
}

The C11 standard seems to do a quite poor job of describing what behavior should be expected here in section 6.7.9, but seems to think it's doing a reasonable job, as I don't see any warnings regarding undefined behavior in this case either.

In practice, it seems the behavior is either not standardized or the standard is violated by at least one common compiler, with clang/llvm 8.0.0 on a Mac producing "1, 5, 3", and gcc 5.4 on Ubuntu producing "0, 5, 0".

According to the C standard, should f.bar.b and f.bar.r be well defined at this point, or does this initialization result in undefined or unspecified behavior?

Theodore Murdock asked Dec 01 '16 21:12


2 Answers

The C11 standard seems to do a quite poor job of describing what behavior should be expected here in section 6.7.9,

Standardese can be difficult to read, but I don't think this area of the standard is worse in that respect than should be expected.

but seems to think it's doing a reasonable job, as I don't see any warnings regarding undefined behavior in this case either.

The standard is not required to explicitly declare undefined behaviors. Indeed, the standard contains a blanket statement that wherever it does not define behavior for a given piece of code, that code's behavior is undefined. Nevertheless, I do think section 6.7.9 covers this area pretty thoroughly. The main area left open is this:

The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

(C2011, 6.7.9/23)

That doesn't present any problem for your example.

In practice, it seems the behavior is either not standardized or the standard is violated by at least one common compiler, with clang/llvm on a Mac producing "1, 5, 3", and gcc on Ubuntu producing "0, 5, 0".

I'm completely prepared to believe that one or the other of those is non-conforming in this area. However, do also pay attention to compiler versions and compilation options -- they may be compiling for different versions of the standard, with or without extensions.

According to the C standard, should f.bar.b and f.bar.r be well defined at this point, or does this initialization result in undefined or unspecified behavior?

If the declaration of an object has an associated initializer then the whole object is initialized, and furthermore, the resulting initial value is well-defined by the standard, subject to caveats arising from 6.7.9/23. As for the initial values required of a conforming implementation in your example, the key provisions are these:

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

(C2011, 6.7.9/19; emphasis added)

Each designator list begins its description with the current object associated with the closest surrounding brace pair. Each item in the designator list (in order) specifies a particular member of its current object and changes the current object for the next designator (if any) to be that member. The current object that results at the end of the designator list is the subobject to be initialized by the following initializer.

(C2011, 6.7.9/18; emphasis added)

If the aggregate or union contains elements or members that are aggregates or unions, these rules apply recursively to the subaggregates or contained unions.

(C2011, 6.7.9/20)

Thus, given f's initializer,

    struct foo f = {.bar = b, .bar.a = 5 };

we first process the element .bar = b, as required by 6.7.9/19. It contains a designator list designating f.bar, of type struct bar, as the object to initialize from the following initializer. That initializer exercises the option of being "a single expression that has compatible structure or union type", per 6.7.9/13, so the initial value of f.bar is the value of b, subject to partial or full override by subsequent initializers.

We next process the second element, .bar.a = 5. This initializes f.bar.a and only that subobject, per 6.7.9/18, overriding the initialization specified by the previous initializer per 6.7.9/19.

A conforming initialization thus results in printing

1, 5, 3

GCC seems to be failing by re-initializing all of f.bar when it processes the second initializer, instead of only f.bar.a.

John Bollinger answered Oct 18 '22 15:10


The C Standard says (6.7.9 Initialization):

17 Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union (footnote 148). In contrast, a designation causes the following initializer to begin initialization of the subobject described by the designator. Initialization then continues forward in order, beginning with the next subobject after that described by the designator.

And

19 The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject (footnote 151); all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

This footnote is important

148) If the initializer list for a subaggregate or contained union does not begin with a left brace, its subobjects are initialized as usual, but the subaggregate or contained union does not become the current object: current objects are associated only with brace-enclosed initializer lists.

Thus I see neither undefined nor unspecified behavior.

In my opinion the result should look like { 1, 5, 3 }.

Leaving the Standard aside, it is reasonable to first initialize the memory with the default initializers and then overwrite it with the explicit initializers.

Vlad from Moscow answered Oct 18 '22 15:10