Returning a local partially initialized struct from a function and undefined behavior

(By partially initialized I mean a struct defined without an initializer in which some, but not all, of the members have then been assigned a valid value. By local I mean defined with automatic storage duration. This question is only about such objects.)

Using an uninitialized automatic variable that could have been declared with register as an rvalue is undefined behavior. Structs can be declared with the register storage-class specifier.

6.3.2.1

  2. If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

Note that it specifically says "and no assignment to it has been performed".

Additionally, we know that the value of a struct is never a trap representation:

6.2.6.1.

  6. The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.

Thus returning an uninitialized struct is clearly undefined behavior.

Statement: Returning an uninitialized struct that has had one of its members assigned a valid value is defined.

Example for easier comprehension:

struct test
{
    int a;
    int b;
};

struct test Get( void )
{
    struct test g;
    g.a = 123;
    return g;
}

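int main( void )
{
    /* Copies the whole struct out of Get(); t.b is still indeterminate here. */
    struct test t = Get();
    (void)t;
    return 0;
}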

I just happened to focus on returning, but I believe this should apply to a simple assignment as well, without any difference.

Is my statement correct?

2501 asked Feb 18 '16 20:02


2 Answers

Aside from the detail of returning the value from a function, this is precisely the subject of Defect Report 222, submitted in 2000 by Clive Feather, and the resolution of that DR seems to answer the question pretty clearly: returning a partially initialized struct is well-defined, although the values of the uninitialized members may not be used.

The resolution to the DR clarified that struct and union objects do not have trap representations (which was explicitly added to §6.2.6.1/6). Consequently, member-by-member copying cannot be used on an architecture in which the individual members might trap. No explicit statement to this effect was added to the standard, presumably for parsimony, but footnote 42 (now footnote 51), which previously mentioned the possibility of member-by-member copying, was replaced by a much weaker statement indicating that padding bits need not be copied.

The minutes of the WG14 meeting (Toronto, October 2000) are clear (emphasis added):

DR222 - Partially-initialized structures

This DR asks the question of whether or not struct assignment is well defined when the source of the assignment is a struct, some of whose members have not been given a value. There was consensus that this should be well defined because of common usage, including the standard-specified structure struct tm. There was also consensus that if assignment with some members uninitialized (and thus possibly having a trap value) was being made well defined, there was little value in requiring that at least one member had been properly given a value.
Therefore the notion that the value of a struct or union as a whole can have a trap value is being removed.

It's interesting to note that in the above minutes, the committee held that it was not even necessary that a single member of the struct had been given a value. However, that requirement was later reinstated in some cases, with the resolution to DR338 (see below).
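The struct tm usage mentioned in the minutes might look like the sketch below. The function names make_date and weekday_of are made up for illustration, and the sketch assumes that mktime only reads the fields assigned here (the standard says the input values of tm_wday and tm_yday are ignored; any extra platform-specific members are assumed to be ignored as well).

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: fills in only the fields mktime reads. */
struct tm make_date(int year, int month, int day)
{
    struct tm when;              /* deliberately no initializer */
    when.tm_year  = year - 1900;
    when.tm_mon   = month - 1;
    when.tm_mday  = day;
    when.tm_hour  = 12;          /* noon keeps the date away from DST edges */
    when.tm_min   = 0;
    when.tm_sec   = 0;
    when.tm_isdst = -1;          /* let the library decide about DST */
    /* tm_wday and tm_yday are never assigned. */
    return when;                 /* struct copy with unset members */
}

/* Hypothetical helper: returns 0 (Sunday) through 6 (Saturday). */
int weekday_of(int year, int month, int day)
{
    struct tm copy = make_date(year, month, day);
    time_t t = mktime(&copy);    /* normalizes and fills tm_wday, tm_yday */
    assert(t != (time_t)-1);
    return copy.tm_wday;
}
```

Calling weekday_of(2016, 2, 18) should give 4 (a Thursday). The unset members are only read after mktime has written them.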

In summary:

  • If an automatic aggregate object has been at least partially initialized or if its address has been taken (thereby rendering it not suitable for a register declaration as per §6.3.2.1/2), then lvalue-to-rvalue conversion of that object is well-defined.

  • Such an object can be assigned to another aggregate object of the same type, possibly after having been returned from a function, without invoking undefined behaviour.

  • Reading the uninitialized members in the copy is either undefined or indeterminate, depending on whether trap representations are possible. (A read through a pointer to an unsigned narrow character type cannot trap, for example.) But if you write the member before reading it, you're fine.
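The three points above can be sketched as follows. This is only an illustration of the reading given in this answer; get_partial and sum_after_fixup are hypothetical names, and the struct mirrors the one in the question.

```c
#include <assert.h>

struct test {
    int a;
    int b;
};

/* Only `a` is assigned; `b` stays indeterminate. */
struct test get_partial(void)
{
    struct test g;
    g.a = 123;
    return g;                        /* lvalue-to-rvalue conversion of g */
}

/* Hypothetical helper: copies the struct around, then writes `b`
   before anything reads it. */
int sum_after_fixup(void)
{
    struct test t = get_partial();   /* copy out of the function */
    struct test u;
    u = t;                           /* plain struct assignment */
    u.b = 1;                         /* write the member before reading it */
    assert(u.a == 123);
    return u.a + u.b;                /* both members now hold known values */
}
```

Reading u.b before the assignment u.b = 1 is the part that would be a problem, not the copies themselves.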

I don't believe there is any theoretical difference between assignment of union and struct objects. Obviously unions cannot be copied member by member (what would that even mean?), and the fact that some inactive member happens to have a trap representation is irrelevant, even if that member is not aliased by any other element. There's no obvious reason why a struct should be any different.
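A small sketch of the union case, assuming a made-up union type value and helper copy_union_tag. Only one byte of the union is ever written, yet the object is copied as a whole.

```c
#include <assert.h>

union value {
    unsigned char tag;
    double d;                  /* never written in this sketch */
};

/* Hypothetical helper: copies a union whose larger member was never set. */
int copy_union_tag(void)
{
    union value u;
    u.tag = 1;                 /* only the first byte of the union is assigned */
    union value v = u;         /* whole-object copy; no per-member copy exists */
    assert(v.tag == 1);
    return v.tag;              /* reads only the member that was written */
}
```

The bytes that make up the rest of d are indeterminate in both u and v, but nothing here reads them as a double.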

Finally, with respect to the exception in §6.3.2.1/2: this was added as a result of the resolution to DR 338. The gist of that DR is that some hardware (IA64) can trap the use of an uninitialized value in a register. C99 does not permit trap representations for unsigned chars. So on such hardware, it might not be possible to maintain an automatic variable in a register without "unnecessarily" initializing the register.

The resolution to DR 338 specifically marks as undefined behaviour the use of uninitialized values in automatic variables which could conceivably be stored in registers (i.e., those whose address has never been taken, as though declared register), thus permitting the compiler to keep an automatic unsigned char in a register without worrying about the previous contents of that register.

As a side effect of DR 338, it appears that completely uninitialized automatic structs whose address has never been taken cannot undergo lvalue-to-rvalue conversion. I don't know if that side-effect was fully contemplated in the resolution to DR 338, but it does not apply in the case of a partially initialized struct, as in this question.
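To make the boundary concrete, here is a sketch of the cases as I read them. The function names are hypothetical. The case that falls under the DR 338 wording is left commented out, since it would be undefined under that reading.

```c
#include <assert.h>

struct test {
    int a;
    int b;
};

/* Partially initialized by assignment: the case in the question. */
struct test partial(void)
{
    struct test g;
    g.a = 1;
    return g;
}

/* Declared with an initializer: members not named are zero-initialized. */
struct test with_initializer(void)
{
    struct test g = { .a = 1 };
    return g;
}

/* No initializer, no assignment, address never taken: this is the case
   that the 6.3.2.1/2 wording appears to make undefined.

   struct test untouched(void) { struct test g; return g; }
*/

/* Hypothetical helper that exercises the two defined cases. */
int check_cases(void)
{
    struct test p = partial();
    struct test q = with_initializer();
    assert(p.a == 1);              /* p.b must not be read */
    assert(q.a == 1 && q.b == 0);  /* q.b is zero because of the initializer */
    return q.b;
}
```

If every member needs a known value, the initializer form is the simplest way to get it.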

rici answered Oct 22 '22 03:10


Your statement about 6.3.2.1 is correct: if the object designated by the lvalue is uninitialized, then the behavior is undefined.

So the question then is whether your struct is to be regarded as uninitialized or not. You do assign a value to one of the members, so there has been an assignment to the object. As per the cited 6.3.2.1, that would mean that you cannot regard the struct as a whole as uninitialized. That particular member is clearly initialized, even though the other members are not.

There is however another case of undefined behavior, and that is when storing a trap representation into the lvalue:

6.2.6.1/5
Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.

The text you cited in 6.2.6.1/6 says that the struct itself cannot be a trap representation, even though its individual members may be trap representations. If they are, then the assignment would be undefined behavior as per the above.
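The character-type exception in 6.2.6.1/5 can be sketched like this. The helper read_a_via_bytes is a made-up name, and the sketch assumes only that unsigned char has no trap representations. It reads every byte of the struct, including the bytes of the unassigned member, and then rebuilds a from the copied bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct test {
    int a;
    int b;
};

/* Hypothetical helper: inspects the object representation byte by byte. */
int read_a_via_bytes(void)
{
    struct test t;
    t.a = 123;                          /* t.b is never assigned */

    unsigned char raw[sizeof t];
    const unsigned char *src = (const unsigned char *)&t;
    for (size_t i = 0; i < sizeof t; i++)
        raw[i] = src[i];                /* character-type reads, bytes of t.b included */

    int a;
    memcpy(&a, raw + offsetof(struct test, a), sizeof a);
    assert(a == 123);
    return a;
}
```

The bytes copied from t.b hold unspecified values, so the sketch never interprets them as an int.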

But note the "may be trap". It is not certain that they are trap representations, because they have indeterminate values. Take a look at the basics:

6.7.9/10
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

and

3.19.2/1
indeterminate value
either an unspecified value or a trap representation

Using a variable with an indeterminate value is only undefined behavior if the value is a trap representation.

Whether the uninitialized member variables of your struct will contain unspecified values or trap representations is implementation-defined behavior.

If the variable with indeterminate value simply has an unspecified value, then 6.2.6.1/5 does not apply and there is no undefined behavior.

Conclusion: if the implementation states that any indeterminate value for any of the struct members is a trap representation, the behavior is undefined. Otherwise, the behavior is merely implementation-defined/unspecified: the uninitialized members will hold unspecified values.

Lundin answered Oct 22 '22 03:10