
Confusion about array initialization in C

In C, if I initialize an array like this:

int a[5] = {1,2}; 

then all the elements of the array that are not initialized explicitly will be initialized implicitly with zeros.
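As a quick check of that rule, here is a minimal sketch. The helper name `partial_init` is made up for illustration; it copies the array out so a caller can inspect every element.

```c
#include <string.h>

/* Hypothetical helper: builds the partially initialized array and copies it to out. */
void partial_init(int out[5])
{
    int a[5] = {1, 2};          /* a[2], a[3] and a[4] have no initializer, so they become 0 */
    memcpy(out, a, sizeof a);   /* sizeof a is 5 * sizeof(int) because a is a true array here */
}
```

Calling `partial_init(out)` and printing the five elements should show `1 2 0 0 0`.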

But, if I initialize an array like this:

int a[5] = {a[2] = 1};
printf("%d %d %d %d %d\n", a[0], a[1], a[2], a[3], a[4]);

output:

1 0 1 0 0 

I don't understand why a[0] prints 1 instead of 0. Is it undefined behaviour?

Note: This question was asked in an interview.

asked Sep 13 '18 05:09 by msc



1 Answer

TL;DR: I don't think the behavior of int a[5]={a[2]=1}; is well defined, at least in C99.

The funny part is that the only bit that makes sense to me is the part you're asking about: a[0] is set to 1 because the assignment operator returns the value that was assigned. It's everything else that's unclear.
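The point about the assignment operator can be shown in isolation, away from any initializer. This is only a sketch; `assign_and_report` is a name invented for this example.

```c
/* Hypothetical helper: stores v through slot and returns the value of the
   assignment expression itself. */
int assign_and_report(int *slot, int v)
{
    return (*slot = v);   /* an assignment expression has the value that was stored */
}
```

After `int x = 0; int r = assign_and_report(&x, 1);` both `x` and `r` hold 1, which is the same mechanism that hands 1 to a[0] in the question.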

If the code had been int a[5] = { [2] = 1 }, everything would've been easy: That's a designated initializer setting a[2] to 1 and everything else to 0. But with { a[2] = 1 } we have a non-designated initializer containing an assignment expression, and we fall down a rabbit hole.
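For comparison, the designated-initializer form can be sketched as follows. The wrapper function `designated_init` is hypothetical and exists only so the result can be inspected.

```c
#include <string.h>

/* Hypothetical helper: uses a C99 designated initializer and copies the result to out. */
void designated_init(int out[5])
{
    int a[5] = { [2] = 1 };     /* only a[2] is named; every other element becomes 0 */
    memcpy(out, a, sizeof a);
}
```

Printing the elements of `out` after the call should give `0 0 1 0 0`.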


Here's what I've found so far:

  • a must be a local variable.

    6.7.8 Initialization

    4. All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals.

    a[2] = 1 is not a constant expression, so a must have automatic storage duration.

  • a is in scope in its own initialization.

    6.2.1 Scopes of identifiers

    7. Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator.

    The declarator is a[5], so variables are in scope in their own initialization.

  • a is alive in its own initialization.

    6.2.4 Storage durations of objects

    4. An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration.

    5. For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

  • There is a sequence point after a[2]=1.

    6.8 Statements and blocks

    4. A full expression is an expression that is not part of another expression or of a declarator. Each of the following is a full expression: an initializer; the expression in an expression statement; the controlling expression of a selection statement (if or switch); the controlling expression of a while or do statement; each of the (optional) expressions of a for statement; the (optional) expression in a return statement. The end of a full expression is a sequence point.

    Note that e.g. in int foo[] = { 1, 2, 3 } the { 1, 2, 3 } part is a brace-enclosed list of initializers, each of which has a sequence point after it.

  • Initialization is performed in initializer list order.

    6.7.8 Initialization

    17. Each brace-enclosed initializer list has an associated current object. When no designations are present, subobjects of the current object are initialized in order according to the type of the current object: array elements in increasing subscript order, structure members in declaration order, and the first named member of a union. [...]

    19. The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

  • However, initializer expressions are not necessarily evaluated in order.

    6.7.8 Initialization

    23. The order in which any side effects occur among the initialization list expressions is unspecified.
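Two of the points in the list above can be exercised with code that does not touch the open questions: a name is usable inside its own initializer, and the evaluation order of initializer expressions is not fixed. The following is a sketch only; `struct node`, `self_reference_ok`, `next_id` and `two_ids` are names invented for illustration.

```c
struct node {
    int value;
    struct node *next;
};

/* Returns 1 if a node can point at itself from inside its own initializer. */
int self_reference_ok(void)
{
    struct node n = { 42, &n };   /* n is in scope here; &n takes its address without reading it */
    return n.next == &n && n.value == 42;
}

static int counter;

/* Side effect: bumps the counter and returns the new value. */
static int next_id(void)
{
    return ++counter;
}

/* Fills out with the results of two calls made from one initializer list. */
void two_ids(int out[2])
{
    counter = 0;
    int ids[2] = { next_id(), next_id() };   /* which call runs first is unspecified */
    out[0] = ids[0];
    out[1] = ids[1];
}
```

Because the order of the two calls is unspecified, `out` may come back as `1 2` or as `2 1`. Portable code can rely only on the two values being 1 and 2 in some order.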

However, that still leaves some questions unanswered:

  • Are sequence points even relevant? The basic rule is:

    6.5 Expressions

    2. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

    a[2] = 1 is an expression, but initialization is not.

    This is slightly contradicted by Annex J:

    J.2 Undefined behavior

    • Between two sequence points, an object is modified more than once, or is modified and the prior value is read other than to determine the value to be stored (6.5).

    Annex J says any modification counts, not just modifications by expressions. But given that annexes are non-normative, we can probably ignore that.

  • How are the subobject initializations sequenced with respect to initializer expressions? Are all initializers evaluated first (in some order), then the subobjects are initialized with the results (in initializer list order)? Or can they be interleaved?


I think int a[5] = { a[2] = 1 } is executed as follows:

  1. Storage for a is allocated when its containing block is entered. The contents are indeterminate at this point.
  2. The (only) initializer is executed (a[2] = 1), followed by a sequence point. This stores 1 in a[2] and returns 1.
  3. That 1 is used to initialize a[0] (the first initializer initializes the first subobject).

But here things get fuzzy because the remaining elements (a[1], a[2], a[3], a[4]) are supposed to be initialized to 0, but it's not clear when: Does it happen before a[2] = 1 is evaluated? If so, a[2] = 1 would "win" and overwrite a[2], but would that assignment have undefined behavior because there is no sequence point between the zero initialization and the assignment expression? Are sequence points even relevant (see above)? Or does zero initialization happen after all initializers are evaluated? If so, a[2] should end up being 0.
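To make the two candidate orderings concrete, each can be modeled with ordinary, well-defined assignments. This is purely an illustration of the question being asked, not a claim about what any compiler does; the function names `ordering_zero_first` and `ordering_zero_last` are made up.

```c
/* Ordering A (hypothetical): the implicit zeroing of a[1]..a[4] happens first,
   then a[2] = 1 is evaluated, then its value initializes a[0]. */
void ordering_zero_first(int a[5])
{
    for (int i = 1; i < 5; i++)
        a[i] = 0;
    int v = (a[2] = 1);   /* overwrites the zero already stored in a[2] */
    a[0] = v;
}

/* Ordering B (hypothetical): a[2] = 1 is evaluated first, its value initializes
   a[0], and only then are a[1]..a[4] zeroed. */
void ordering_zero_last(int a[5])
{
    int v = (a[2] = 1);
    a[0] = v;
    for (int i = 1; i < 5; i++)
        a[i] = 0;         /* wipes out the 1 stored in a[2] */
}
```

Ordering A leaves `1 0 1 0 0`, which matches the output in the question. Ordering B leaves `1 0 0 0 0`.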

Because the C standard does not clearly define what happens here, I believe the behavior is undefined (by omission).
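If the goal is simply to end up with 1 0 1 0 0, one way to sidestep the question is to separate initialization from assignment. A minimal sketch, with `explicit_init` as a hypothetical wrapper name:

```c
#include <string.h>

/* Hypothetical helper: zero-initializes the array, then assigns in a separate statement. */
void explicit_init(int out[5])
{
    int a[5] = {0};             /* every element starts at 0 */
    a[0] = a[2] = 1;            /* plain chained assignment: a[2] gets 1, then a[0] gets that value */
    memcpy(out, a, sizeof a);
}
```

Here initialization is complete before the assignment statement runs, so none of the ordering questions above arise.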

answered Oct 22 '22 14:10 by melpomene