
Why is this non-null-terminated string printed correctly?

Yesterday, I had my unit test. One of the programs was to copy a string and find its length without using the string functions. This was the code I wrote:

#include <stdio.h>

int main(){
    char str1[100], str2[100] = {'a'};

    printf("Enter a string\n");
    fgets(str1, sizeof(str1), stdin);

    int i;
    for(i = 0; str1[i] != '\0'; i++){
        str2[i] = str1[i];
    }   

    str2[i] = '\0';

    printf("Copied string = %s", str2);

    printf("Length of string = %d", i-1);
}

I made a rather surprising observation! Even if I commented out str2[i] = '\0', the string would be printed correctly, i.e., without the extra 'a's from the initialization, which, as far as I know, should not have been overwritten.

After commenting out str2[i] = '\0', I expected to see this output:

test
Copied string = testaaaaaaaaaaaaaaaaaaaaaaaaaaa....
Length of string = 4

This is the output:

test
Copied string = test
Length of string = 4

How is str2 printed correctly? Is it that the compiler recognized the string copy and silently added the null terminator? I am using gcc, but clang produces similar output.

Hemil asked Apr 03 '19 05:04



2 Answers

char str2[100] = {'a'}; does not fill str2 with 100 repeated 'a' characters. It sets only str2[0] to 'a' and initializes the rest to zero.

As far back as C89:

3.5.7 Initialization

...

Semantics

...

If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant. If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

...

If there are fewer initializers in a list than there are members of an aggregate, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

Mark Tolonen answered Sep 21 '22 02:09



First, the rule of initialization for aggregate types[1], quoting C11, chapter 6.7.9 (emphasis mine):

The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.

and,

If an object that has static or thread storage duration is not initialized explicitly, then:

  • if it has pointer type, it is initialized to a null pointer;

  • if it has arithmetic type, it is initialized to (positive or unsigned) zero;

  • if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

  • if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

Now, an initialization statement like

char str2[100] = {'a'};

will initialize str2[0] to 'a', and str2[1] through str2[99] with 0, according to the above rule. That 0 value is the null-terminator for strings.

Thus, any string you store there that is shorter than the array (that is, one that occupies at most the first length-1 elements) is automatically followed by a null terminator, as long as the trailing elements have not been overwritten since initialization.

So you can use the array as a string and get the expected string behavior.


[1]: Aggregate types:

According to chapter 6.2.5/P21

[...] Array and structure types are collectively called aggregate types.

Sourav Ghosh answered Sep 21 '22 02:09
