
Printing a Local String in C

Tags: c

I was confused by this C code in a test question:

#include <stdio.h>

union Data {
    char var1;
    char var2;
    char varArr[10];
};

int main() {
    union Data data;
    data.var1 = 'a';
    data.var2 = 'b';
    data.varArr[0] = 'c';
    char ctr[3];
    ctr[0] = data.var1;
    ctr[1] = data.var2;
    ctr[2] = data.varArr[0];
    printf("Result: %s\n", ctr);
    return 0;
}

The question asked what this code would print, and the obvious answer was ccc. What bugged me is that the array has length 3 and every element is c, so there is no \0 at the end.

I thought that maybe it could print some garbage sometimes and other times it would work as expected, but after trying a few times, I saw that it always works as expected.

After that, I decided to go out of bounds and check what the bytes following the array contained, specifically ctr[3], ctr[4] and ctr[5]. I noticed that these bytes are generally, but not always, \0. When they are not \0, they generally hold small garbage values like 0x01 or 0x04 (are they really garbage values in this case? I don't know), but ccc was printed in those cases as well.

I have no clue why this happens. I first thought that some word-size chunk allocation might cause the following bytes to be zero, but seeing that they could sometimes be nonzero left me clueless.

After that, I tried to simplify the code:

#include <stdio.h>

int main() {
    char ctr[3];
    ctr[0] = 'c';
    ctr[1] = 'c';
    ctr[2] = 'c';
    printf("Result: %s\n", ctr);
    return 0;
}

This code printed cccX, where X is some garbage that changed in both length and content between runs. So the two programs differ in memory because of the local data variable in the first one, but I couldn't figure out how.

Why and how does this happen?

I used gcc 13 as my compiler on a 64-bit machine, if it helps.

PS: I was kind of expecting an explanation in terms of what is allocated in memory and in what order, instead of just calling it undefined behavior. The local variables are close to each other in memory, afaik. Maybe whatever is written in the place of data affected the printing of ctr in some way. But if UB is as far as the answer can go, I would also like to know why. Thanks for the UB explanations :)

pnguib asked Oct 31 '25 14:10


2 Answers

Yes, the values of uninitialized variables are indeterminate. They may be 0, but they might be any other value valid for the data type. If you use %s with printf without a precision (such as %.3s) that limits how many bytes are read, not supplying a null-terminated string leads to undefined behaviour.

Reading out of bounds on an array also invokes UB.

One of the most dangerous UB situations is when your program appears to work normally. This can lead to faulty assumptions about expected behaviour, and in my experience those expectations are always dashed at the worst possible times.

Chris answered Nov 03 '25 05:11


%s tells printf() that the corresponding argument is to be treated as a string (a 0-terminated sequence of char), which ctr is not, resulting in an unexpected (undefined) outcome.

Madagascar answered Nov 03 '25 05:11


