Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What happens to a declared, uninitialized variable in C? Does it have a value?

If in C I write:

int num;

Before I assign anything to num, is the value of num indeterminate?

like image 730
atp Avatar asked Oct 20 '09 21:10

atp


People also ask

What is the value of uninitialized variable in C?

INTRODUCTION: An uninitialized variable has an undefined value, often corresponding to the data that was already in the particular memory location that the variable is using.

Is uninitialized variables null in C?

In C, variables with static storage duration that are not initialized explicitly are initialized to zero (or null, for pointers).

What happens if you don't initialize a variable?

If you don't initialize an variable that's defined inside a function, the variable value remain undefined. That means the element takes on whatever value previously resided at that location in memory.

What is the value in a variable declared but not initialized in C programming?

If a variable is declared but not initialized or uninitialized and if those variables are trying to print, then, it will return 0 or some garbage value. Whenever we declare a variable, a location is allocated to that variable.


5 Answers

Static variables (file scope and function static) are initialized to zero:

int x; // zero
int y = 0; // also zero

void foo() {
    static int x; // also zero
}

Non-static variables (local variables) are indeterminate. Reading them prior to assigning a value results in undefined behavior.

void foo() {
    int x;
    printf("%d", x); // the compiler is free to crash here
}

In practice, they tend to just have some nonsensical value in there initially - some compilers may even put in specific, fixed values to make it obvious when looking in a debugger - but strictly speaking, the compiler is free to do anything from crashing to summoning demons through your nasal passages.

As for why it's undefined behavior instead of simply "undefined/arbitrary value", there are a number of CPU architectures that have additional flag bits in their representation for various types. A modern example would be the Itanium, which has a "Not a Thing" bit in its registers; of course, the C standard drafters were considering some older architectures.

Attempting to work with a value with these flag bits set can result in a CPU exception in an operation that really shouldn't fail (eg, integer addition, or assigning to another variable). And if you go and leave a variable uninitialized, the compiler might pick up some random garbage with these flag bits set - meaning touching that uninitialized variable may be deadly.

like image 173
bdonlan Avatar answered Sep 27 '22 20:09

bdonlan


0 if static or global, indeterminate if storage class is auto

C has always been very specific about the initial values of objects. If global or static, they will be zeroed. If auto, the value is indeterminate.

This was the case in pre-C89 compilers and was so specified by K&R and in DMR's original C report.

This was the case in C89, see section 6.5.7 Initialization.

If an object that has automatic storage duration is not initialized explicitely, its value is indeterminate. If an object that has static storage duration is not initialized explicitely, it is initialized implicitely as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.

This was the case in C99, see section 6.7.8 Initialization.

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
— if it has pointer type, it is initialized to a null pointer;
— if it has arithmetic type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules;
— if it is a union, the first named member is initialized (recursively) according to these rules.

As to what exactly indeterminate means, I'm not sure for C89, C99 says:

3.17.2
indeterminate value

either an unspecified value or a trap representation

But regardless of what standards say, in real life, each stack page actually does start off as zero, but when your program looks at any auto storage class values, it sees whatever was left behind by your own program when it last used those stack addresses. If you allocate a lot of auto arrays you will see them eventually start neatly with zeroes.

You might wonder, why is it this way? A different SO answer deals with that question, see: https://stackoverflow.com/a/2091505/140740

like image 25
DigitalRoss Avatar answered Sep 23 '22 20:09

DigitalRoss


It depends on the storage duration of the variable. A variable with static storage duration is always implicitly initialized with zero.

As for automatic (local) variables, an uninitialized variable has indeterminate value. Indeterminate value, among other things, mean that whatever "value" you might "see" in that variable is not only unpredictable, it is not even guaranteed to be stable. For example, in practice (i.e. ignoring the UB for a second) this code

int num;
int a = num;
int b = num;

does not guarantee that variables a and b will receive identical values. Interestingly, this is not some pedantic theoretical concept, this readily happens in practice as consequence of optimization.

So in general, the popular answer that "it is initialized with whatever garbage was in memory" is not even remotely correct. Uninitialized variable's behavior is different from that of a variable initialized with garbage.

like image 34
AnT Avatar answered Sep 24 '22 20:09

AnT


Ubuntu 15.10, Kernel 4.2.0, x86-64, GCC 5.2.1 example

Enough standards, let's look at an implementation :-)

Local variable

Standards: undefined behavior.

Implementation: the program allocates stack space, and never moves anything to that address, so whatever was there previously is used.

#include <stdio.h>
int main() {
    int i;
    printf("%d\n", i);
}

compile with:

gcc -O0 -std=c99 a.c

outputs:

0

and decompiles with:

objdump -dr a.out

to:

0000000000400536 <main>:
  400536:       55                      push   %rbp
  400537:       48 89 e5                mov    %rsp,%rbp
  40053a:       48 83 ec 10             sub    $0x10,%rsp
  40053e:       8b 45 fc                mov    -0x4(%rbp),%eax
  400541:       89 c6                   mov    %eax,%esi
  400543:       bf e4 05 40 00          mov    $0x4005e4,%edi
  400548:       b8 00 00 00 00          mov    $0x0,%eax
  40054d:       e8 be fe ff ff          callq  400410 <printf@plt>
  400552:       b8 00 00 00 00          mov    $0x0,%eax
  400557:       c9                      leaveq
  400558:       c3                      retq

From our knowledge of x86-64 calling conventions:

  • %rdi is the first printf argument, thus the string "%d\n" at address 0x4005e4

  • %rsi is the second printf argument, thus i.

    It comes from -0x4(%rbp), which is the first 4-byte local variable.

    At this point, rbp is in the first page of the stack has been allocated by the kernel, so to understand that value we would to look into the kernel code and find out what it sets that to.

    TODO does the kernel set that memory to something before reusing it for other processes when a process dies? If not, the new process would be able to read the memory of other finished programs, leaking data. See: Are uninitialized values ever a security risk?

We can then also play with our own stack modifications and write fun things like:

#include <assert.h>

int f() {
    int i = 13;
    return i;
}

int g() {
    int i;
    return i;
}

int main() {
    f();
    assert(g() == 13);
}

Note that GCC 11 seems to produce a different assembly output, and the above code stops "working", it is undefined behavior after all: Why does -O3 in gcc seem to initialize my local variable to 0, while -O0 does not?

Local variable in -O3

Implementation analysis at: What does <value optimized out> mean in gdb?

Global variables

Standards: 0

Implementation: .bss section.

#include <stdio.h>
int i;
int main() {
    printf("%d\n", i);
}

gcc -O0 -std=c99 a.c

compiles to:

0000000000400536 <main>:
  400536:       55                      push   %rbp
  400537:       48 89 e5                mov    %rsp,%rbp
  40053a:       8b 05 04 0b 20 00       mov    0x200b04(%rip),%eax        # 601044 <i>
  400540:       89 c6                   mov    %eax,%esi
  400542:       bf e4 05 40 00          mov    $0x4005e4,%edi
  400547:       b8 00 00 00 00          mov    $0x0,%eax
  40054c:       e8 bf fe ff ff          callq  400410 <printf@plt>
  400551:       b8 00 00 00 00          mov    $0x0,%eax
  400556:       5d                      pop    %rbp
  400557:       c3                      retq
  400558:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  40055f:       00

# 601044 <i> says that i is at address 0x601044 and:

readelf -SW a.out

contains:

[25] .bss              NOBITS          0000000000601040 001040 000008 00  WA  0   0  4

which says 0x601044 is right in the middle of the .bss section, which starts at 0x601040 and is 8 bytes long.

The ELF standard then guarantees that the section named .bss is completely filled with of zeros:

.bss This section holds uninitialized data that contribute to the program’s memory image. By definition, the system initializes the data with zeros when the program begins to run. The section occu- pies no file space, as indicated by the section type, SHT_NOBITS.

Furthermore, the type SHT_NOBITS is efficient and occupies no space on the executable file:

sh_size This member gives the section’s size in bytes. Unless the sec- tion type is SHT_NOBITS , the section occupies sh_size bytes in the file. A section of type SHT_NOBITS may have a non-zero size, but it occupies no space in the file.

Then it is up to the Linux kernel to zero out that memory region when loading the program into memory when it gets started.


That depends. If that definition is global (outside any function) then num will be initialized to zero. If it's local (inside a function) then its value is indeterminate. In theory, even attempting to read the value has undefined behavior -- C allows for the possibility of bits that don't contribute to the value, but have to be set in specific ways for you to even get defined results from reading the variable.

like image 29
Jerry Coffin Avatar answered Sep 24 '22 20:09

Jerry Coffin