 

Why does GCC not assign the static variable when it is initialized to 0

I initialize a static variable to 0, but when I look at the assembly code, I find that only memory is allocated for the variable; no value is assigned.
When I initialize the static variable to another number, the memory is assigned a value.
I wonder whether GCC assumes the OS will initialize the memory to 0 before the program uses it.

The GCC option I use is "gcc -m32 -fno-stack-protector -c -o"

When I initialize the static variable to 0, the C code and the assembly code are:

    static int temp_front=0;

    .local  temp_front.1909
    .comm   temp_front.1909,4,4

When I initialize it to another number, the code is:

    static int temp_front=1;

    .align 4
    .type   temp_front.1909, @object
    .size   temp_front.1909, 4
temp_front.1909:
    .long   1
YD Zhou asked Aug 18 '19 10:08



2 Answers

TL;DR: GCC knows the BSS is guaranteed to be zero-initialized on the platform it's targeting, so it puts zero-initialized static data there.

Big picture

The program loader of most modern operating systems gets two different sizes for each part of the program, like the data part. The first size it gets is the size of data stored in the executable file (like a PE/COFF .EXE file on Windows or an ELF executable on Linux), while the second size is the size of the data part in memory while the program is running.

If the data size for the running program is bigger than the amount of data stored in the executable file, the remaining part of the data section is filled with zero bytes. In your program, the .comm line tells the linker to reserve 4 bytes without storing an initial value for them in the file; the OS then zero-initializes them at program start.

What does gcc do?

gcc (like most C compilers targeting such platforms) allocates zero-initialized variables with static storage duration in the .bss section. Everything allocated in that section will be zero-initialized on program startup. For allocation, it uses the .comm directive (together with .local, because the variable is static), which specifies only the size (4 bytes) and the alignment, not a value.

You can see the size of the main section types (code, data, bss) using the size command. If you initialize the variable with one, it is included in a data section, and occupies 4 bytes there. If you initialize it with zero (or not at all), it is instead allocated in the .bss section.
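The three cases can be put side by side in a small translation unit. This is only a sketch; the variable and function names are made up for illustration. C guarantees that an object with static storage duration and no initializer starts out as zero, which is why the first two variables can both live in .bss.

```c
/* Explicitly zero: GCC typically places this in .bss. */
static int zero_init = 0;

/* No initializer: zero by the C rules for static storage, also .bss. */
static int no_init;

/* Non-zero initializer: the value 1 has to be stored in .data. */
static int one_init = 1;

/* Sum of the two variables that need no bytes in the object file. */
int bss_sum(void)
{
    return zero_init + no_init;
}

/* Value of the variable whose initializer is stored in the file. */
int data_value(void)
{
    return one_init;
}

/* Modify all three so the compiler cannot treat them as constants. */
void bump(void)
{
    zero_init++;
    no_init++;
    one_init++;
}
```

Compiling this with -c and running size on the object file should show the first two variables counted under bss and the third under data, though the exact numbers depend on the compiler version and options.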

What does ld do?

ld merges all data-type sections of all object files (even those from static libraries) into one data section, followed by all .bss-type sections. The executable output contains a simplified view for the operating system's program loader. For ELF files, this is the "program header". You can take a look at it using objdump -p for any format, or readelf -l for ELF files.

The program headers consist of entries of different types. Among them are a couple of entries of type PT_LOAD describing the "segments" to be loaded by the operating system. One of these PT_LOAD entries is for the data area (where the .data section is linked). It contains a field called p_filesz that specifies how many bytes for initialized variables are provided in the ELF file, and a field called p_memsz telling the loader how much space in the address space should be reserved. The details of which sections get merged into which PT_LOAD entries differ between linkers and depend on command line options, but generally you will find a PT_LOAD entry that describes a region that is both readable and writeable, but not executable, and has a p_filesz value that is smaller than the p_memsz value (potentially zero if there's only a .bss and no .data section). p_filesz is the size of all read+write data sections, whereas p_memsz is bigger to also provide space for zero-initialized variables.

The amount by which p_memsz exceeds p_filesz is the sum of all .bss sections linked into the executable. (The values might be off a bit due to alignment to pages or disk blocks.)

See chapter 5 in the System V ABI specification, especially pages 5-2 and 5-3 for a description of the program header entries.
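The relationship between the two sizes can be written down as a small helper. The struct below is a hypothetical, cut-down stand-in that keeps only the fields discussed here; a real Elf32_Phdr has more members, and demo_phdr, DEMO_PT_LOAD and zero_fill_bytes are names invented for this sketch.

```c
#include <stdint.h>

/* Hypothetical, reduced program header entry (not the real Elf32_Phdr). */
struct demo_phdr {
    uint32_t p_type;    /* entry type, e.g. PT_LOAD */
    uint32_t p_filesz;  /* bytes provided by the file */
    uint32_t p_memsz;   /* bytes reserved in the address space */
};

/* PT_LOAD has the value 1 in the ELF specification. */
#define DEMO_PT_LOAD 1u

/* Number of bytes at the end of the segment that have no backing data
   in the file and therefore must read as zero after loading. */
uint32_t zero_fill_bytes(const struct demo_phdr *ph)
{
    if (ph->p_type != DEMO_PT_LOAD || ph->p_memsz <= ph->p_filesz)
        return 0;
    return ph->p_memsz - ph->p_filesz;
}
```

For an entry with p_filesz = 0x120 and p_memsz = 0x128, the helper reports 8 bytes, which would correspond to, for example, two 4-byte variables in .bss (ignoring alignment padding).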

What does the operating system do?

The Linux kernel (or another ELF-compliant kernel) iterates over all entries in the program header. For each entry of type PT_LOAD, it allocates virtual address space. It associates the beginning of that address space with the corresponding region in the executable file, and if the space is writeable, it enables copy-on-write.

If p_memsz exceeds p_filesz, the kernel arranges the remaining address space to be completely zeroed out. So the variable that got allocated in the .bss section by gcc ends up in the "tail" of the read-write PT_LOAD entry in the ELF file, and the kernel provides the zero.

Any whole pages that have no backing data can start out copy-on-write mapped to a shared physical page of zeros.
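The effect of loading one such segment can be modelled in user space. This is a deliberately simplified sketch: a real kernel maps pages rather than copying bytes, and load_segment is a made-up name, not a kernel function.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model of loading one PT_LOAD segment.
   dst must have room for memsz bytes; file_bytes holds filesz bytes.
   Returns 0 on success, -1 if filesz is larger than memsz. */
int load_segment(unsigned char *dst, const unsigned char *file_bytes,
                 size_t filesz, size_t memsz)
{
    if (filesz > memsz)
        return -1;

    /* The part backed by the file: initialized data (.data). */
    memcpy(dst, file_bytes, filesz);

    /* The tail with no backing data: zero-filled (.bss). */
    memset(dst + filesz, 0, memsz - filesz);
    return 0;
}
```

A variable such as temp_front from the question would sit somewhere in the zero-filled tail, so it reads as 0 without any value ever having been stored in the file.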

Michael Karcher answered Sep 24 '22 08:09



Why does GCC not assign ...

Most modern OSs will automatically zero-initialize the BSS section.

On such an OS, an "uninitialized" variable is identical to a variable that is initialized to zero.

However, there is one difference: The data of uninitialized variables is not stored in the resulting object and executable files; the data of initialized variables is.

This means that "real" zero-initialized variables may lead to a larger file size compared to uninitialized variables.

For this reason, the compiler prefers to treat variables as "uninitialized" when they are explicitly zero-initialized.
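The file size effect is easiest to see with large arrays. The following is a sketch with invented names (DEMO_SIZE, zero_buf, data_buf); the sizes actually observed depend on the toolchain and options.

```c
#include <stddef.h>

#define DEMO_SIZE (1024 * 1024)

/* All zero: only the size needs to be recorded, so GCC by default
   places this in .bss and stores no bytes for it in the file. */
char zero_buf[DEMO_SIZE] = {0};

/* One non-zero byte: the full 1 MiB initializer goes into .data
   and is stored in the object file. */
char data_buf[DEMO_SIZE] = {1};

/* Both arrays occupy the same amount of memory at run time. */
size_t demo_size(void)
{
    return sizeof zero_buf;
}
```

On a typical ELF toolchain one would expect the object file to grow by roughly 1 MiB because of data_buf, while zero_buf adds almost nothing, even though both take up the same space once the program is running.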

The GCC option I use is ...

Of course there are also operating systems which do not automatically initialize "uninitialized" memory to zero.

As far as I remember, Windows 95 is an example of this.

If you want to compile for such an operating system, you may use the GCC command line option -fno-zero-initialized-in-bss. This command line option forces GCC to "really" zero-initialize variables that are zero-initialized.

I just compiled your code with that command line option; the output looks like this:

    .data
    .align 4
    .type     temp_front, @object
    .size     temp_front, 4
temp_front:
    .zero  4
Martin Rosenau answered Sep 25 '22 08:09
