
Why do C and C++ compilers place explicitly initialized and default initialized global variables in different segments?

I was reading this great post about the memory layout of C programs. It says that default initialized global variables reside in the BSS segment, and that if you explicitly provide a value for a global variable, it will reside in the data segment.

I've tested the following programs in C and C++ to examine this behaviour.

#include <iostream>
// Both i and s have static storage duration
int i;     // no initializer: zero-initialized (value 0), expected in the BSS segment
int s(5);  // explicit initializer: expected in the data segment
int main()
{
    std::cout << &i << ' ' << &s << '\n';
}

Output:

0x488020 0x478004

So, from the output it clearly looks like the variables i and s reside in completely different segments. But if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.

Output:

0x488020 0x488024

So, from the output it clearly looks like both variables i and s reside in the same segment (in this case BSS).

This behaviour is also the same in C.

#include <stdio.h>
int i;      // no initializer: zero-initialized (value 0), expected in the BSS segment
int s=5;    // explicit initializer: expected in the data segment
int main(void)
{
    printf("%p %p\n",(void*)&i,(void*)&s);
}

Output:

004053D0 00403004

So, again, judging by the output (that is, by the addresses of the variables), i and s reside in completely different segments. But again, if I remove the initializer (the initial value 5 in this program) from the variable s and then run the program, it gives me the output below.

Output:

004053D0 004053D4

So, from the output it clearly looks like both variables i and s reside in the same segment (in this case BSS).

Why do C and C++ compilers place explicitly initialized and default initialized global variables in different segments? Why does it matter for placement whether a global variable is default initialized or explicitly initialized? If I am not wrong, the C and C++ standards never talk about the stack, heap, data segment, code segment, BSS segment and similar things, which are all implementation-specific. So, is it possible for a C++ implementation to store explicitly initialized and default initialized variables in the same segment instead of keeping them in different segments?

Destructor asked Dec 19 '15 05:12



3 Answers

Neither C nor C++ has any notion of "segments", and not all OSs do either, so the answer to your question inevitably depends on the platform and compiler.

That said, common implementations do treat initialized and uninitialized variables differently. The main difference is that uninitialized (or default zero-initialized) data does not have to be saved in the compiled module; it only has to be declared, or reserved, for use at run time. In practical "segment" terms, initialized data is saved to disk as part of the binary, while uninitialized data is not. Instead, memory for it is allocated at startup to satisfy the declared "reservations".
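As a rough illustration of that difference, here is a minimal C sketch. The names reserved_only, stored_in_file and TABLE_BYTES are made up for this example, and where each array actually ends up is still up to the toolchain.

```c
#include <stddef.h>

#define TABLE_BYTES 65536u   /* arbitrary size, chosen only for illustration */

/* No initializer: the object is zero-initialized. A typical toolchain only
   records its size, so it adds (almost) nothing to the file on disk. */
unsigned char reserved_only[TABLE_BYTES];

/* Non-zero initializer: a typical toolchain has to store all TABLE_BYTES
   bytes in the binary, even though only the first one is non-zero. */
unsigned char stored_in_file[TABLE_BYTES] = { 1 };

/* Counts the non-zero bytes in a buffer. */
static size_t count_nonzero(const unsigned char *p, size_t n)
{
    size_t count = 0;
    for (size_t k = 0; k < n; ++k)
        if (p[k] != 0)
            ++count;
    return count;
}

size_t reserved_nonzero(void)
{
    return count_nonzero(reserved_only, sizeof reserved_only);
}

size_t stored_nonzero(void)
{
    return count_nonzero(stored_in_file, sizeof stored_in_file);
}
```

At run time both arrays simply hold their values: reserved_nonzero() returns 0 and stored_nonzero() returns 1. The difference shows up in the file. On toolchains that ship the binutils size tool, running it on a program built from this code would typically list the first array under bss and the second under data, though that is an observation about common implementations, not something the language promises.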

dxiv answered Oct 19 '22 06:10


The really short answer is "because it takes up less space". (As noted by others, the compiler doesn't have to do this!)

In the executable file, the data section contains the value of each initialized variable, stored at that variable's position. This means that for every byte of initialized data, the data section contains one byte.

For zero-initialized globals, there is no reason to store a lot of zeros. Instead, the file stores just the size of the whole set of data as a single value. So instead of storing 4132 bytes of zeros in the data section, there is just a "BSS is 4132 bytes long" record, and it is up to the OS or runtime to make sure that memory is zero. In some cases, the compiler's runtime will call memset(BSSStart, 0, BSSSize) or similar. On Linux, for example, newly allocated memory is zero-filled anyway when it is handed to the process, so setting BSS to zero is just a matter of allocating the memory in the first place.

And of course, shorter executable files have several benefits: Less space taken up on your hard-disk, faster loading time [extra bonus if the OS pre-fills the allocated memory with zero], faster compile time as the compiler/linker doesn't have to write the data to disk.

So there is an entirely practical reason for this.
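To make the "store only the size" idea concrete, here is a toy model in C. It is not a real loader: struct image, load_image and image_file_bytes are hypothetical names invented for this sketch, and real formats such as ELF or PE keep the equivalent information in their own headers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified "executable image". */
struct image {
    const unsigned char *data;  /* initialized bytes, stored in the file  */
    size_t data_size;           /* how many bytes are stored in the file  */
    size_t bss_size;            /* only a count: no zero bytes are stored */
};

/* Builds the in-memory copy: the data bytes first, then bss_size zero bytes.
   Returns NULL on allocation failure; the caller frees the result. */
unsigned char *load_image(const struct image *img)
{
    size_t total = img->data_size + img->bss_size;
    unsigned char *mem = malloc(total ? total : 1);
    if (mem == NULL)
        return NULL;
    if (img->data_size > 0)
        memcpy(mem, img->data, img->data_size);      /* copied from the file */
    memset(mem + img->data_size, 0, img->bss_size);  /* the memset mentioned above */
    return mem;
}

/* Bytes this toy file format needs: the data itself plus one size field.
   The BSS contributes the same amount no matter how large it is. */
size_t image_file_bytes(const struct image *img)
{
    return img->data_size + sizeof img->bss_size;
}
```

With 4 bytes of data and a bss_size of 4132, load_image() hands back 4136 usable bytes, while image_file_bytes() reports only 4 plus the size of one size_t. An OS that hands out pre-zeroed pages can skip the memset entirely, which is the "extra bonus" case above.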

Mats Petersson answered Oct 19 '22 07:10


Strictly speaking, BSS is not a separate segment; it is a part of the data segment.

In C and C++, statically allocated objects without an explicit initializer are initialized to zero. An implementation may also place statically allocated variables and constants in the BSS section when their initializer consists solely of zero-valued bits.

The reason to store them in BSS is that variables with no initializer or an all-zero value can be set up at run time without taking up space in the binary file, unlike the variables placed in the data segment.
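The zero-initialization rule described above can be checked directly, whatever section the objects end up in. A small C sketch follows; all identifiers are made up for illustration.

```c
#include <stddef.h>

struct point {
    int x;
    double y;
    const char *label;
};

/* Static storage duration and no initializer: the language guarantees
   zero, 0.0 and null pointer values before main() starts. */
int counter;
double ratio;
const char *title;
struct point origin;
static int calls_seen;   /* internal linkage, same guarantee */

/* Explicitly zero: same value as above. An implementation may place
   this in BSS as well, but it does not have to. */
int explicit_zero = 0;

/* Non-zero initializer: the 5 has to be stored somewhere in the binary,
   typically in the data section. */
int explicit_five = 5;

/* Returns 1 if every object without a non-zero initializer reads as zero. */
int all_default_zero(void)
{
    return counter == 0 && ratio == 0.0 && title == NULL &&
           origin.x == 0 && origin.y == 0.0 && origin.label == NULL &&
           calls_seen == 0 && explicit_zero == 0;
}
```

all_default_zero() returns 1 on any conforming implementation, because the guarantee comes from the language and not from the BSS section. BSS is just one cheap way for an implementation to deliver it.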

masoud answered Oct 19 '22 08:10