I would really appreciate if somebody could tell me why compilation of this program: <pre class="prettyprint"><code>double data[123456789]; int main() {} </code></pre> takes 10 times longer then compilation of this one: <pre class="prettyprint"><code>int main() { double* data=new double[123456789]; } </code></pre> when both are compiled with: <pre class="prettyprint"><code>$ g++ -O0 </code></pre> and the executables are almost the same size. I am using gcc 4.4.3 on Ubuntu 10.04. Thank you.

Dynamic Allocation Your second program allocates memory at run-time; from the compiler's perspective, there is no real difference between compiling any of the following: <pre class="prettyprint"><code>double *data = new double[123456789]; double *data = malloc(123456789); double data = sqrt(123456789); </code></pre> They all do different things, but all the compiler needs to do is generate a call to an external function with a fixed argument. You can see this if you use <code>g++ -S</code> to generate assembly: <pre class="prettyprint"><code>.text main: subq $8, %rsp /* Allocate stack space. */ movl $987654312, %edi /* Load value "123456789 * 8" as argument. */ call _Znam /* Call the allocation function. */ xorl %eax, %eax /* Return 0. */ addq $8, %rsp /* Deallocate stack space. */ ret </code></pre> This is simple for any compiler to generate, and any linker to link. Static Allocation Your first program is a little tricker, though, as you have noticed. If we have a look at the assembly for it, we see something different going on: <pre class="prettyprint"><code>.text main: xorl %eax, %eax /* Return 0. */ ret .bss data: .zero 987654312 /* Reserve "123456789 * 8" bytes of space. */ </code></pre> The generated assembly asks for <code>123456789 * sizeof(double)</code> bytes of space to be reserved when the program is first started. When this is assembled and later linked (which happens behind the scenes is you just run <code>g++ foo.c</code>), the linker <code>ld</code> will actually allocate all of this reserved space in memory. This is where the time is going. If you run <code>top</code> while <code>g++</code> is running, you will see <code>ld</code> suck up a large amount of your system's memory. Reduced Executable Sizes A reasonable question might be "Why doesn't my executable end up really large, if the memory is being reserved at link-time?". The answer is hidden in the <code>.bss</code> marker in the assembly. This tells the linker that the data defined below it should not be stored in the final executable, but instead allocated to zero at run-time. This leaves us with the follow series of steps: <ol> <li>The assembler tells the linker that it needs to create a section of memory that is 1GB long.</li> <li>The linker goes ahead and allocates this memory, in preparation for placing it in the final executable.</li> <li>The linker realizes that this memory is in the <code>.bss</code> section and is marked <code>NOBITS</code>, meaning that the data is just 0, and doesn't need to be physically placed into the final executable. It avoids writing out the 1GB of data, instead just throwing the allocated memory away.</li> <li>The linker writes out to the final ELF file just the compiled code, producing a small executable.</li> </ol> A smarter linker might be able to avoid steps 2 and 3 above, making your compile time much faster. In reality, scenarios such as yours don't come up often enough in practice to make such an optimization worthwhile. Dynamic versus Static Allocation If you are trying to work out which of the above (dynamic versus static allocation) to actually use in your program, here are a few thoughts: <ul> <li>The linker will need to use as much memory as your final program (plus a bit). If you want to statically allocate 4GB of RAM, you will need 4GB of RAM for your linker. This isn't implicit in how linkers work, but rather just seems to be just the way they are implemented.</li> <li>When allocating large amounts of memory, doing it dynamically allows you to do better error handling (displaying a user-friendly message on screen explaining you don't have enough memory, instead of just failing to load the executable with a message from the OS going to the user);</li> <li>Dynamically allocating memory allows you to chose how much memory to allocate based on the your actual needs. (If <code>data</code> is a cache, the user can select the cache size, if it is storing intermediate results, you can size it based on the problem, etc.)</li> <li>Dynamically allocating memory allows you to free it later on, if your program needs to go on and do more work after it is finished with the memory.</li> </ul> In the end, if the above points don't matter and you can deal with longer compile times, it probably doesn't matter too much. Statically allocating memory can be much simpler, and is often the right approach for smaller amounts of memory or throw-away applications.

Long compilation time for program with static allocation

Tags:

gcc

I would really appreciate if somebody could tell me why compilation of this program:

double data[123456789];  
int main() {}

takes 10 times longer then compilation of this one:

int main() {
    double* data=new double[123456789];
}

when both are compiled with:

$ g++ -O0

and the executables are almost the same size.

I am using gcc 4.4.3 on Ubuntu 10.04.

Thank you.

244

asked Feb 12 '11 14:02

Predrag

1 Answers

Dynamic Allocation

Your second program allocates memory at run-time; from the compiler's perspective, there is no real difference between compiling any of the following:

double *data = new double[123456789];
double *data = malloc(123456789);
double data  = sqrt(123456789);

They all do different things, but all the compiler needs to do is generate a call to an external function with a fixed argument. You can see this if you use g++ -S to generate assembly:

.text
main:
    subq    $8, %rsp          /* Allocate stack space. */
    movl    $987654312, %edi  /* Load value "123456789 * 8" as argument. */
    call    _Znam             /* Call the allocation function. */
    xorl    %eax, %eax        /* Return 0. */
    addq    $8, %rsp          /* Deallocate stack space. */
    ret

This is simple for any compiler to generate, and any linker to link.

Static Allocation

Your first program is a little tricker, though, as you have noticed. If we have a look at the assembly for it, we see something different going on:

.text
main:
    xorl    %eax, %eax        /* Return 0. */
    ret

.bss
data:
    .zero   987654312         /* Reserve "123456789 * 8" bytes of space. */

The generated assembly asks for 123456789 * sizeof(double) bytes of space to be reserved when the program is first started. When this is assembled and later linked (which happens behind the scenes is you just run g++ foo.c), the linker ld will actually allocate all of this reserved space in memory. This is where the time is going. If you run top while g++ is running, you will see ld suck up a large amount of your system's memory.

Reduced Executable Sizes

A reasonable question might be "Why doesn't my executable end up really large, if the memory is being reserved at link-time?". The answer is hidden in the .bss marker in the assembly. This tells the linker that the data defined below it should not be stored in the final executable, but instead allocated to zero at run-time.

This leaves us with the follow series of steps:

The assembler tells the linker that it needs to create a section of memory that is 1GB long.
The linker goes ahead and allocates this memory, in preparation for placing it in the final executable.
The linker realizes that this memory is in the .bss section and is marked NOBITS, meaning that the data is just 0, and doesn't need to be physically placed into the final executable. It avoids writing out the 1GB of data, instead just throwing the allocated memory away.
The linker writes out to the final ELF file just the compiled code, producing a small executable.

A smarter linker might be able to avoid steps 2 and 3 above, making your compile time much faster. In reality, scenarios such as yours don't come up often enough in practice to make such an optimization worthwhile.

Dynamic versus Static Allocation

If you are trying to work out which of the above (dynamic versus static allocation) to actually use in your program, here are a few thoughts:

The linker will need to use as much memory as your final program (plus a bit). If you want to statically allocate 4GB of RAM, you will need 4GB of RAM for your linker. This isn't implicit in how linkers work, but rather just seems to be just the way they are implemented.
When allocating large amounts of memory, doing it dynamically allows you to do better error handling (displaying a user-friendly message on screen explaining you don't have enough memory, instead of just failing to load the executable with a message from the OS going to the user);
Dynamically allocating memory allows you to chose how much memory to allocate based on the your actual needs. (If data is a cache, the user can select the cache size, if it is storing intermediate results, you can size it based on the problem, etc.)
Dynamically allocating memory allows you to free it later on, if your program needs to go on and do more work after it is finished with the memory.

In the end, if the above points don't matter and you can deal with longer compile times, it probably doesn't matter too much. Statically allocating memory can be much simpler, and is often the right approach for smaller amounts of memory or throw-away applications.

187

answered Oct 19 '22 13:10

davidg

Related questions
                            
                                How does gcc find as, ld and other binutils executables?
                            
                                How can I run GCC/Clang for static analysis? (warnings only)
                            
                                Why does std::async copy its const & arguments?
                            
                                Inline static data causes a section type conflict
                            
                                Why does GCC need extra declarations in templates when VS does not?
                            
                                Effects of __attribute__((packed)) on nested array of structures?
                            
                                How can I deploy a C++11 program (with dependencies) on CentOS 6, whose GCC is C++03?
                            
                                How to include all objects of an archive in a shared object?
                            
                                return a scope in C++
                            
                                Speed of accessing local vs. global variables in gcc/g++ at different optimization levels
                            
                                Why would a compiler generate this assembly?
                            
                                static struct initialization in c99
                            
                                __STDC_LIB_EXT1__ availability in gcc and clang
                            
                                Weird uninitialized const member behavior
                            
                                Converting typeof to string
                            
                                How to cross compile from Mac OS X to Linux x86?
                            
                                Adding a Preprocessor Definition via a Xcode Scheme
                            
                                What is the difference in gcc between lto and fat-lto-objects
                            
                                Why am I able to perform floating point operations inside a Linux kernel module?
                            
                                "Assume" clause in gcc

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With