I created a file containing the following line:
int main() { return 0; }
After compiling this, I was surprised to find out that the binary for this simple program is 8328 bytes! What is going on here, and what in the world is the binary doing in those 8328 bytes? Surely this program can be expressed in just a few lines of assembly.
Note: I compiled this with the following line:
g++ main.cpp
My g++ version is g++ (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1
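There's a lot in that binary:
- metadata describing the file (run file on it)
- symbols, which the strip tool will remove for you (or link with gcc -s)
- references to the shared libraries it needs (see the ldd and strings tools)
- startup code that sets up argc and argv, then calls main
- exit code that passes main's return value to the operating system
For comic effect, try linking that program statically; the binary will then include the functions that would normally be loaded from shared libraries (DLLs, in Windows terms) at run time. (Static linking does, however, simplify deployment.)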
Do a binary dump of the resulting file and check it out!
It's mostly empty space. Data in the binary are organized into pages (commonly, 4096 or 8192 bytes in size). That's so pages can be memory mapped efficiently. Typically the first page contains instructions on how to load the binary - code is at this position in the file and gets mapped to this location, same for data, etc. The second page will probably be your code, and the third page will contain symbols and debugging information. Each page is probably mostly empty.
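The page arithmetic and the "mostly empty" claim above can be checked with a small sketch like the one below. The helper names (pages_spanned, zero_fraction, read_file) are made up for illustration, and the page sizes are just the two mentioned above; the actual layout depends on your linker.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Number of fixed-size pages needed to hold file_size bytes (rounds up).
std::size_t pages_spanned(std::size_t file_size, std::size_t page_size) {
    assert(page_size > 0);
    return (file_size + page_size - 1) / page_size;
}

// Fraction of bytes that are zero; 0.0 for an empty buffer.
double zero_fraction(const std::vector<unsigned char>& bytes) {
    if (bytes.empty()) return 0.0;
    std::size_t zeros = 0;
    for (unsigned char b : bytes) {
        if (b == 0) ++zeros;
    }
    return static_cast<double>(zeros) / static_cast<double>(bytes.size());
}

// Reads a whole file into memory; yields an empty vector if it cannot be opened.
std::vector<unsigned char> read_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}
```

With 4096-byte pages, pages_spanned(8328, 4096) is 3, which fits the three-page picture described above. Calling zero_fraction(read_file("a.out")) from main gives the share of zero bytes in your own binary.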
Don't bother.
Try writing a less trivial program and you will find that the size is not much different; the fixed overhead dominates until your own code grows to hundreds of kilobytes.
Briefly: parts of the standard library form the "infrastructure" between the OS modules and the C++ semantics. They manage the startup and termination of the program: everything that initializes and destroys global variables, the standard input and output, and so on.
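To see that infrastructure at work, here is a minimal sketch of a global object whose constructor runs before main is entered and whose destructor runs after main returns. The names (Tracker, g_constructed, trackers_alive_at_main_entry) are hypothetical; the point is only that something outside your own main has to make these calls.

```cpp
#include <cassert>
#include <cstdio>

// Plain zero-initialized counter; it is set up before any constructor runs.
int g_constructed = 0;

struct Tracker {
    Tracker() { ++g_constructed; }
    ~Tracker() { std::puts("Tracker destroyed after main returned"); }
};

// Namespace-scope object: constructed during program startup, before main
// is entered, and destroyed during program termination.
Tracker g_tracker;

// Call this as the first statement of main: the counter is already 1,
// even though main itself never constructed a Tracker.
int trackers_alive_at_main_entry() {
    assert(g_constructed == 1);
    return g_constructed;
}
```

The destructor message only appears once main has returned, which is the termination half of the same machinery.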
Plus: everything that maps C++ symbols to memory addresses, so that a debugger can show the proper source code references during execution (unless you asked for it to be removed: try compiling with -O3 -s and without -g).
Also: because of the way memory is laid out, a binary is normally made up of fixed-size chunks. Your program's own code may be tiny, but at least one code segment, one initialized-data segment and one shared segment (for constant values) must still be present.
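As a rough illustration of those segments, the sketch below notes where each kind of definition usually ends up. The section names in the comments assume a typical GCC/ELF toolchain on Linux and are not guaranteed by the C++ language; all identifiers are invented for the example.

```cpp
#include <cassert>

// Read-only constant: usually placed in a read-only data section (.rodata).
const char kGreeting[] = "hello";

// Initialized, writable global: usually placed in .data, with its initial
// value stored in the file.
int g_counter = 42;

// Zero-initialized global: usually placed in .bss, which reserves memory
// at load time but takes no space in the file.
int g_scratch[1024];

// Machine code for functions: usually placed in .text.
int add(int a, int b) { return a + b; }

// Touches every object so that none of them is discarded as unused.
int touch_all() {
    assert(sizeof(g_scratch) == 1024 * sizeof(int));
    g_scratch[0] = g_counter;
    return add(g_scratch[0], kGreeting[0] == 'h' ? 1 : 0);
}
```

Even an empty main still gets the headers and alignment that go with this layout, which is part of the fixed cost.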