
What is stored in this 26KB executable?

Compiling this code with -O3:

#include <iostream>
int main(){std::cout<<"Hello World"<<std::endl;}

results in a file with a length of 25,890 bytes. (Compiled with GCC 4.8.1)

Can't the compiler just store two calls to write(STDOUT_FILENO, ???, strlen(???));, store write's contents, store the string, and boom, write it to the disk? By my estimate, that should result in an EXE under 1,024 bytes long.
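A minimal sketch of the hand-written equivalent described above, assuming a POSIX-style environment where <unistd.h> declares write and STDOUT_FILENO. The helper name write_hello is made up for illustration, and this is not what any compiler actually emits for the iostream version:

```cpp
#include <unistd.h>   // write, STDOUT_FILENO, ssize_t (POSIX; an assumption)
#include <cstring>    // std::strlen

// Hypothetical helper: emit the greeting with two raw write() calls,
// one for the text and one for the newline that std::endl would add.
// Returns the total number of bytes written, or -1 if either call fails.
ssize_t write_hello(int fd) {
    const char* text = "Hello World";
    const char* newline = "\n";

    const ssize_t first = write(fd, text, std::strlen(text));
    if (first < 0) return -1;

    const ssize_t second = write(fd, newline, std::strlen(newline));
    if (second < 0) return -1;

    return first + second;
}
```

Calling write_hello(STDOUT_FILENO) from main would print the greeting without touching iostream.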

Compiling a hello world program in assembly results in a 17-byte file (https://stackoverflow.com/questions/284797/hello-world-in-less-than-17-bytes), which means the actual code is 5 bytes long. (The string is Hello World\0)

What does that EXE store besides the actual main and the functions it calls?

NOTE: This question applies to MSVC too.


Edit:
A lot of users pointed at iostream as being the culprit, so I tested this hypothesis and compiled this program with the same parameters:

int main( ) {
}

And got 23,815 bytes, so the hypothesis has been disproved.

LyingOnTheSky asked Jun 12 '15 13:06


1 Answer

By default, the compiler generates a complete PE-conformant executable. Assuming a release build, the output for the simple code you posted probably includes:

  • All the PE headers and tables needed by the loader (e.g. the IAT); this also means alignment requirements have to be met
  • CRT library initialization code
  • Debugging info (you need to strip this manually, even for a release build)
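To make the alignment point concrete, here is a small sketch of how padding inflates even a tiny payload. It assumes the PE defaults of a 512-byte file alignment and a 4096-byte section alignment (a linker can be told to use other values); the names align_up, kFileAlignment and kSectionAlignment are illustrative only:

```cpp
#include <cstdint>

// Round `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two for the bit mask to be valid.
constexpr std::uint32_t align_up(std::uint32_t size, std::uint32_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Assumed defaults, not read from any real executable.
constexpr std::uint32_t kFileAlignment    = 512;   // granularity of raw section data on disk
constexpr std::uint32_t kSectionAlignment = 4096;  // granularity of sections once mapped

// Five bytes of code still occupy a full aligned block on disk and in memory.
static_assert(align_up(5, kFileAlignment) == 512, "5 bytes pad to one file block");
static_assert(align_up(5, kSectionAlignment) == 4096, "5 bytes pad to one page");
```

Under these assumptions each section costs at least 512 bytes on disk, so a handful of nearly empty sections plus the headers already adds up to a few kilobytes.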

If the compiler were MSVC, there would be additional inclusions:

  • Manifest XML and relocation data
  • Results of default compiler options that favor speed over size

The link you posted does contain a very small assembly "hello world" program, but in order to run properly in a Windows environment, at least a complete and valid PE structure needs to be available to the loader (setting aside all the low-level issues that might prevent that code from running at all).

Only once the loader has correctly set up the process in which that code will run could you map it into a PE section and do

jmp small_hello_world_entry_point

to actually execute the code.
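As a rough illustration of the structure the loader expects before any code runs, the sketch below performs only the first sanity checks on a file image: the "MZ" DOS header, the e_lfanew field at offset 0x3C, and the "PE\0\0" signature followed by room for the 20-byte COFF header. The real loader validates far more than this, and the function names are made up for the example:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value; the caller must have checked bounds.
static std::uint32_t read_u32(const std::vector<unsigned char>& b, std::size_t off) {
    return  std::uint32_t(b[off])
         | (std::uint32_t(b[off + 1]) << 8)
         | (std::uint32_t(b[off + 2]) << 16)
         | (std::uint32_t(b[off + 3]) << 24);
}

// Returns true if `image` begins with a DOS header whose e_lfanew field
// points at a PE signature with space for a COFF file header after it.
// This is a sanity check only, not a full validation of the PE format.
bool looks_like_pe(const std::vector<unsigned char>& image) {
    const std::size_t kDosHeaderSize  = 0x40;  // e_lfanew occupies 0x3C..0x3F
    const std::size_t kSignatureSize  = 4;     // "PE\0\0"
    const std::size_t kCoffHeaderSize = 20;

    if (image.size() < kDosHeaderSize) return false;
    if (image[0] != 'M' || image[1] != 'Z') return false;

    const std::size_t pe = read_u32(image, 0x3C);
    if (pe > image.size()) return false;
    if (image.size() - pe < kSignatureSize + kCoffHeaderSize) return false;

    return image[pe] == 'P' && image[pe + 1] == 'E' &&
           image[pe + 2] == 0 && image[pe + 3] == 0;
}
```

A 17-byte blob fails the very first size check here, because it is too short to even contain the e_lfanew field.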

References: The PE format

One last note: UPX and similar compression tools can also be used to reduce the file size of executables.

Marco A. answered Oct 20 '22 05:10