Why are generated binaries so large?

Tags: c++, size, binaries

Why are the binaries that are generated when I compile my C++ programs so large (as in easily 10 times the size of the source code files)? What advantages does this offer over interpreted languages for which such compilation is not necessary (and thus the program size is only the size of the code files)?

wrongusername asked May 09 '11 06:05

1 Answer

Modern interpreted languages typically do compile the code to some internal representation for faster execution. It might not get written out to disk, but there's certainly no guarantee that the program is represented in a more compact form. Some implementations go the whole hog and generate machine code anyway (e.g. a Java JIT compiler). Then there's the interpreter itself sitting in memory, which can be large.

A few points:

  • The more sophisticated the commands in the source code, the more machine code operations might be required to execute them. Thus, higher level language features tend to have a higher ratio of compiled-code to source code. That's not necessarily a bad thing: think of it as "I only have to say a little about what I want done and it infers all those necessary steps". The challenge in programming is to ensure they are necessary - that requires good library and program design.
  • The compiler often deliberately decides to trade some executable size for faster expected execution speed: inline vs out-of-line code is part of this compromise, though for small functions neither may be consistently more compact.
  • More sophisticated run-time environments (e.g. adding support for C++ exceptions) can involve a bit of extra code that runs when the program first starts to construct the necessary environment for that language feature.
  • Library features may not be comparable. As well as the sort of add-on libraries you're very likely to have had to track down yourself and be very aware of using (e.g. XML, PDF parsing, OpenGL), languages often quietly use supporting libraries for what seem like language features and functions. Any of these can be surprisingly large.
    • For example, many interpreters just expose the C library's printf() function or something similar, while for output formatting C++ has ostream - a more complex, extensible and type-safe system with (for better or worse) persistent state across function calls, routines to query and set that state, an additional layer of customisable buffering, customisable character types and localisation, and generally a lot of small inline functions that can lead to smaller or larger programs depending on the exact use and compiler settings. What's best depends on your application and memory vs performance goals.
  • Inbuilt language statements may be compiled differently. Say you have a switch on an integer expression with 100 case labels spread randomly between 1 and 1000: one compiler/language might decide to "pack" the 100 cases and do a binary search for a match, another might use a sparsely populated array of 1000 elements and do direct indexing (which wastes space in the executable but typically makes for faster code). So, it's hard to draw conclusions based on executable size.

Typically, memory usage and execution speed become increasingly important as a program gets larger and more complex. You rarely see systems like operating systems, enterprise web servers or full-featured commercial word processors written primarily in interpreted languages, because interpreted implementations tend not to scale as well in memory use and speed.

Tony Delroy answered Oct 06 '22 17:10