
Why is the executable so big? (Why isn't dead code removed?)

Tags: c++, c, visual-c++

Compiling and linking this file results in a 1-KiB executable:

#pragma comment(linker, "/Entry:mainCRTStartup") // No CRT code (reduce size)
#pragma comment(linker, "/Subsystem:Console")    // Needed if avoiding CRT

#define STRINGIFIER(x)    func##x
#define STRINGIFY(x)      STRINGIFIER(x)
#define G   int STRINGIFY(__COUNTER__)(void) { return __COUNTER__; }

int mainCRTStartup(void) { return 0; }  // Does nothing

#if 0
    // Every `G' generates a new, unused function
    G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G
    G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G G
#endif

When you change #if 0 to #if 1, the output size doubles to 2 KiB.

It seems to do this with all versions of Visual C++ to date, even though my command-line options contain all optimizations I could think of:

/Ox /MD /link /fixed /OPT:ICF /OPT:REF

and, specifically, I did not include any debugging information.

Does anyone know why /OPT:REF is not causing the linker to remove the unused functions?

asked Feb 23 '12 23:02 by user541686


1 Answer

In broad terms, the compiler generates code in "object records", each of which contains a bunch of assembly code and supporting information. The linker links these object records together to create an executable.

Often a compiler will create a single object record for an entire source file. In this case, the linker can only decide to link in the entire object record, or not. Since there is at least one function in the object record that is used, it must link in all of it.

On some compilers, you can tell the compiler to generate a separate object record for each function (an object file can have multiple object records). In this case, the linker can decide to omit any object records that are never referenced.

From the Microsoft documentation for /OPT:

/OPT:REF

LINK removes unreferenced packaged functions by default. An object contains packaged functions (COMDATs) if it has been compiled with the /Gy option. This optimization is called transitive COMDAT elimination. To override this default and keep unreferenced COMDATs in the program, specify /OPT:NOREF. You can use the /INCLUDE option to override the removal of a specific symbol.

The /Gy compiler option enables function-level linking.
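As a minimal sketch of what /Gy changes, assume a hypothetical file dead.c (the file name and function names below are illustrative, not taken from the question). Compiling it with and without /Gy and then inspecting the object file should show whether each function ended up in its own COMDAT section. As far as I know from the Microsoft documentation, /O2 implies /Gy but /Ox does not, which would explain why the command line in the question leaves the unused functions in place.

```c
/* dead.c (hypothetical file name)
 *
 * Compile only, without and then with function-level linking:
 *   cl /c /Ox dead.c
 *   cl /c /Ox /Gy dead.c
 * Then list the section headers of the object file:
 *   dumpbin /headers dead.obj
 * With /Gy, each function below is expected to be packaged as its own
 * COMDAT section, which /OPT:REF can drop if nothing references it.
 */

/* Never referenced from anywhere: candidates for removal. */
int unused_a(void) { return 1; }
int unused_b(void) { return 2; }

/* Assumed to be the only function the rest of the program calls. */
int used(void) { return 3; }
```

Without /Gy, all three functions would typically share one code section, so the linker has to keep or drop them together.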

For reference, this feature also exists in gcc:

-ffunction-sections
-fdata-sections

Place each function or data item into its own section in the output file if the target supports arbitrary sections. The name of the function or the name of the data item determines the section’s name in the output file.

Use these options on systems where the linker can perform optimizations to improve locality of reference in the instruction space. Most systems using the ELF object format and SPARC processors running Solaris 2 have linkers with such optimizations. AIX may have these optimizations in the future.

Only use these options when there are significant benefits from doing so. When you specify these options, the assembler and linker will create larger object and executable files and will also be slower. You will not be able to use "gprof" on all systems if you specify this option and you may have problems with debugging if you specify both this option and -g.

And the companion option in ld:

--gc-sections

Enable garbage collection of unused input sections. It is ignored on targets that do not support this option. This option is not compatible with -r or --emit-relocs. The default behaviour (of not performing this garbage collection) can be restored by specifying --no-gc-sections on the command line.
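The gcc and ld options quoted above can be tried on a small file. This is only a sketch under assumptions: gc.c and all the names in it are hypothetical, and the section names mentioned in the comment are what I would expect on an ELF target.

```c
/* gc.c (hypothetical file name)
 *
 * Compile only, then list the sections of the object file:
 *   gcc -c -O2 -ffunction-sections -fdata-sections gc.c
 *   objdump -h gc.o
 * On an ELF target the listing is expected to include
 * .text.live_total, .text.dead_helper and .data.dead_table.
 *
 * When linking a program that only calls live_total(), passing
 *   -Wl,--gc-sections
 * lets ld discard the unreferenced sections, and adding
 *   -Wl,--print-gc-sections
 * asks ld to list the sections it removed.
 */

/* Data referenced only by dead_helper(), so it can go with it. */
int dead_table[4] = {1, 2, 3, 4};

/* Never called by the rest of the (assumed) program. */
int dead_helper(void) { return dead_table[0]; }

/* Assumed to be the only function the program actually uses. */
int live_total(int a, int b) { return a + b; }
```

Because dead_table is reachable only through dead_helper(), removing the function's section also leaves the data section unreferenced, which is the transitive behaviour the /OPT:REF documentation describes for COMDATs.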

answered Sep 20 '22 13:09 by Greg Hewgill