
Why is my C++ binary built with LTO so very large?

I'm compiling some binaries on macOS, but the compiled size has become huge with a more recent compiler (up to ~20MB, from ~5MB before). I think it's related to LTO (link-time optimization), which was not activated before. I do not observe this file bloat on Linux.

After playing around with strip, which gave practically no reduction in size (I tried the Xcode one with flags -S -x and with no flags, and the GNU strip provided by the Homebrew binutils recipe with flag -s; all of these seem to have the same effect), I found this tool: https://github.com/google/bloaty (Bloaty McBloatface). When run on my binary, it produces this output:

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  53.9%  9.72Mi  53.8%  9.72Mi    __GNU_LTO,__wrapper_sects
  32.5%  5.86Mi  32.4%  5.86Mi    __GNU_DWARF_LTO,__debug_info
   6.2%  1.11Mi   6.2%  1.11Mi    __TEXT,__text
   2.2%   403Ki   2.2%   403Ki    __TEXT,__eh_frame
   1.6%   298Ki   1.6%   298Ki    __GNU_LTO,__wrapper_names
   1.0%   177Ki   1.0%   177Ki    Export Info
   0.7%   131Ki   0.7%   131Ki    Weak Binding Info
   0.4%  77.0Ki   0.4%  77.0Ki    __GNU_DWARF_LTO,__debug_str
   0.4%  75.8Ki   0.4%  75.8Ki    __DATA,__gcc_except_tab
   0.2%  44.6Ki   0.2%  44.6Ki    __GNU_LTO,__wrapper_index
   0.2%  39.4Ki   0.2%  39.4Ki    __DATA_CONST,__const
   0.2%  33.1Ki   0.2%  33.1Ki    __GNU_DWARF_LTO,__debug_abbrev
   0.1%  26.4Ki   0.1%  26.4Ki    __GNU_DWARF_LTO,__debug_line
   0.1%  21.7Ki   0.1%  23.6Ki    [20 Others]
   0.1%  19.0Ki   0.1%  19.0Ki    __TEXT,__text_cold
   0.1%  18.1Ki   0.1%  18.1Ki    __TEXT,__const
   0.0%  8.82Ki   0.0%  8.82Ki    __TEXT,__text_startup
   0.0%  8.60Ki   0.0%  8.60Ki    __TEXT,__cstring
   0.0%       0   0.0%  7.18Ki    __DATA,__pu_bss5
   0.0%       0   0.0%  6.88Ki    __DATA,__bss5
   0.0%  5.87Ki   0.0%  5.87Ki    __DATA,__la_symbol_ptr
 100.0%  18.1Mi 100.0%  18.1Mi    TOTAL

So can anyone tell me what these huge *_LTO sections are for, and how I can get rid of them, either by post-processing or by adding compilation flags to my build chain?

The OS is macOS and I'm using g++ 10; a full trace is here: https://github.com/yanntm/testGithbuActions/runs/1778387086?check_suite_focus=true

I'm trying to link statically as much as possible for better portability. The binary, however, is still dynamically linked to /usr/lib/libSystem.B.dylib (apparently I can't statically link this one with libtool).

I don't want any debug symbols as this is a production binary meant for end-users.

Yann TM asked Jan 27 '21 18:01


1 Answer

You will find the answer in gcc's documentation:

Link time optimization is implemented as a GCC front end for a bytecode representation of GIMPLE that is emitted in special sections of .o files.

[ ... ]

Since GIMPLE bytecode is saved alongside final object code, object files generated with LTO support are larger than regular object files.

[ ... ]

The current implementation only produces “fat” objects, effectively doubling compilation time and increasing file sizes up to 5x the original size.

But wait, there's more. You built only with -flto. Had you also used -ffat-lto-objects, then, as explained in gcc's info page:

'-ffat-lto-objects'

Fat LTO objects are object files that contain both the intermediate language and the object code. This makes them usable for both LTO linking and normal linking. This option is effective only when compiling with '-flto' and is ignored at link time.
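To see the size effect the documentation describes, one can compile the same source three ways and compare the object files. This is only a sketch: the function name `compare_lto_object_sizes` and the output file names are made up for illustration, it assumes a GCC-style compiler reachable through `$CXX` (defaulting to `g++`), and whether `-fno-fat-lto-objects` is usable depends on the toolchain having linker plugin support.

```shell
#!/bin/sh
# Sketch: build one source file without LTO, with slim LTO objects, and
# with fat LTO objects, then print the byte size of each object file.
# Assumption: $CXX (default g++) accepts GCC's -flto family of options.
compare_lto_object_sizes() {
    src="$1"
    cxx="${CXX:-g++}"

    # Regular object: machine code only.
    "$cxx" -O2 -c "$src" -o no_lto.o || return 1

    # Slim LTO object: intermediate language only, usable only for LTO linking.
    "$cxx" -O2 -flto -fno-fat-lto-objects -c "$src" -o slim_lto.o || return 1

    # Fat LTO object: intermediate language plus machine code.
    "$cxx" -O2 -flto -ffat-lto-objects -c "$src" -o fat_lto.o || return 1

    # Byte counts, one line per object file.
    wc -c no_lto.o slim_lto.o fat_lto.o
}
```

Calling `compare_lto_object_sizes foo.cpp` (with `foo.cpp` standing in for any source file of yours) prints the three sizes; the fat object would be expected to be the largest, since it carries both representations.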

Attempts to use strip will be in vain. By default, strip only removes symbols and debug data. This is not debug data but, basically, halfway-compiled C++ code, with the final compilation happening as part of the link cycle. If you want to "get rid of them", don't use LTO.
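To check whether a given file still carries this intermediate code, it can help to filter a section listing for LTO section names. The helper below is a hypothetical sketch (the name `list_lto_sections` is mine): it reads any section listing on stdin, such as the bloaty output in the question or the output of `objdump -h`, and prints the names that match either the Mach-O style seen in the question (`__GNU_LTO,...`, `__GNU_DWARF_LTO,...`) or the ELF style (`.gnu.lto_*`, `.gnu.debuglto_*`).

```shell
#!/bin/sh
# Sketch: print the distinct LTO-looking section names found on stdin.
# The patterns are assumptions based on the section names shown in this
# thread; other toolchains may name their LTO sections differently.
list_lto_sections() {
    grep -E -o '\.gnu\.(debug)?lto_[^[:space:]]*|__GNU_(DWARF_)?LTO,[^[:space:]]+' \
        | LC_ALL=C sort -u
}
```

For example, `objdump -h ./mybinary | list_lto_sections` (with `./mybinary` as a placeholder) prints nothing when no such sections are present.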

EDIT: it's possible that some gcc/binutils configurations will leave LTO sections in the target binary. I looked into what Fedora's default rpmbuild configuration does, since it builds with LTO by default but does not suffer from the same executable bloat.

It turns out that Fedora's rpmbuild executes a brp-strip-lto script that boils down to this:

sh -c "$STRIP -p -R .gnu.lto_* -R .gnu.debuglto_* -N __gnu_lto_v1 \"\$@\"" ARG0

The key options are the two -R options, which remove the LTO sections. It's unclear what the __gnu_lto_v1 symbol that gets removed by -N is.
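Applied by hand to a single file, the same invocation could be wrapped like this. It is a sketch under assumptions: the function name `strip_lto_sections` is made up, `$STRIP` is assumed to point at GNU strip from binutils, and the section names are the ELF ones from the Fedora script. The Mach-O binary in the question uses different names (`__GNU_LTO,...`), and I have not verified that this works there.

```shell
#!/bin/sh
# Sketch: remove GCC LTO sections from the given files, mirroring the
# options of Fedora's brp-strip-lto shown above.
# Assumption: $STRIP (default "strip") is GNU strip.
strip_lto_sections() {
    if [ "$#" -eq 0 ]; then
        echo "usage: strip_lto_sections FILE..." >&2
        return 2
    fi
    strip_tool="${STRIP:-strip}"
    # -p keeps timestamps, -R removes matching sections, -N removes one symbol.
    # The patterns are quoted so the shell does not expand the '*' itself.
    "$strip_tool" -p -R '.gnu.lto_*' -R '.gnu.debuglto_*' -N __gnu_lto_v1 "$@"
}
```

Running `strip_lto_sections ./mybinary` on a copy of the binary first, and comparing sizes before and after, is the cautious way to try it.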

Sam Varshavchik answered Oct 17 '22 07:10