
When and why would the C linker exclude unused symbols?

Tags: c, gcc, linker

I'm performing some tests with gcc to understand the rule(s) by which it intelligently excludes unused symbols.

// main.c

#include <stdio.h>

void foo()
{
}

int main( int argc, char* argv[] )
{
  return 0;
}


// bar.c

int bar()
{
  return 42;
}


> gcc --version
gcc (GCC) 8.2.1 20181215 (Red Hat 8.2.1-6)
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> gcc -c bar.c
> gcc -g main.c bar.o 
> nm a.out | grep "foo\|bar"
000000000040111f T bar
0000000000401106 T foo

Above, I've compiled bar.c to bar.o, then compiled main.c and linked it with bar.o to produce a.out.
Listing a.out's symbols shows that both unused functions, foo() and bar(), are included in the executable.

> ar -r libbar.a bar.o
ar: creating libbar.a
> gcc -g main.c -L ./ -lbar
> nm a.out | grep "foo\|bar"
0000000000401106 T foo

Above, I've archived bar.o into libbar.a and rebuilt a.out, this time linking with libbar.a instead of bar.o. Now the unused function foo() is still present, but bar() is not.

From this experiment, I might surmise the following "rules":

  1. Symbols linked from object files are always present in executables. (Perhaps this explains why foo() is always present: is there a temporary/anonymous main.o that's created? If so, it would include foo())
  2. If an executable is linked with a library, gcc will intelligently figure out unnecessary symbols to exclude.

The above are my hypotheses based on this experiment, but how correct are they? If someone is knowledgeable about the intricacies of how linking works, I'd be grateful for some background information explaining the whys and wherefores of what's going on.

StoneThrow asked Mar 12 '19 21:03




1 Answer

It's mostly correct with the caveat that static-library linking doesn't really have per-symbol granularity. It has per-member-object-file granularity.

Example:

If the static library contains files:

a.o 
    foo
    bar
b.o 
    baz

and an undefined reference to foo needs to be resolved, a.o will be brought in, and with it the bar symbol as well.
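The member-level granularity described above can be illustrated with a toy model. This is only a minimal sketch, not how ld is actually implemented: the struct member type and the functions member_defines, pull_member_for and symbol_in_output are hypothetical names invented for illustration, and the model ignores details such as members that introduce new undefined references of their own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One archive member (an object file) and the symbols it defines.
 * Hypothetical type, for illustration only. */
struct member {
    const char *name;
    const char *defs[MAX_SYMS]; /* NULL-terminated unless completely full */
    bool pulled;                /* set once the member is linked in */
};

/* Returns true if member m defines the symbol sym. */
bool member_defines(const struct member *m, const char *sym)
{
    for (size_t i = 0; i < MAX_SYMS && m->defs[i] != NULL; i++) {
        if (strcmp(m->defs[i], sym) == 0)
            return true;
    }
    return false;
}

/* Resolve an undefined reference to sym: the first member that defines it
 * is pulled in as a whole. Returns the member index, or -1 if no member
 * defines the symbol. */
int pull_member_for(struct member *archive, size_t n, const char *sym)
{
    assert(archive != NULL && sym != NULL);
    for (size_t i = 0; i < n; i++) {
        if (member_defines(&archive[i], sym)) {
            archive[i].pulled = true;
            return (int)i;
        }
    }
    return -1;
}

/* A symbol ends up in the output if any pulled member defines it,
 * whether or not anything referenced that particular symbol. */
bool symbol_in_output(const struct member *archive, size_t n, const char *sym)
{
    assert(archive != NULL && sym != NULL);
    for (size_t i = 0; i < n; i++) {
        if (archive[i].pulled && member_defines(&archive[i], sym))
            return true;
    }
    return false;
}
```

With an archive holding a.o (foo, bar) and b.o (baz), resolving foo marks a.o as pulled, so bar is reported as present in the output while baz is not.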

You can get the effect of per symbol granularity when you compile with -ffunction-sections -fdata-sections and then link with -Wl,--gc-sections (gc stands for garbage-collect), but bear in mind that the compiler/linker options are gcc/clang-specific and that they have some minor performance/code-size cost.

-ffunction-sections puts each function in its own section (sort of like its own object file) and -fdata-sections does the same thing for each data item (global and static variables). -Wl,--gc-sections then causes the linker to run a garbage collector over the input sections: starting from the entry point and any other symbols that must be kept, it follows references between sections and removes all sections (=>symbols) that are unreachable.

(-ffunction-sections is also useful if you want size -A the_objectfile.o to report per-function sizes, and if you want those sizes not to fluctuate slightly with the position of each function in the file (due to alignment requirements).)
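The garbage-collection pass enabled by -Wl,--gc-sections can be pictured as a reachability walk over sections. The following is a minimal sketch under stated assumptions, not the linker's real data structures: struct section, mark_live and count_dead are hypothetical names, and references between sections are modeled as plain array indices rather than relocations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REFS 4

/* One input section. With -ffunction-sections each function would get
 * its own section, such as .text.main or .text.foo.
 * Hypothetical type, for illustration only. */
struct section {
    const char *name;
    size_t nrefs;       /* number of valid entries in refs */
    int refs[MAX_REFS]; /* indices of the sections this one references */
    bool live;          /* set when reachable from a root */
};

/* Mark phase: mark section idx and everything reachable from it.
 * Already-live sections are skipped, so reference cycles terminate. */
void mark_live(struct section *secs, size_t n, int idx)
{
    assert(secs != NULL);
    if (idx < 0 || (size_t)idx >= n || secs[idx].live)
        return;
    secs[idx].live = true;
    for (size_t i = 0; i < secs[idx].nrefs && i < MAX_REFS; i++)
        mark_live(secs, n, secs[idx].refs[i]);
}

/* Sweep phase, reduced to a count: how many sections were never marked
 * and would therefore be discarded. */
size_t count_dead(const struct section *secs, size_t n)
{
    assert(secs != NULL);
    size_t dead = 0;
    for (size_t i = 0; i < n; i++) {
        if (!secs[i].live)
            dead++;
    }
    return dead;
}
```

Applied to a model of the question's program, where the section for main is the root and nothing references the section for foo, only foo's section stays unmarked. That matches the expectation that building with -ffunction-sections and linking with -Wl,--gc-sections lets the linker drop foo().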

PSkocik answered Oct 05 '22 23:10