
Why do common C compilers include the source filename in the output?

I have learnt from this recent answer that gcc and clang include the source filename somewhere in the binary as metadata, even when debugging is not enabled.

I can't really understand why this should be a good idea. Besides the minor privacy risk, this also happens when one optimizes for the size of the resulting binary (-Os), which seems inefficient.

Why do the compilers include this information?

Federico Poloni asked Sep 05 '15 12:09



2 Answers

GCC includes the filename mainly for debugging purposes: it lets a programmer identify which source file a given symbol comes from, as (tersely) outlined in the ELF spec p1-17 and further expanded upon in some Oracle docs on linking.

An example of using the STT_FILE symbol is given by this SO question.

I'm still not sure why both GCC and Clang include it even if you specify -g0, but you can stop them from including the STT_FILE symbol with -s. I couldn't find any explanation for this, nor could I find an "official reason" why STT_FILE is included in the ELF specification (which is very terse).
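For readers wondering what "a symbol of type STT_FILE" means at the byte level, here is a minimal sketch. It assumes the standard ELF encoding in which a symbol's one-byte st_info field holds the type in its low four bits and the binding in its high four bits. The helper name decode_st_info and the lookup tables are made up for illustration and cover only a few common values.

```python
# A few symbol type and binding values from the ELF specification.
STT_NAMES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}
STB_NAMES = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}


def decode_st_info(st_info):
    """Split the one-byte st_info field into (type, binding) names.

    Hypothetical helper, not part of any library.
    """
    sym_type = st_info & 0x0F  # low four bits: symbol type
    binding = st_info >> 4     # high four bits: symbol binding
    return (STT_NAMES.get(sym_type, str(sym_type)),
            STB_NAMES.get(binding, str(binding)))


# A local FILE symbol: binding 0 (LOCAL), type 4 (FILE).
print(decode_st_info(0x04))
# A global FUNC symbol: binding 1 (GLOBAL), type 2 (FUNC).
print(decode_st_info(0x12))
```

The source filename is just the name string attached to such a FILE symbol, so it lives in the symbol table rather than in the debug sections.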

cyphar answered Sep 28 '22 07:09



I have learnt from this recent answer that gcc includes the source filename somewhere in the binary as metadata, even when debugging is not enabled.

Not quite. In modern ELF object files the file name is indeed a symbol of type FILE:

$ readelf -a bignum.o    # Source bignum.c
[...]
Symbol table (.symtab) contains 36 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS bignum.c
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
     4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
     5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
     6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
     7: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
     8: 0000000000000000     0 SECTION LOCAL  DEFAULT    8
     9: 00000000000003f0   172 FUNC    GLOBAL DEFAULT    1 add
    10: 00000000000004a0   104 FUNC    GLOBAL DEFAULT    1 copy

However, once stripped, the symbol is gone:

$ strip bignum.o
$ readelf -all bignum.o | grep bignum.c
$

So to keep your privacy, strip the executable, or compile/link with -s.
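To make the effect of stripping concrete, here is a rough Python sketch that walks a 64-bit little-endian symbol table and collects the names of its FILE symbols. It works on synthetic bytes rather than a real object file. The function name file_symbol_names and the sample tables are assumptions for illustration only, and a stripped file is modelled simply as an empty symbol table.

```python
import struct

# Elf64_Sym layout (little-endian): st_name (u32), st_info (u8),
# st_other (u8), st_shndx (u16), st_value (u64), st_size (u64) = 24 bytes.
ELF64_SYM = struct.Struct("<IBBHQQ")
STT_FILE = 4
SHN_ABS = 0xFFF1


def file_symbol_names(symtab, strtab):
    """Return the names of all STT_FILE symbols (hypothetical helper)."""
    names = []
    for offset in range(0, len(symtab), ELF64_SYM.size):
        st_name, st_info, _, _, _, _ = ELF64_SYM.unpack_from(symtab, offset)
        if st_info & 0x0F == STT_FILE:
            # st_name is an offset into the NUL-terminated string table.
            end = strtab.index(b"\0", st_name)
            names.append(strtab[st_name:end].decode())
    return names


# Synthetic tables, made up for this example (not read from a real file).
strtab = b"\0bignum.c\0add\0"
symtab = b"".join([
    ELF64_SYM.pack(0, 0x00, 0, 0, 0, 0),         # mandatory null symbol
    ELF64_SYM.pack(1, 0x04, 0, SHN_ABS, 0, 0),   # LOCAL FILE "bignum.c"
    ELF64_SYM.pack(10, 0x12, 0, 1, 0x3F0, 172),  # GLOBAL FUNC "add"
])

print(file_symbol_names(symtab, strtab))  # unstripped: FILE symbol present
print(file_symbol_names(b"", strtab))     # symbol table removed: nothing found
```

On a real binary you would first have to locate .symtab and .strtab through the section headers. readelf does that for you, so the sketch only mirrors what the grep above is checking.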

Jens answered Sep 28 '22 08:09
