
Global const optimization and symbol interposition

I was experimenting with gcc and clang to see if they can optimize

#define SCOPE static
SCOPE const struct wrap_ { const int x; } ptr = { 42 /*==0x2a*/ };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }

to return an immediate constant.

It turns out they can:

0000000000000010 <ret_global>:
   10:  b8 2a 00 00 00          mov    $0x2a,%eax
   15:  c3                      retq   

but surprisingly, removing the static yields the same assembly output. That got me curious, because if the global isn't static it should be interposable, and replacing the reference with an immediate should prevent interposition on the global variable.

And indeed it does:

#!/bin/sh -eu
: ${CC:=gcc}
cat > lib.c <<EOF
int ret_42(void) { return 42; }

#define SCOPE 
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
int ret_fn_result(void) { return ret_42()+1; }
EOF

cat > lib_override.c <<EOF
int ret_42(void) { return 50; }

#define SCOPE
 SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
EOF

cat > main.c <<EOF
#include <stdio.h>
int ret_42(void), ret_global(void), ret_fn_result(void);
struct wrap_ { const int x; };
extern struct wrap { const struct wrap_ *ptr; } const w;
int main(void)
{
    printf("ret_42()=%d\n", ret_42());
    printf("ret_fn_result()=%d\n", ret_fn_result());
    printf("ret_global()=%d\n", ret_global());
    printf("w.ptr->x=%d\n",w.ptr->x);
}
EOF
for c in *.c; do
    $CC -fpic -O2 $c -c
    #$CC -fpic -O2 $c -c -fno-semantic-interposition
done
$CC lib.o -o lib.so -shared
$CC lib_override.o -o lib_override.so -shared
$CC main.o $PWD/lib.so
export LD_LIBRARY_PATH=$PWD
./a.out
LD_PRELOAD=$PWD/lib_override.so ./a.out

outputs

ret_42()=42
ret_fn_result()=43
ret_global()=42
w.ptr->x=42
ret_42()=50
ret_fn_result()=51
ret_global()=42
w.ptr->x=60

Is it OK for the compiler to replace references to extern global variables with immediates? Shouldn't those be interposable as well?


Edit:

Gcc does not optimize out external function calls (unless compiled with -fno-semantic-interposition)
such as the call to ret_42() in int ret_fn_result(void) { return ret_42()+1; }, even though, as with a reference to an extern global const variable, the only way for the definition of the symbol to change is through interposition.

  0000000000000020 <ret_fn_result>:
  20:   48 83 ec 08             sub    $0x8,%rsp
  24:   e8 00 00 00 00          callq  29 <ret_fn_result+0x9>
  29:   48 83 c4 08             add    $0x8,%rsp
  2d:   83 c0 01                add    $0x1,%eax

I always assumed this was to allow for the possibility of symbol interposition. Incidentally, clang does optimize them.

I wonder where (if anywhere) it says that the reference to extern const w in ret_global() can be optimized to an immediate while the call to ret_42() in ret_fn_result cannot.

Anyway, it seems that symbol interposition is awfully inconsistent and unreliable across different compilers unless you establish translation unit boundaries. :/ (Would be nice if simply all globals were consistently interposable unless -fno-semantic-interposition is on, but one can only wish.)

PSkocik asked Oct 27 '18 11:10


3 Answers

According to What is the LD_PRELOAD trick?, LD_PRELOAD is an environment variable that allows users to load a library before any other library is loaded, including libc.so.

From this definition, two things follow:

  1. The library specified in LD_PRELOAD can override symbols from other libraries.

  2. However, if that library does not contain the symbol, the other libraries are searched for it as usual.

Here you set LD_PRELOAD to lib_override.so, which defines int ret_42(void) and the global variables ptr and w, but does not define int ret_global(void).

So int ret_global(void) is loaded from lib.so, and this function returns 42 directly because the compiler sees no possibility that ptr and w from lib.c can be modified at runtime (they are placed in a const data section of the ELF file, and Linux guarantees through hardware memory protection that they cannot be modified at runtime), so the compiler optimized the function to return 42 directly.

Edit -- a test:

So I made some modifications to your script:

#!/bin/sh -eu
: ${CC:=gcc}
cat > lib.c <<EOF
int ret_42(void) { return 42; }

#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
EOF

cat > lib_override.c <<EOF
int ret_42(void) { return 50; }

#define SCOPE
 SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
EOF

cat > main.c <<EOF
#include <stdio.h>
int ret_42(void), ret_global(void);
struct wrap_ { const int x; };
extern struct wrap { const struct wrap_ *ptr; } const w;
int main(void)
{
    printf("ret_42()=%d\n", ret_42());
    printf("ret_global()=%d\n", ret_global());
    printf("w.ptr->x=%d\n",w.ptr->x);
}
EOF
for c in *.c; do $CC -fpic -O2 $c -c; done
$CC lib.o -o lib.so -shared
$CC lib_override.o -o lib_override.so -shared
$CC main.o $PWD/lib.so
export LD_LIBRARY_PATH=$PWD
./a.out
LD_PRELOAD=$PWD/lib_override.so ./a.out

And this time, it prints:

ret_42()=42
ret_global()=42
w.ptr->x=42
ret_42()=50
ret_global()=60
w.ptr->x=60

Edit -- conclusion:

So it turns out that you either override all related parts or override nothing; otherwise you get this kind of tricky behavior. Another approach is to define int ret_global(void) in the header rather than in the dynamic library, so you won't have to worry about this when you try to override some functionality for testing.

Edit -- an explanation of why int ret_global(void) is overridable and ptr and w are not.

First, I want to point out the types of the symbols you defined (using techniques from How do I list the symbols in a .so file):

File lib.so:

Symbol table '.dynsym' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     5: 0000000000001110     6 FUNC    GLOBAL DEFAULT   12 ret_global
     6: 0000000000001120    17 FUNC    GLOBAL DEFAULT   12 ret_fn_result
     7: 000000000000114c     0 FUNC    GLOBAL DEFAULT   14 _fini
     8: 0000000000001100     6 FUNC    GLOBAL DEFAULT   12 ret_42
     9: 0000000000000200     4 OBJECT  GLOBAL DEFAULT    1 ptr
    10: 0000000000003018     8 OBJECT  GLOBAL DEFAULT   22 w

Symbol table '.symtab' contains 28 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    23: 0000000000001100     6 FUNC    GLOBAL DEFAULT   12 ret_42
    24: 0000000000001110     6 FUNC    GLOBAL DEFAULT   12 ret_global
    25: 0000000000001120    17 FUNC    GLOBAL DEFAULT   12 ret_fn_result
    26: 0000000000003018     8 OBJECT  GLOBAL DEFAULT   22 w
    27: 0000000000000200     4 OBJECT  GLOBAL DEFAULT    1 ptr

File lib_override.so:

Symbol table '.dynsym' contains 11 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     6: 0000000000001100     6 FUNC    GLOBAL DEFAULT   12 ret_42
     7: 0000000000000200     4 OBJECT  GLOBAL DEFAULT    1 ptr
     8: 0000000000001108     0 FUNC    GLOBAL DEFAULT   13 _init
     9: 0000000000001120     0 FUNC    GLOBAL DEFAULT   14 _fini
    10: 0000000000003018     8 OBJECT  GLOBAL DEFAULT   22 w

Symbol table '.symtab' contains 26 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    23: 0000000000001100     6 FUNC    GLOBAL DEFAULT   12 ret_42
    24: 0000000000003018     8 OBJECT  GLOBAL DEFAULT   22 w
    25: 0000000000000200     4 OBJECT  GLOBAL DEFAULT    1 ptr

You will find that, despite all being GLOBAL symbols, all functions are marked as type FUNC, which is overridable, while all variables have type OBJECT. Type OBJECT means the symbol is not overridable, so the compiler doesn't need to use symbol resolution to get the data.

For further information on this, check this: What Are "Tentative" Symbols? .

JiaHao Xu answered Nov 07 '22 19:11


You can use LD_DEBUG=bindings to trace symbol binding. In this case, it prints (among other things):

 17570: binding file /tmp/lib.so [0] to /tmp/lib_override.so [0]: normal symbol `ptr'
 17570: binding file /tmp/lib_override.so [0] to /tmp/lib_override.so [0]: normal symbol `ptr'
 17570: binding file ./a.out [0] to /tmp/lib_override.so [0]: normal symbol `ret_42'
 17570: binding file ./a.out [0] to /tmp/lib_override.so [0]: normal symbol `ret_global'

So the ptr object in lib.so is indeed interposed, but the main program never calls ret_global in the original library. The call goes to ret_global from the preloaded library because the function is interposed as well.

Florian Weimer answered Nov 07 '22 18:11


EDIT: Question: I wonder where (if anywhere) it says that the reference to extern const w in ret_global() can be optimized to an intermediate while the call to ret_42() in ret_fn_result cannot.

TL;DR: the logic behind this behavior (at least for GCC)

  • The compiler's constant folding optimization is capable of inlining complex const variables and structures.

  • The compiler's default behavior for functions is to export them. Unless the -fvisibility=hidden flag is used, all functions are exported. Because any defined function is exported, it cannot be inlined, so the call to ret_42 in ret_fn_result cannot be inlined. With -fvisibility=hidden turned on, the result is as shown below.

  • If it were possible to export a function and inline it for optimization purposes at the same time, the linker would produce code that sometimes works one way (inlined) and sometimes another (overridden through interposition), all within a single load and run of the resulting executable.

  • Other flags also affect this behavior. The most notable:

    • -Bsymbolic, -Bsymbolic-functions and --dynamic-list as per SO.

    • -fno-semantic-interposition

    • the optimization flags, of course

Function ret_fn_result when ret_42 is hidden: ret_42 is not exported, so it is inlined.

0000000000001110 <ret_fn_result>:
    1110:   b8 2b 00 00 00          mov    $0x2b,%eax
    1115:   c3                      retq   

Technicals

STEP #1, subject is defined in lib.c:

SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }

When lib.c is compiled, w.ptr->x is optimized to a constant. The dynamic symbol table of the resulting library is:

$ objdump -T lib.so
lib.so:     file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
0000000000000000  w   D  *UND*  0000000000000000              _ITM_deregisterTMCloneTable
0000000000000000  w   D  *UND*  0000000000000000              __gmon_start__
0000000000000000  w   D  *UND*  0000000000000000              _ITM_registerTMCloneTable
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize
0000000000001110 g    DF .text  0000000000000006  Base        ret_42
0000000000002000 g    DO .rodata    0000000000000004  Base        ptr
0000000000001120 g    DF .text  0000000000000006  Base        ret_global
0000000000001130 g    DF .text  0000000000000011  Base        ret_fn_result
0000000000003e18 g    DO .data.rel.ro   0000000000000008  Base        w

Here ptr and w are placed in .rodata and .data.rel.ro (because of the const pointer), respectively. Constant folding results in the following code:

0000000000001120 <ret_global>:
    1120:   b8 2a 00 00 00          mov    $0x2a,%eax
    1125:   c3                      retq   

Another part is:

int ret_42(void) { return 42; }
int ret_fn_result(void) { return ret_42()+1; }

Here ret_42 is a function and, since it is not hidden, an exported one, so it is emitted as code. The two functions compile to:

0000000000001110 <ret_42>:
    1110:   b8 2a 00 00 00          mov    $0x2a,%eax
    1115:   c3                      retq   

0000000000001130 <ret_fn_result>:
    1130:   48 83 ec 08             sub    $0x8,%rsp
    1134:   e8 f7 fe ff ff          callq  1030 <ret_42@plt>
    1139:   48 83 c4 08             add    $0x8,%rsp
    113d:   83 c0 01                add    $0x1,%eax
    1140:   c3                      retq   

Considering that the compiler only knows about lib.c, we are done. Put lib.so aside.

STEP #2, compile lib_override.c:

int ret_42(void) { return 50; }

#define SCOPE
SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };

Which is simple:

$ objdump -T lib_override.so
lib_override.so:     file format elf64-x86-64

DYNAMIC SYMBOL TABLE:
0000000000000000  w   D  *UND*  0000000000000000              _ITM_deregisterTMCloneTable
0000000000000000  w   D  *UND*  0000000000000000              __gmon_start__
0000000000000000  w   D  *UND*  0000000000000000              _ITM_registerTMCloneTable
0000000000000000  w   DF *UND*  0000000000000000  GLIBC_2.2.5 __cxa_finalize
00000000000010f0 g    DF .text  0000000000000006  Base        ret_42
0000000000002000 g    DO .rodata    0000000000000004  Base        ptr
0000000000003e58 g    DO .data.rel.ro   0000000000000008  Base        w

The function ret_42 is exported, and ptr and w are again placed in .rodata and .data.rel.ro (because of the const pointer), respectively. The resulting code for ret_42 is:

00000000000010f0 <ret_42>:
    10f0:   b8 32 00 00 00          mov    $0x32,%eax
    10f5:   c3                      retq

STEP #3, compile main.c. Let's look at the object file first:

$ objdump -t main.o

# SKIPPED

0000000000000000         *UND*  0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000000000         *UND*  0000000000000000 ret_42
0000000000000000         *UND*  0000000000000000 printf
0000000000000000         *UND*  0000000000000000 ret_fn_result
0000000000000000         *UND*  0000000000000000 ret_global
0000000000000000         *UND*  0000000000000000 w

All of these symbols are undefined, so they have to come from somewhere.

Then we link against lib.so by default, and the code is (printf and others are omitted):

0000000000001070 <main>:
    1074:   e8 c7 ff ff ff          callq  1040 <ret_42@plt>
    1089:   e8 c2 ff ff ff          callq  1050 <ret_fn_result@plt>
    109e:   e8 bd ff ff ff          callq  1060 <ret_global@plt>
    10b3:   48 8b 05 2e 2f 00 00    mov    0x2f2e(%rip),%rax        # 3fe8 <w>

Now we have lib.so, lib_override.so and a.out in hand.

Let's simply call a.out:

  1. main => ret_42 => lib.so => ret_42 => return 42
  2. main => ret_fn_result => lib.so => ret_fn_result => return ( lib.so => ret_42 => return 42 ) + 1
  3. main => ret_global => lib.so => ret_global => return rodata 42
  4. main => lib.so => w.ptr->x = rodata 42

Now let's preload with lib_override.so:

  1. main => ret_42 => lib_override.so => ret_42 => return 50
  2. main => ret_fn_result => lib.so => ret_fn_result => return ( lib_override.so => ret_42 => return 50 ) + 1
  3. main => ret_global => lib.so => ret_global => return rodata 42
  4. main => lib_override.so => w.ptr->x = rodata 60

For 1: main calls ret_42 from lib_override.so; because that library is preloaded, ret_42 now resolves to the definition in lib_override.so.

For 2: main calls ret_fn_result from lib.so, which calls ret_42, but from lib_override.so, because ret_42 now resolves to the definition there.

For 3: main calls ret_global from lib.so, which returns the folded constant 42.

For 4: main reads the extern pointer, which points into lib_override.so because that library is preloaded.

Finally, once lib.so has been generated with folded constants inlined, one can't demand that they be "overridable". If the intention is to have an overridable data structure, it should be defined in some other way (provide functions to manipulate it, don't use constants, etc.). When something is defined as a constant, the intention is clear, and the compiler does what it does. Even if that same symbol is then defined as non-constant in main.c or elsewhere, it cannot be unfolded back in lib.c.


#!/bin/sh -eu
: ${CC:=gcc}
cat > lib.c <<EOF
int ret_42(void) { return 42; }

#define SCOPE 
SCOPE const struct wrap_ { const int x; } ptr = { 42 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
int ret_global(void) { return w.ptr->x; }
int ret_fn_result(void) { return ret_42()+1; }
EOF

cat > lib_override.c <<EOF
int ret_42(void) { return 50; }

#define SCOPE
 SCOPE const struct wrap_ { const int x; } ptr = { 60 };
SCOPE struct wrap { const struct wrap_ *ptr; } const w = { &ptr };
EOF

cat > main.c <<EOF
#include <stdio.h>
int ret_42(void), ret_global(void), ret_fn_result(void);
struct wrap_ { const int x; };
extern struct wrap { const struct wrap_ *ptr; } const w;
int main(void)
{
    printf("ret_42()=%d\n", ret_42());
    printf("ret_fn_result()=%d\n", ret_fn_result());
    printf("ret_global()=%d\n", ret_global());
    printf("w.ptr->x=%d\n",w.ptr->x);
}
EOF
for c in *.c; do $CC -fpic -O2 $c -c; done
$CC lib.o -o lib.so -shared 
$CC lib_override.o -o lib_override.so -shared
$CC main.o $PWD/lib.so
export LD_LIBRARY_PATH=$PWD
./a.out
LD_PRELOAD=$PWD/lib_override.so ./a.out

muradm answered Nov 07 '22 20:11