
how to get the actual address of `func` from `callq func@PLT`

In my Linux program, I need a function that takes an address addr and checks whether the callq instruction at addr calls a specific function func loaded from a shared library. In other words, I need to check whether the instruction at addr is something like callq func@PLT.

So, on Linux, how can I get the real address of a function func from a callq func@PLT instruction?

LuisABOL asked Feb 25 '13 12:02


1 Answer

You can only find out about that at runtime, after the dynamic linker resolves the actual load address.
Warning: What follows is slightly deeper magic ...

To illustrate what's happening, use a debugger on a minimal program:

#include <stdio.h>

int main(int argc, char **argv) { printf("Hello, World!\n"); return 0; }

Compile it (gcc -O8 ...). objdump -d on the binary shows the following (notwithstanding the optimization that substitutes puts() for printf() when the argument is a plain string):

Disassembly of section .init:
[ ... ]
Disassembly of section .plt:

0000000000400408 <__libc_start_main@plt-0x10>:
  400408:  ff 35 a2 04 10 00       pushq  1049762(%rip)        # 5008b0 <_GLOBAL_OFFSET_TABLE_+0x8>
  40040e:  ff 25 a4 04 10 00       jmpq   *1049764(%rip)        # 5008b8 <_GLOBAL_OFFSET_TABLE_+0x10>
[ ... ]
0000000000400428 <puts@plt>:
  400428:  ff 25 9a 04 10 00       jmpq   *1049754(%rip)   # 5008c8 <_GLOBAL_OFFSET_TABLE_+0x20>
  40042e:  68 01 00 00 00          pushq  $0x1
  400433:  e9 d0 ff ff ff          jmpq   400408 <_init+0x18>
[ ... ]
0000000000400500 <main>:
  400500:  48 83 ec 08             sub    $0x8,%rsp
  400504:  bf 0c 06 40 00          mov    $0x40060c,%edi
  400509:  e8 1a ff ff ff          callq  400428 <puts@plt>
  40050e:  31 c0                   xor    %eax,%eax
  400510:  48 83 c4 08             add    $0x8,%rsp
  400514:  c3                      retq
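
The callq line in main already contains what is needed to locate the PLT stub: the opcode byte e8 is followed by a signed 32-bit displacement, counted from the address of the next instruction. A minimal sketch of that arithmetic, assuming x86-64 and a plain 5-byte e8 rel32 near call (the name decode_call_target is made up for illustration; indirect calls and other encodings are not handled):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode the target of a 5-byte `call rel32` (opcode 0xE8).
 * `insn` points at the instruction bytes, `insn_addr` is the address those
 * bytes are (or would be) mapped at. Returns 0 if the bytes are not an E8 call. */
uint64_t decode_call_target(const uint8_t *insn, uint64_t insn_addr)
{
    int32_t rel32;

    if (insn[0] != 0xE8)
        return 0;
    /* The displacement is stored little-endian, which matches the host
     * byte order on x86-64 (an assumption of this sketch). */
    memcpy(&rel32, insn + 1, sizeof rel32);
    /* The displacement is relative to the end of the call instruction. */
    return insn_addr + 5 + (uint64_t)(int64_t)rel32;
}
```

Fed the bytes e8 1a ff ff ff and the address 0x400509 from the dump, this yields 0x400428, the address objdump labels puts@plt. That only identifies the stub; the real address of puts is not known until the dynamic linker has run.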

Now load it into gdb. Then:

$ gdb ./tcc
GNU gdb Red Hat Linux (6.3.0.0-0.30.1rh)
[ ... ]
(gdb) x/3i 0x400428
0x400428:       jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
0x40042e:       pushq  $0x1
0x400433:       jmpq   0x400408
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x000000000040042e

Notice this value points back to the instruction directly following the first jmpq; this means the puts@plt slot, on first invocation, will simply "fall through" to:

(gdb) x/3i 0x400408
0x400408:       pushq  1049762(%rip)        # 0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>
0x40040e:       jmpq   *1049764(%rip)        # 0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>
0x400414:       nop
(gdb) x/gx 0x5008b0
0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>:     0x0000000000000000
(gdb) x/gx 0x5008b8
0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>:    0x0000000000000000

The function address and argument aren't initialized yet.
This is the state just after program load, but before executing. Now start executing it:

(gdb) break main
Breakpoint 1 at 0x400500
(gdb) run
Starting program: tcc
(no debugging symbols found)
(no debugging symbols found)

Breakpoint 1, 0x0000000000400500 in main ()
(gdb)  x/i 0x400428
0x400428:  jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x000000000040042e

So this hasn't changed yet - but the targets (the GOT entries referenced by the first PLT slot at 0x400408) are different now:

(gdb) x/gx 0x5008b0
0x5008b0 <_GLOBAL_OFFSET_TABLE_+8>:     0x0000002a9566b9a8
(gdb) x/gx 0x5008b8
0x5008b8 <_GLOBAL_OFFSET_TABLE_+16>:    0x0000002a955609f0
(gdb) disas 0x0000002a955609f0
Dump of assembler code for function _dl_runtime_resolve:
0x0000002a955609f0 <_dl_runtime_resolve+0>:     sub    $0x38,%rsp
[ ... ]

I.e. at program load time, the dynamic linker will resolve the "init" parts first. It substitutes the GOT references with pointers that redirect into the dynamic linking code.

Therefore, when first calling an external-to-the-binary function through the .plt reference, it'll jump into the linker again. Let it do that, then inspect the program after that - the state has changed again:

(gdb) break *0x0000000000400514
Breakpoint 2 at 0x400514
(gdb) continue
Continuing.
Hello, World!

Breakpoint 2, 0x0000000000400514 in main ()
(gdb) x/i 0x400428
0x400428:  jmpq   *1049754(%rip)        # 0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>
(gdb) x/gx 0x5008c8
0x5008c8 <_GLOBAL_OFFSET_TABLE_+32>:    0x0000002a956c8870
(gdb) disas 0x0000002a956c8870
Dump of assembler code for function puts:
0x0000002a956c8870 <puts+0>:    mov    %rbx,0xffffffffffffffe0(%rsp)
[ ... ]

So there's your redirect right into libc now - the PLT reference to puts() finally got resolved.
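
The same lookup that gdb performed by hand can be sketched in C for code running inside the process. This is only an illustration under stated assumptions: the stub is the classic 6-byte jmpq *disp32(%rip) form (bytes ff 25) seen in the dump, and both helper names are hypothetical. Toolchains that emit other stub layouts (for example stubs beginning with endbr64) would need extra handling.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: given a pointer to a PLT stub in the current process,
 * return the address of the GOT slot it jumps through, or NULL if the stub
 * does not start with `jmpq *disp32(%rip)` (bytes ff 25). */
const void *plt_stub_got_slot(const uint8_t *stub)
{
    int32_t disp32;

    if (stub[0] != 0xFF || stub[1] != 0x25)
        return NULL;
    memcpy(&disp32, stub + 2, sizeof disp32);
    /* RIP-relative: the base is the address of the next instruction (stub + 6). */
    return (const void *)((uintptr_t)stub + 6 + (uintptr_t)(intptr_t)disp32);
}

/* Hypothetical helper: read the pointer currently stored in that GOT slot.
 * With lazy binding, before the first call this is stub + 6 (the pushq);
 * after the first call it is the function's real address. */
const void *plt_stub_current_target(const uint8_t *stub)
{
    const void *slot = plt_stub_got_slot(stub);
    const void *target = NULL;

    if (slot != NULL)
        memcpy(&target, slot, sizeof target);
    return target;
}
```

For the stub at 0x400428 this corresponds to slot 0x5008c8, which held 0x40042e before the first call and the address of puts afterwards. Comparing the returned pointer with a known address of func (for example one obtained from dlsym) would answer the question, provided the slot has already been resolved.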

The instructions telling the linker where to insert the actual function load addresses (which we've seen it do for _dl_runtime_resolve) come from special sections in the ELF binary:

$ readelf -a tcc
[ ... ]
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
[ ... ]
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
[ ... ]
Dynamic section at offset 0x700 contains 21 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
[ ... ]
Relocation section '.rela.plt' at offset 0x3c0 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000005008c0  000100000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
0000005008c8  000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0

There's more to ELF than just the above, but these three pieces do the following. The INTERP header tells the kernel's binary format handler that this ELF binary has an interpreter (the dynamic linker) that needs to be loaded and initialized first. The NEEDED entry says that the binary requires libc.so.6. The .rela.plt entries say that the slots at 0x5008c0 and 0x5008c8 in the program's writable data section must be filled with the load addresses of __libc_start_main and puts, respectively, when dynamic linking is actually performed.

How exactly that happens, from ELF's point of view, is up to the details of the interpreter (aka, the dynamic linker implementation).
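
The Info column of the .rela.plt entries packs two fields into one 64-bit value: the dynamic symbol table index in the upper 32 bits and the relocation type in the lower 32 bits. A minimal sketch of picking them apart, assuming glibc's <elf.h> (which provides Elf64_Rela, ELF64_R_SYM, ELF64_R_TYPE and R_X86_64_JUMP_SLOT); the helper name is hypothetical:

```c
#include <elf.h>
#include <stdint.h>

/* Hypothetical helper: return the dynamic symbol table index of a .rela.plt
 * entry if it is an x86-64 jump-slot relocation, or -1 otherwise. */
long jump_slot_symbol_index(const Elf64_Rela *rela)
{
    if (ELF64_R_TYPE(rela->r_info) != R_X86_64_JUMP_SLOT)
        return -1;
    return (long)ELF64_R_SYM(rela->r_info);
}
```

For the puts entry above (Offset 0x5008c8, Info 0x000200000007) this gives symbol index 2 with relocation type 7, i.e. "write the address of dynamic symbol number 2 into the slot at 0x5008c8".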

FrankH. answered Nov 07 '22 00:11