Duplicated memory management symbols in libc.so and ld-linux.so

Some preamble

It seems that malloc, calloc, realloc and free are all replicated in ld-linux.so and libc.so. As I understand it, the dynamic loader does this so that it can manage memory within ld-linux.so before libc.so is loaded and makes its own memory management functions available. However, I have some questions about those duplicated symbols:

Here's a very simple C program calling malloc and exiting:

#include <stdlib.h>

int main()
{
  void *p = malloc(8);
  return 0;
}

I compile it with gcc on an x86_64 Linux box and debug it with gdb:

$ gcc -g -o main main.c
$ gdb ./main
(gdb) start
Temporary breakpoint 1 at 0x4004f8
Starting program: main 

Temporary breakpoint 1, 0x00000000004004f8 in main ()
(gdb) info symbol malloc
malloc in section .text of /lib64/ld-linux-x86-64.so.2
(gdb) b malloc
Breakpoint 2 at 0x7ffff7df0930: malloc. (2 locations)
(gdb) info breakpoints
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   <MULTIPLE>         
2.1                         y     0x00007ffff7df0930 in malloc at dl-minimal.c:95
2.2                         y     0x00007ffff7a9f9d0 in __GI___libc_malloc at malloc.c:2910

Running nm on libc.so and ld.so reveals the following:

$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep malloc
00000000000829d0 T __libc_malloc
00000000003b6700 V __malloc_hook
00000000003b8b00 V __malloc_initialize_hook
00000000000829d0 T malloc
0000000000082db0 W malloc_get_state
00000000000847c0 T malloc_info
0000000000082480 W malloc_set_state
00000000000844f0 W malloc_stats
0000000000084160 W malloc_trim
00000000000844b0 W malloc_usable_size

$ nm -D /lib64/ld-linux-x86-64.so.2 | grep malloc
0000000000016930 W malloc

Questions

  1. malloc is replicated in libc.so and ld-linux.so but in the case of ld-linux.so it is a weak symbol, so they should both resolve to the same address. Additionally, as I understand it, the dynamic loader's symbol resolution table is global and resolves only one address per symbol (correct me if I'm wrong).

    However, gdb clearly shows otherwise (two different addresses). Why is that?

  2. gdb actually sets breakpoints at two different addresses when I type break malloc, but it only shows information about the symbol in ld.so when I type info symbol malloc. Why is that?

  3. Although I am breaking at malloc and libc.so defines a malloc symbol of its own (as shown by nm), gdb breaks at the symbol __GI___libc_malloc. Why is that?

asked Feb 14 '13 02:02 by fons


1 Answer

  1. I suspect GDB just puts a breakpoint on every malloc symbol it can find, "just in case" so to speak. GDB uses its own internal symbol table, not the dynamic loader's. That is how it can break on non-exported symbols when debug symbols are available. The command feedback lists only one address, probably to reduce noise when there are many matches. It still mentions "2 locations", so you can inspect them yourself with info breakpoints.
  2. My guess is that whoever implemented info symbol did not foresee this situation, so it prints only the first match.
  3. __GI___libc_malloc is the name of the internal, actual implementation of malloc inside libc.so. Since you also get the source line info "at malloc.c:2910", I'm guessing the name comes from the debug symbols and not from the ELF symbol table. Again, one location can have many names (see __libc_malloc at the same address in the nm output), so GDB just chooses one.
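To illustrate point 3, here is a minimal sketch of how a single address can carry several names. It is not glibc's actual source: demo_libc_malloc and demo_malloc are hypothetical names standing in for __libc_malloc and malloc, and the alias attribute is a GCC/Clang extension that I am assuming is available on an ELF target.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "internal" implementation, playing the role of
   __libc_malloc. It simply forwards to the real allocator. */
void *demo_libc_malloc(size_t size)
{
    return malloc(size);
}

/* Second, "public" name for the very same code (GCC/Clang extension).
   Both symbols end up with one address in the symbol table. */
void *demo_malloc(size_t size) __attribute__((alias("demo_libc_malloc")));

/* Returns 1 when both names refer to the same address, 0 otherwise. */
int names_share_address(void)
{
    return demo_malloc == demo_libc_malloc;
}
```

If you compile this into a program and run nm on it, both names should be listed with the same address, and a debugger that maps that address back to a name has to pick one of them.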

BTW, the malloc pointer in ld.so's GOT does get replaced by the address of libc's malloc when libc.so is loaded (initially it points to ld.so's internal implementation). So by the time the process entry point is reached, both resolve to the same address and ld.so's own malloc is no longer used.
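One way to check from inside a running process which definition won is to ask the dynamic loader itself. This is a rough sketch, assuming a dynamically linked glibc program where the C library's soname is libc.so.6 (as in the nm output in the question); global_malloc_is_libc is a hypothetical helper name, and older glibc versions may need -ldl at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if the process-wide "malloc" is the definition found in
   libc.so.6, 0 if some other definition is used, -1 if a lookup fails.
   Assumption: the C library is loaded under the soname "libc.so.6". */
int global_malloc_is_libc(void)
{
    int result = -1;
    /* dlopen(NULL, ...) yields a handle for the main program, whose
       lookups follow the global symbol search order. */
    void *global = dlopen(NULL, RTLD_LAZY);
    /* libc is already mapped, so this only returns a handle to it. */
    void *libc = dlopen("libc.so.6", RTLD_LAZY);

    if (global != NULL && libc != NULL) {
        void *from_global = dlsym(global, "malloc");
        void *from_libc = dlsym(libc, "malloc");
        if (from_global != NULL && from_libc != NULL)
            result = (from_global == from_libc);
    }
    if (libc != NULL)
        dlclose(libc);
    if (global != NULL)
        dlclose(global);
    return result;
}
```

In an ordinary program I would expect this to return 1. A result of 0 would suggest that something else, such as a preloaded allocator, interposes malloc.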

answered Nov 14 '22 23:11 by Igor Skochinsky