I wanted to know how linker resolves the printf symbol in the following assembly code.
#include<stdio.h>
void main()
{
printf("Hello ");
}
.file "test.c"
.def ___main; .scl 2; .type 32; .endef
.section .rdata,"dr"
LC0:
.ascii "Hello \0"
.text
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
movl $0, %eax
addl $15, %eax
addl $15, %eax
shrl $4, %eax
sall $4, %eax
movl %eax, -4(%ebp)
movl -4(%ebp), %eax
call __alloca
call ___main
movl $LC0, (%esp)
**call _printf**
leave
ret
.def **_printf**; .scl 3; .type 32; .endef
Bit of Low Level Explanation will be highly appreciated.
Thanks in advance.
At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file.
The linker resolves symbol references by associating each ref with exactly one symbol definition from the symbol tables of its input relocatable obj files. Symbol resolution is easy for references to local variables that are defined in the same module as the ref.
An assembler converts source-code programs from assembly language into machine language, often referred to as object-code. A linker combines individual files created by an assembler into a single executable program.
Linkers are also called as link editors. Linking is a process of collecting and maintaining piece of code and data into a single file. Linker also links a particular module into system library. It takes object modules from assembler as input and forms an executable file as output for the loader.
Assuming ELF file format, the assembler will generate an undefined symbol reference in the object file. This'll look like this:
Symbol table '.symtab' contains 11 entries: Num: Value Size Type Bind Vis Ndx Name 0: 00000000 0 NOTYPE LOCAL DEFAULT UND 1: 00000000 0 FILE LOCAL DEFAULT ABS test.c 2: 00000000 0 SECTION LOCAL DEFAULT 1 3: 00000000 0 SECTION LOCAL DEFAULT 3 4: 00000000 0 SECTION LOCAL DEFAULT 4 5: 00000000 0 SECTION LOCAL DEFAULT 5 6: 00000000 0 SECTION LOCAL DEFAULT 6 7: 00000000 0 SECTION LOCAL DEFAULT 7 8: 00000000 52 FUNC GLOBAL DEFAULT 1 main 9: 00000000 0 NOTYPE GLOBAL DEFAULT UND printf 10: 00000000 0 NOTYPE GLOBAL DEFAULT UND exit
It'll also create a relocation entry to point to the part of the code image that needs to be updated by the linker with the correct address. It'll look like this:
$ readelf -r test.o Relocation section '.rel.text' at offset 0x358 contains 3 entries: Offset Info Type Sym.Value Sym. Name 0000001f 00000501 R_386_32 00000000 .rodata 00000024 00000902 R_386_PC32 00000000 printf 00000030 00000a02 R_386_PC32 00000000 exit
The linker's job is then to walk through the relocation table fixing up the code image with the final symbol addresses.
There's an excellent book, but I can't find the details right now (and it's out of print). However, this looks like it may be useful: http://www.linuxjournal.com/article/6463
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With