
symbol table and relocation table in object file

From what I understand, instructions and data in an object file all have addresses. The first data item starts at address 0, and the first instruction also starts at address 0.

The relocation table contains information about instructions that need to be updated if the addresses in the file change, for example if the file is linked together with another. Line A, in the example below, would be in the relocation table. I don't think B would be in the relocation table, since the address of label "equal" is relative to B. Are these correct assumptions?

I know the symbol table shows the labels the file defines and also the labels that haven't been resolved yet. But what other information does the symbol table contain?

Also, when the assembler translates the instructions to binary, what is placed in the instructions that have unresolved references, such as A in this example?

.data
TEXT: .asciiz "Foo"

.text
.global main
main:
     li t0, 1
     beq t0, 1, equal #B

equal:
    la a0, TEXT
    jal printf #A
Carlj901 asked May 25 '13 12:05



1 Answer

Yes, your assumptions are correct. There are various types of relocations, and what the assembler emits into the instruction depends on the type. Generally it's an offset (an addend) that the linker adds to the address of the referenced symbol. You can use objdump -dr to see the relocations. For better illustration I have changed your code a little:

.data
.int 0
TEXT: .asciiz "Foo"
.text
.global main
main:
     li $t0, 1
     beq $t0, 1, equal #B
     bne $t0, 42, foo  #C

equal:
     la $a0, TEXT
     jal printf #A

Output of objdump:

00000000 <main>:
   0:   24080001        li      t0,1
   4:   24010001        li      at,1
   8:   11010004        beq     t0,at,1c <equal>
   c:   00000000        nop
  10:   2401002a        li      at,42
  14:   1501ffff        bne     t0,at,14 <main+0x14>
                        14: R_MIPS_PC16 foo
  18:   00000000        nop

0000001c <equal>:
  1c:   3c040000        lui     a0,0x0
                        1c: R_MIPS_HI16 .data
  20:   0c000000        jal     0 <main>
                        20: R_MIPS_26   printf
  24:   24840004        addiu   a0,a0,4
                        24: R_MIPS_LO16 .data

As you said, there is no relocation for the beq since that's a relative address within this object file.
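
To see why the beq is already complete, the offset field can be decoded by hand. The following is a minimal Python sketch, not part of any toolchain; the helper names sign_extend16 and branch_target are made up for illustration. It takes the encoded word 11010004 at address 8 from the listing above and recomputes the branch target:

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(address, word):
    """Return the target of a MIPS branch instruction located at `address`."""
    offset_words = sign_extend16(word)
    # The offset counts words, starting from the instruction after the branch.
    return address + 4 + offset_words * 4


# beq t0,at,equal is encoded as 0x11010004 at address 0x8 in the listing.
print(hex(branch_target(0x8, 0x11010004)))
```

The offset field is 4 words, measured from address c, which lands on 1c <equal>. Both ends of the branch move together when the section is relocated, so the linker has nothing to patch.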

The bne I added (the line marked C) references an external symbol, so even though the branch is PC-relative, a relocation entry is needed. It is of type R_MIPS_PC16, which produces a 16-bit signed word offset to the symbol foo. The instruction encoding requires an offset from the next word, whereas the relocation is computed from the address of the instruction itself, so 1 has to be subtracted. That -1 is stored in the instruction itself as the two's complement value ffff.
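
As a rough sketch of what the linker later does with that ffff, the Python below patches the bne word once an address for foo is known. The function name apply_pc16 and the address 0x40 for foo are hypothetical, and the formula simply follows the description above (word distance from the instruction's own address, plus the stored addend) rather than any particular linker's source:

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def apply_pc16(word, place, symbol):
    """Patch a branch at address `place` so that it reaches `symbol`."""
    addend = sign_extend16(word)  # -1 for the ffff stored by the assembler
    # Word distance measured from the branch itself, corrected by the addend
    # because the CPU measures from the following instruction.
    field = ((symbol - place) >> 2) + addend
    return (word & 0xFFFF0000) | (field & 0xFFFF)


FOO = 0x40  # hypothetical final address of foo, chosen only for illustration
print(hex(apply_pc16(0x1501FFFF, 0x14, FOO)))
```

With these assumed numbers the patched field is 10 words, and 0x18 + 10 * 4 is 0x40, so the branch reaches foo.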

The la pseudoinstruction has been translated by the assembler into a lui/addiu pair (the latter in the delay slot of the jal). For the lui, an R_MIPS_HI16 relocation is created against the .data section, which will fill in the top 16 bits. Since the symbol TEXT is at address 4 in the .data section, the top 16 bits of the offset are 0, so the lui contains an offset of 0. The addiu gets the matching R_MIPS_LO16 relocation for the low 16 bits, and there the instruction contains an offset of 4.
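
The HI16/LO16 pair can be worked through in the same way. This is only a sketch: apply_hi_lo is a made-up helper and the .data base addresses are hypothetical. It combines the addend that is split across the two instructions, adds the section address, and splits the result again. Because addiu sign-extends its immediate, the high half has to be bumped by one whenever the low half is 0x8000 or more:

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def apply_hi_lo(lui_word, addiu_word, section_base):
    """Patch a lui/addiu pair so that together they load section_base + addend."""
    # The addend is split: high half in the lui, low half in the addiu.
    addend = ((lui_word & 0xFFFF) << 16) + sign_extend16(addiu_word)
    address = (section_base + addend) & 0xFFFFFFFF
    lo = address & 0xFFFF
    # Compensate for addiu sign-extending the low half.
    hi = ((address - sign_extend16(lo)) >> 16) & 0xFFFF
    return (lui_word & 0xFFFF0000) | hi, (addiu_word & 0xFFFF0000) | lo


# Hypothetical final addresses for .data, chosen only for illustration.
for base in (0x10010000, 0x10008000):
    lui, addiu = apply_hi_lo(0x3C040000, 0x24840004, base)
    print(hex(base), hex(lui), hex(addiu))
```

In the second case TEXT ends up at 0x10008004: the low half 8004 is negative once sign-extended, so the lui receives 1001 rather than 1000.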

Finally, the jal printf is using yet another kind of relocation that is tailored for the encoding required by the instruction. The offset is zero because the jump is directly to the referenced symbol. Note that objdump is trying to be helpful by decoding that, but it doesn't process the relocation so the <main> it outputs is of course nonsense.
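
A small Python sketch of the jal case, under the same caveats: the helper names and the address 0x00400a20 for printf are hypothetical, and the addend handling is simplified to the zero-addend situation shown in the listing. The first function mimics what the disassembler does with the unpatched word, the second what the linker does once printf has an address:

```python
def jal_target(address, word):
    """Decode the absolute target of a jal located at `address`."""
    # 26-bit word index, combined with the top 4 bits of the delay slot address.
    return ((address + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)


def apply_r_mips_26(word, symbol):
    """Patch the 26-bit field of a jal so that it jumps to `symbol`."""
    addend = (word & 0x03FFFFFF) << 2  # 0 in the listing
    target = symbol + addend
    return (word & 0xFC000000) | ((target >> 2) & 0x03FFFFFF)


# Unpatched word from the listing: the field is 0, hence "jal 0 <main>".
print(hex(jal_target(0x20, 0x0C000000)))

PRINTF = 0x00400A20  # hypothetical final address of printf
patched = apply_r_mips_26(0x0C000000, PRINTF)
print(hex(patched), hex(jal_target(0x20, patched)))
```

Decoding the unpatched word gives 0, which is why the disassembly shows main as the target. After patching with the assumed address, the same decoding yields the address of printf.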

Jester answered Oct 13 '22 14:10