
Why does a Program Compiled with -fpic and -pie Have Relocation Table?

If a trivial program is compiled with the following command:

arm-none-eabi-gcc -shared -fpic -pie --specs=nosys.specs simple.c -o simple.exe

and the relocation entries are printed with the command:

arm-none-eabi-readelf simple.exe -r

the output shows a number of relocation entries (see below).

Since -fpic / -pie flags cause the compiler to generate a position independent executable, my naive (and clearly incorrect) assumption is that there is no need for a relocation table because the loader can place the executable image anywhere without issue. So why is there a relocation table there at all, and does this indicate that the code isn't actually position independent?

Relocation section '.rel.dyn' at offset 0x82d4 contains 37 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000084a8  00000017 R_ARM_RELATIVE   
000084d0  00000017 R_ARM_RELATIVE   
00008508  00000017 R_ARM_RELATIVE   
00008510  00000017 R_ARM_RELATIVE   
0000855c  00000017 R_ARM_RELATIVE   
00008560  00000017 R_ARM_RELATIVE   
00008564  00000017 R_ARM_RELATIVE   
00008678  00000017 R_ARM_RELATIVE   
0000867c  00000017 R_ARM_RELATIVE   
0000870c  00000017 R_ARM_RELATIVE   
00008710  00000017 R_ARM_RELATIVE   
00008714  00000017 R_ARM_RELATIVE   
00008718  00000017 R_ARM_RELATIVE   
00008978  00000017 R_ARM_RELATIVE   
000089dc  00000017 R_ARM_RELATIVE   
000089e0  00000017 R_ARM_RELATIVE   
00008abc  00000017 R_ARM_RELATIVE   
00008ae4  00000017 R_ARM_RELATIVE   
00018af4  00000017 R_ARM_RELATIVE   
00018af8  00000017 R_ARM_RELATIVE   
00018afc  00000017 R_ARM_RELATIVE   
00018c04  00000017 R_ARM_RELATIVE   
00018c08  00000017 R_ARM_RELATIVE   
00018c0c  00000017 R_ARM_RELATIVE   
00018c34  00000017 R_ARM_RELATIVE   
00019028  00000017 R_ARM_RELATIVE   
000084cc  00000c02 R_ARM_ABS32       00000000   __libc_fini
0000850c  00000602 R_ARM_ABS32       00000000   __deregister_frame_inf
00008558  00001302 R_ARM_ABS32       00000000   __register_frame_info
00008568  00001202 R_ARM_ABS32       00000000   _Jv_RegisterClasses
00008664  00000d02 R_ARM_ABS32       00000000   __stack
00008668  00000a02 R_ARM_ABS32       00000000   hardware_init_hook
0000866c  00000802 R_ARM_ABS32       00000000   software_init_hook
00008670  00000502 R_ARM_ABS32       0001902c   __bss_start__
00008674  00000702 R_ARM_ABS32       00019048   __bss_end__
0000897c  00001402 R_ARM_ABS32       00000000   free
00008ac0  00000402 R_ARM_ABS32       00000000   malloc

Relocation section '.rel.plt' at offset 0x83fc contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00018be8  00000416 R_ARM_JUMP_SLOT   00000000   malloc
00018bec  00000616 R_ARM_JUMP_SLOT   00000000   __deregister_frame_inf
00018bf0  00001316 R_ARM_JUMP_SLOT   00000000   __register_frame_info
00018bf4  00001416 R_ARM_JUMP_SLOT   00000000   free
asked May 12 '16 22:05 by PeterM



1 Answer

An executable consists of several sections. While actual implementation details differ, these can be roughly categorized in four groups:

  1. Read-Only Executable Code, also known as "Text"
  2. Read-Only Constant Data (global constants)
  3. (Initialized) Read-Write Data (global variables with initializers)
  4. Uninitialized Read-Write Data (other global variables, initialized to 0)

Non-position-independent code contains a lot of references to the addresses of functions, global variables and global constants.

Read-Only Data and Initialized Read-Write Data sometimes also contain references to the addresses of functions, global variables and global constants:

int x;
int *y = &x; // y needs a relocation.

The loader can fix up code at load time by applying relocations, but doing so has two problems:

  1. Relocations take time on program startup / library loading
  2. If we relocate, we now have an in-RAM modified copy of the text segment, which is different for every process that loads our library, so we will be wasting RAM.

Now for the real answer: PIC was intended to solve the above problems by getting rid of text relocations, not to get rid of all relocations.

There are comparatively few relocations in read-only data and initialized data, so neither (1.) nor (2.) is usually an issue there. We don't even care about (2.) for read-write data, as we need a separate copy of that for each process anyway. And in fact, there is no way for the compiler to make data position-independent: if you ask for a global int *y = &x; then the compiler has no choice but to store the absolute address of x in y, and that address is only known once the image has been loaded.

Now, how is code made position-independent? That depends on the platform, but it often involves a few relatively inefficient operations, or the processor imposes arbitrary limits on the maximum offsets usable in the more efficient instructions that access data and code in a position-independent way. Also, with dynamic linking, the addresses of some functions aren't known even as a relative offset. So compilers tend to use tables that hold the actual addresses, and the code looks the addresses up in those tables at run time. The tables, variously known as GOT, TOC, PLT and probably a few other names on different platforms, will likely be Constant Data with lots of relocations.

If relocations can't be avoided, the idea is to put them all into one place to minimize problems (1.) and (2.).

answered Oct 21 '22 10:10 by wolfgang