
GCC linker: move a symbol in a specified section

Is it possible to move some of the functions in the code into a specific section of the executable? If so, how?

For an application compiled with gcc, we have several source files, including X.c. Each object is compiled from its associated source (X.o is obtained from X.c), and the linker produces one big executable.

I need two functions from X.c to be in a specific section in the executable, say .magic_section. The reason I want this is that the section will be loaded in a different area of memory from the rest of the sections.

My problem is that I cannot change the source X.c; otherwise I would have used an attribute such as __attribute__ ((section ("magic_section"))) on the functions.

I read something in the documentation for the linker and wrote a custom script for the linker, but I failed to specify in which section a particular symbol must be placed. I only managed to move a whole section.

MathPlayer asked Mar 24 '15 09:03


2 Answers

One way you could probably do it (not great, but it should work in theory) is to compile with -ffunction-sections and -fdata-sections, assuming your GCC version and architecture support those options, and then manually call out, in a linker script, all the functions and variables that need to go in a given section.

These options create sections with names like .text.function_name or .data.variable_name. If you're familiar with assigning sections via GCC attributes, I'll assume you know what to do in the linker script.

As an advantage, that would let you cherry-pick functions if you don't actually want the entire file to go in a magic section.

Brian McFarland answered Oct 13 '22 20:10


Unfortunately, you will not be able to accomplish this without modifying your binary objects, the dynamic linker, or the dynamic loader, and in any case it is a very difficult task.

Option 1 - ELF manipulation

Each ELF executable is made up of sections, which contain the actual code, data, symbol strings and so on, and segments, which help the loader decide things like where to load your code in memory, which symbols this ELF exposes, which symbols it requires from other locations, where to load specific code or data, etc.

You can observe the segments in your binary by typing

readelf -l [your binary]

The output will be similar to the following (I chose ls as the binary):

[ishaypeled@ishay-dev bin]$ readelf -l --wide ./ls

Elf file type is EXEC (Executable file)
Entry point 0x4048bf
There are 9 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
  INTERP         0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x01b694 0x01b694 R E 0x200000
  LOAD           0x01bdf0 0x000000000061bdf0 0x000000000061bdf0 0x000864 0x0016d0 RW  0x200000
  DYNAMIC        0x01be08 0x000000000061be08 0x000000000061be08 0x0001f0 0x0001f0 RW  0x8
  NOTE           0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R   0x4
  GNU_EH_FRAME   0x01895c 0x000000000041895c 0x000000000041895c 0x00071c 0x00071c R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x01bdf0 0x000000000061bdf0 0x000000000061bdf0 0x000210 0x000210 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     
   08     .init_array .fini_array .jcr .dynamic .got 

Now let's examine this output:

In the first table (Program Headers):
[Type] - Segment type, i.e. the purpose of this segment
[Offset] - Offset in the file where this segment begins
[VirtAddr] - Where this segment should be loaded in the process address space (if it is loaded at all; not all segments are)
[PhysAddr] - Same as VirtAddr on all modern OSes I have encountered
[FileSiz] - How big this segment is in the file. This is the link to your sections: the segment consists of all sections in the range Offset to Offset+FileSiz
[MemSiz] - How big this segment is in virtual memory (this does NOT have to be the same as the size in the file; if it extends beyond the size in the file, the excess is set to 0)
[Flg] - Permission flags: R = read, W = write, E = execute
[Align] - Required alignment in memory

Your focus is on segments of type LOAD (PT_LOAD). These segments group data from sections, tell the loader where to put them in the process address space, and specify their permissions.

You can see a convenient section to segment mapping in the Section to Segment mapping table.

Let's look at the two LOAD segments, 2 and 3:
We can see that segment 2 has read and execute permissions and that it spans, among others, the .text and .rodata sections. Segment 3 has read and write permissions and holds, among others, .data and .bss.
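
To look at only the loadable segments, the program header listing can be filtered. A small sketch, assuming binutils' readelf is installed and using /bin/ls purely as an example binary (any ELF file will do):

```shell
#!/bin/sh
# Sketch: print only the PT_LOAD program headers of a binary.
# /bin/ls is just an example target, not something the original answer requires.
binary=/bin/ls

# --wide keeps each program header on one line, so the first field is the type.
load_segments=$(readelf -l --wide "$binary" | awk '$1 == "LOAD"')
printf '%s\n' "$load_segments"

# Report the LOAD segments whose flags include E (execute).
# Fields 1-6 are Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz;
# the flags occupy the fields between MemSiz and the final Align column.
printf '%s\n' "$load_segments" | awk '{
    flags = ""
    for (i = 7; i < NF; i++) flags = flags $i
    if (flags ~ /E/) print "executable LOAD segment at VirtAddr", $3
}'
```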

So, to achieve your purpose using ELF manipulation:

  1. Locate the binary data that makes up your functions in the file (the readelf utility is your friend)
  2. By modifying the program header table (I don't know of any tool that does this automatically; you'd probably have to write your own), split the segment containing the .text section into two sequential LOAD segments, leaving out your function code
  3. By modifying the program header table, create a new LOAD segment containing only your two functions
  4. Update all references (if any) to the old function location to the new one

If you read up to here and understood everything, you should know this is a tremendously tedious, nearly impossible task for real life cases.

Option 2 - Dynamic linker manipulation

Note the INTERP segment type in the above example. It holds an ASCII string that specifies which dynamic linker to use.
The dynamic linker's role is to parse the segments and perform all dynamic operations, such as resolving symbols at runtime, loading segments from .so files, etc.

A possible manipulation here would be to modify the dynamic linker code (NOTE: this is a system-wide change!) to load the functions' binary data at a specific memory address in the process address space. Note that this approach has a couple of drawbacks:

  1. It requires modification of the dynamic linker
  2. You still need to update all references to your functions within the ELF file

Option 3 - Dynamic loader manipulation

Much like option 2, but modify the ld library facilities instead of the dynamic linker.

Conclusion

Doing exactly what you want is very hard, and a tedious task indeed. I am currently working on a tool that allows injection of arbitrary functions into existing shared object files, and I can guarantee this is at least a few good weeks of work.
Are you sure there isn't another way to achieve what you want? Why do you need these two functions at a separate address? Perhaps there is an easier solution...

Ishay Peled answered Oct 13 '22 21:10