Is it possible to move some of the functions in the code to a specific section of the executable? If so, how?
For an application compiled with gcc, we have several source files, including X.c. Each object is compiled from the associated source (X.o is obtained from X.c) and the linker produces one big executable.
I need two functions from X.c to be in a specific section of the executable, say .magic_section. The reason I want this is that the section will be loaded into a different area of memory from the rest of the sections.
My problem is that I cannot change the source X.c; otherwise I would have used an attribute such as __attribute__ ((section ("magic_section"))) for the functions.
I read something in the documentation for the linker and wrote a custom script for the linker, but I failed to specify in which section a particular symbol must be placed. I only managed to move a whole section.
One way you could probably do it (not great, but it should work in theory) is to compile with -ffunction-sections and -fdata-sections, assuming your GCC version and architecture support those options, and then explicitly call out, in a linker script, every function and variable that needs to go in a given section.
These options create sections with names like .text.function_name or .data.variable_name. If you're familiar with assigning sections via gcc attributes, I'll assume you know what to do in the linker.
As an advantage, that would let you cherry-pick functions if you don't actually want the entire file to go in a magic section.
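A minimal sketch of the linker-script side, assuming X.c was compiled with `gcc -ffunction-sections -c X.c`. The function names func_a and func_b and the address 0x20000000 are hypothetical placeholders, and the fragment is meant to be merged into the custom script you already have rather than used on its own:

```
SECTIONS
{
  /* Hypothetical load address for the special memory area. */
  .magic_section 0x20000000 :
  {
    /* Pick only the two per-function input sections from X.o. */
    *X.o(.text.func_a)
    *X.o(.text.func_b)
  }

  /* The generic rule must come after the rule above: an input
     section is placed by the first pattern that matches it. */
  .text :
  {
    *(.text .text.*)
  }

  /* The remaining output sections of your script follow unchanged. */
}
```

The ordering matters because .text.func_a would otherwise be swallowed by the `.text.*` wildcard. Note that the file pattern `*X.o` matches any input path ending in X.o, so a more specific pattern may be needed if several objects share that suffix.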
Unfortunately, without modifying your binary objects, dynamic linker or dynamic loader you will not be able to accomplish this, and anyhow, this is a very difficult task.
Option 1 - ELF manipulation
Each ELF executable is made of sections, which contain the actual code, data, symbol strings and so on, and segments, which help the loader decide things like where to load your code in memory, which symbols this ELF exposes, which symbols it requires from other locations, where to load specific code/data, etc.
You can observe the segments in your binary by typing
readelf -l [your binary]
The output will be similar to the following (I chose ls as the binary):
[ishaypeled@ishay-dev bin]$ readelf -l --wide ./ls
Elf file type is EXEC (Executable file)
Entry point 0x4048bf
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8
INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x01b694 0x01b694 R E 0x200000
LOAD 0x01bdf0 0x000000000061bdf0 0x000000000061bdf0 0x000864 0x0016d0 RW 0x200000
DYNAMIC 0x01be08 0x000000000061be08 0x000000000061be08 0x0001f0 0x0001f0 RW 0x8
NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4
GNU_EH_FRAME 0x01895c 0x000000000041895c 0x000000000041895c 0x00071c 0x00071c R 0x4
GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x10
GNU_RELRO 0x01bdf0 0x000000000061bdf0 0x000000000061bdf0 0x000210 0x000210 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag .note.gnu.build-id
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
Now let's examine this output:
In the first table (Program Headers):
[Type] - Segment type, i.e. the purpose of this segment
[Offset] - Offset in the file where this segment begins
[VirtAddr] - Where to load this segment in the process address space (if it is loaded at all; not all segments are)
[PhysAddr] - Same as VirtAddr for all modern OS's I encountered
[FileSiz] - How big this segment is in the file. This is the link to your sections: the segment consists of all sections in the range Offset to Offset+FileSiz
[MemSiz] - How big this segment is in virtual memory (this does NOT have to be the same as the size in the file! If it extends beyond the size in the file, the excess is set to 0)
[Flg] - Permission flags: R-read, W-write, E-execute.
[Align] - Required alignment in memory.
Your focus is on segments of type LOAD (PT_LOAD). These segments group data from sections, instruct the loader where to put them in the process address space and specify their permissions.
You can see a convenient mapping in the Section to Segment mapping table.
Let's observe the two LOAD segments, 2 and 3:
We can see that segment 2 has read and execute permissions, and that it spans (among others) the .text and .rodata sections. Segment 3 has read and write permissions and spans (among others) the .data and .bss sections.
So, to achieve your purpose using ELF manipulation:
If you read up to here and understood everything, you should know this is a tremendously tedious, nearly impossible task for real life cases.
Option 2 - Dynamic linker manipulation
Note the INTERP segment type in the above example. It holds an ASCII string that specifies which dynamic linker to use.
The dynamic linker's role is to parse the segments and perform all dynamic operations, such as resolving symbols at runtime, loading segments from .so files, etc.
A possible manipulation here would be to modify the dynamic linker code (NOTE: this is a system-wide change!) to load the functions' binary data at a specific memory address in the process address space. Note that this approach has a couple of setbacks:
Option 3 - Dynamic loader manipulation
Much like option 2, but modify the ld library facilities instead of the dynamic linker.
Conclusion
Exactly what you wish to do is very hard, and indeed a tedious task. I am working on a tool that allows injection of arbitrary functions into existing shared object files at the moment and I guarantee this to be at least a few good weeks of work.
Are you sure there isn't another way to achieve what you want? Why do you need these two functions in a separate address? Perhaps there is an easier solution...