Is jumping over/removing `PHDR` program header in ELF file for executable OK? If so, why?

Tags: c++, c, elf, readelf

I was doing some hacking in the binary for this simple C++ program to understand program headers for ELF:

int main(){ }

compiled with:

❯ make
g++ -O0 -fverbose-asm -no-pie -o main main.cpp

I used readelf -l main to get the following:

Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x0000000000000268 0x0000000000000268  R      0x8
  INTERP         0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000004c0 0x00000000000004c0  R      0x1000
...

The documentation at http://man7.org/linux/man-pages/man5/elf.5.html says this about PHDR:

The array element, if present, specifies the location and size of the program header table itself, both in the file and in the memory image of the program. This segment type may not occur more than once in a file. Moreover, it may occur only if the program header table is part of the memory image of the program. If it is present, it must precede any loadable segment entry.
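
The two placement rules in the quoted passage (at most one PT_PHDR, and it must come before every PT_LOAD) can be sketched as a small check over an in-memory program header table. This is only an illustration: the function name pt_phdr_placement_ok is hypothetical, and it assumes 64-bit headers as in the readelf output above.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checker for the two placement rules in the man page quote:
   PT_PHDR appears at most once, and before every PT_LOAD entry. */
bool pt_phdr_placement_ok(const Elf64_Phdr *phdr, size_t phnum)
{
    bool seen_phdr = false;
    bool seen_load = false;

    assert(phdr != NULL || phnum == 0);

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD) {
            seen_load = true;
        } else if (phdr[i].p_type == PT_PHDR) {
            /* A second PT_PHDR, or one after a PT_LOAD, breaks the rules. */
            if (seen_phdr || seen_load)
                return false;
            seen_phdr = true;
        }
    }
    return true;  /* a table with no PT_PHDR at all also passes */
}
```

Note that a table with no PT_PHDR entry passes this check, which matches the "if present" wording.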

The words "if present" in the quote made me wonder what would happen if I just skipped the PHDR header. I used vim as a hex editor, converting the binary with :%!xxd (be sure to run :%!xxd -r before saving, or else the file is no longer a binary), to change the start of main from:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 2010 4000 0000 0000  ..>..... .@.....
00000020: 4000 0000 0000 0000 1839 0000 0000 0000  @........9......

to:

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 2010 4000 0000 0000  ..>..... .@.....
00000020: 7800 0000 0000 0000 1839 0000 0000 0000  x........9......

(Only the byte at offset 0x20 changes: the low byte of e_phoff goes from 0x40 to 0x78, which skips the 0x38-byte PHDR entry.) I ran readelf again to verify it's still a valid ELF file:

❯ readelf -l main

Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 120

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  INTERP         0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  ...
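
The manual hex edit can also be sketched in C, operating on an ELF64 image already read into memory. The helper name skip_leading_phdr is hypothetical, and the sketch assumes the host byte order matches the file (true for an x86-64 binary edited on x86-64). Unlike the hex edit, it also decrements e_phnum: readelf above still reports 11 headers after the change, so the last entry is presumably read from whatever bytes follow the table.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Each 64-bit program header entry is 56 bytes (0x38), which is why
   moving e_phoff from 0x40 to 0x78 skips exactly one entry. */
static_assert(sizeof(Elf64_Phdr) == 0x38, "unexpected Elf64_Phdr size");

/* Hypothetical helper: hide a leading PT_PHDR entry in an ELF64 image held
   in memory. Returns 0 on success, -1 if the image does not qualify. */
int skip_leading_phdr(unsigned char *image, size_t size)
{
    Elf64_Ehdr ehdr;
    Elf64_Phdr first;

    if (size < sizeof ehdr)
        return -1;
    memcpy(&ehdr, image, sizeof ehdr);

    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr.e_phentsize != sizeof(Elf64_Phdr) ||
        ehdr.e_phnum < 2)
        return -1;
    if (ehdr.e_phoff > size || size - ehdr.e_phoff < sizeof first)
        return -1;

    /* Only proceed if the first entry really is PT_PHDR. */
    memcpy(&first, image + ehdr.e_phoff, sizeof first);
    if (first.p_type != PT_PHDR)
        return -1;

    ehdr.e_phoff += ehdr.e_phentsize;  /* 0x40 -> 0x78 in the question */
    ehdr.e_phnum -= 1;                 /* the manual edit did not do this */
    memcpy(image, &ehdr, sizeof ehdr);
    return 0;
}
```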

Surprisingly, the program still executes perfectly fine. Why do we even need the PHDR header? Is it useful for linking or in other situations? It doesn't seem to be used at runtime at all, so why is it there?

OneRaynyDay asked Mar 02 '23 12:03

1 Answer

If the main program is of type ET_EXEC (non-PIE), it's probably runnable without PT_PHDR. The main use of PT_PHDR is to let the dynamic linker compare the (unrelocated) address in that header with the actual runtime address of the program headers (which it obtains via AT_PHDR in the aux vector) and so determine the offset at which a PIE executable was loaded. A non-PIE executable is loaded at its link-time addresses, so that offset is zero.

I'm not sure what glibc's dynamic linker requires of PT_PHDR, but in musl libc's dynamic linker we only need it for computing this load offset, and otherwise it's not used at all.
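
The load-offset computation described here can be sketched roughly as follows. This is not musl's actual code: the function name load_offset_from_phdr is hypothetical, the sketch assumes 64-bit headers, and falling back to an offset of 0 when no PT_PHDR entry exists is an assumption made for the illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the runtime address of the program header
   table (what AT_PHDR reports) and its entry count (AT_PHNUM), return the
   load offset. Assumption: with no PT_PHDR entry there is nothing to
   compare against, so treat the offset as 0. */
uintptr_t load_offset_from_phdr(const Elf64_Phdr *phdr, size_t phnum)
{
    assert(phdr != NULL);

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_PHDR) {
            /* runtime address of the table minus its unrelocated address */
            return (uintptr_t)phdr - (uintptr_t)phdr[i].p_vaddr;
        }
    }
    return 0;
}
```

On Linux, the arguments could come from getauxval(AT_PHDR) and getauxval(AT_PHNUM) in <sys/auxv.h>. For the non-PIE binary in the question the result would be 0 with or without the PT_PHDR entry, which is consistent with the program still running after the edit.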

R.. GitHub STOP HELPING ICE answered Mar 05 '23 15:03
