I was doing some hacking in the binary for this simple C++ program to understand program headers for ELF:
int main(){ }
compiled with:
❯ make
g++ -O0 -fverbose-asm -no-pie -o main main.cpp
I used readelf -l main to get the following:
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x0000000000000268 0x0000000000000268 R 0x8
INTERP 0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000004c0 0x00000000000004c0 R 0x1000
...
I see in this documentation: http://man7.org/linux/man-pages/man5/elf.5.html for PHDR:
The array element, if present, specifies the location and size of the program header table itself, both in the file and in the memory image of the program. This segment type may not occur more than once in a file. Moreover, it may occur only if the program header table is part of the memory image of the program. If it is present, it must precede any loadable segment entry.
The "if present" in the quote made me wonder what would happen if I just skipped over the PHDR header. I used vim as a hex editor to change the binary layout of main using :%!xxd (be sure to run :%!xxd -r before saving, or else it's not a binary file anymore) to get from:
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0200 3e00 0100 0000 2010 4000 0000 0000 ..>..... .@.....
00000020: 4000 0000 0000 0000 1839 0000 0000 0000 @........9......
to:
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0200 3e00 0100 0000 2010 4000 0000 0000 ..>..... .@.....
00000020: 7800 0000 0000 0000 1839 0000 0000 0000 @........9......
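(Only the byte at offset 0x20 changes, from 0x40 to 0x78. This is the low byte of e_phoff, so the program header table now starts one entry later and the PHDR entry is skipped.) I ran readelf again to verify it's still a valid ELF file: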
❯ readelf -l main
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 120
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
INTERP 0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
...
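The arithmetic behind the 0x40 to 0x78 edit can be sketched as follows. This is only an illustration, assuming a Linux system that provides <elf.h> and a 64-bit ELF file; phoff_skipping is a hypothetical helper name, not part of any library. Note that e_phnum was left untouched, which is why readelf still reports 11 program headers.

```cpp
#include <elf.h>     // Elf64_Phdr (Linux; provided by glibc and musl)
#include <cstdint>

// Hypothetical helper: the e_phoff value that makes the program header
// table start `skipped` entries later than it currently does.
constexpr std::uint64_t phoff_skipping(std::uint64_t e_phoff,
                                       std::uint16_t e_phentsize,
                                       unsigned skipped) {
    return e_phoff + static_cast<std::uint64_t>(e_phentsize) * skipped;
}

// Each 64-bit program header entry is 56 (0x38) bytes long.
static_assert(sizeof(Elf64_Phdr) == 56, "unexpected Elf64_Phdr size");

// Skipping one entry from the original offset 0x40 (64) gives 0x78 (120),
// matching the "starting at offset 120" line printed by readelf.
static_assert(phoff_skipping(0x40, sizeof(Elf64_Phdr), 1) == 0x78,
              "0x40 + 0x38 should be 0x78");
```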
And surprisingly the program still executes perfectly fine. Why do we even need the PHDR header? Is it useful for linking and/or other situations? It seems like it's not used during runtime at all so why do we have this lying around?
If the main program is of type ET_EXEC (non-PIE), it's probably runnable without PT_PHDR. The main use of PT_PHDR is being able to compare the (unrelocated) address in the header with the actual runtime address of the program headers (obtained by the dynamic linker via AT_PHDR in the aux vector) to determine the offset at which a PIE executable was loaded.
I'm not sure what glibc's dynamic linker requires of PT_PHDR, but in musl libc's dynamic linker we only need it for computing this load offset, and otherwise it's not used at all.
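A minimal sketch of that load-offset computation, assuming Linux's <elf.h> and a 64-bit target. load_bias_from_phdr is a hypothetical helper written for illustration; it is not musl's or glibc's actual code.

```cpp
#include <elf.h>      // Elf64_Phdr, PT_PHDR (Linux; provided by glibc and musl)
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper: given the program header table and the address at
// which that table actually lives at runtime, return the load offset
// (runtime address minus the unrelocated p_vaddr recorded in PT_PHDR).
// Returns std::nullopt when the table has no PT_PHDR entry.
std::optional<std::uintptr_t> load_bias_from_phdr(const Elf64_Phdr* phdr,
                                                  std::size_t phnum,
                                                  std::uintptr_t runtime_phdr_addr) {
    for (std::size_t i = 0; i < phnum; ++i) {
        if (phdr[i].p_type == PT_PHDR) {
            return runtime_phdr_addr -
                   static_cast<std::uintptr_t>(phdr[i].p_vaddr);
        }
    }
    return std::nullopt;  // no PT_PHDR: the offset cannot be derived this way
}
```

On Linux the two inputs could be taken from getauxval(AT_PHDR) and getauxval(AT_PHNUM) in <sys/auxv.h>. For the non-PIE binary in the question, PT_PHDR records 0x400040 and the table is mapped at that same address, so the computed offset would be 0, which is consistent with the program running without the entry.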