I am trying to find the base address of ELF files. I know that you can use readelf to find the Program Entry Point and different section details (base address, size, flags and so on).
For example, programs for x86 architecture are based at 0x8048000 by linker. using readelf I can see the program entry point but no specific field in the output tells the base address.
$ readelf -e test
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048390
Start of program headers: 52 (bytes into file)
Start of section headers: 4436 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 9
Size of section headers: 40 (bytes)
Number of section headers: 30
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .interp PROGBITS 08048154 000154 000013 00 A 0 0 1
[ 2] .note.ABI-tag NOTE 08048168 000168 000020 00 A 0 0 4
[ 3] .note.gnu.build-i NOTE 08048188 000188 000024 00 A 0 0 4
[ 4] .gnu.hash GNU_HASH 080481ac 0001ac 000024 04 A 5 0 4
[ 5] .dynsym DYNSYM 080481d0 0001d0 000070 10 A 6 1 4
In the section details, I can see that the Offset is calculated with respect to the base address of the ELF.
So, .dynsym
section starts at address, 0x080481d0 and offset is 0x1d0. This would mean the base address is, 0x08048000. Is this correct?
similarly, for programs compiled on different architectures like PPC, ARM, MIPS, I cannot see their base address but only the OEP, Section Headers.
You need to check the segment table aka program headers (readelf -l
).
Elf file type is EXEC (Executable file)
Entry point 0x804a7a0
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x10fc8 0x10fc8 R E 0x1000
LOAD 0x011000 0x08059000 0x08059000 0x0038c 0x01700 RW 0x1000
DYNAMIC 0x01102c 0x0805902c 0x0805902c 0x000f8 0x000f8 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00020 0x00020 R 0x4
TLS 0x011000 0x08059000 0x08059000 0x00000 0x0005c R 0x4
GNU_EH_FRAME 0x00d3c0 0x080553c0 0x080553c0 0x00c5c 0x00c5c R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
The first (lowest) LOAD
segment's virtual address is the default load base of the file. You can see it's 0x08048000 for this file.
The ELF mapping base Address of the .text section is defined by the ld(1) loader script in the binutils project under script template elf.sc on Linux.
The script define the following variables used by the loader ld(1):
# TEXT_START_ADDR - the first byte of the text segment, after any
# headers.
# TEXT_BASE_ADDRESS - the first byte of the text segment.
# TEXT_START_SYMBOLS - symbols that appear at the start of the
# .text section.
You can inspect the current values using the command:
~$ ld --verbose |grep SEGMENT_START
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
. = SEGMENT_START("ldata-segment", .);
The text-segment mapping values are:
Also the interpreter base address of an ELF program is defined in the Auxiliary vector array at the index AT_BASE. The Auxiliary vector array is an array of the Elf_auxv_t structure and located after the envp in the process stack. It's configured while loading the ELF binary in the function create_elf_tables() of Linux kernel fs/binfmt_elf.c. The following code snippet show how to read the value:
$ cat at_base.c
#include <stdio.h>
#include <elf.h>
int
main(int argc, char* argv[], char* envp[])
{
Elf64_auxv_t *auxp;
while(*envp++ != NULL);
for (auxp = (Elf64_auxv_t *)envp; auxp->a_type != 0; auxp++) {
if (auxp->a_type == 7) {
printf("AT_BASE: 0x%lx\n", auxp->a_un.a_val);
}
}
}
$ clang -o at_base at_base.c
$ ./at_base
AT_BASE: 0x7fcfd4025000
Linux Auxiliary Vector definition Auxiliary Vector Reference
It used to be a fixed address on x86 32 bits architecture, but with ASLR now, it's randomized. You can use setarch i386 -R to disable randomization if you want.
It's defined in the linker script. You can dump the default linker script with ld --verbose
. Example output:
GNU ld (GNU Binutils) 2.23.1
Supported emulations:
elf_x86_64
elf32_x86_64
elf_i386
i386linux
elf_l1om
elf_k1om
using internal linker script:
==================================================
/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
"elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib64"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/x86_64-unknown-linux-gnu/lib"); SEARCH_DIR("/nix/store/kxf1p7l7lgm6j5mjzkiwcwzc98s9f1az-binutils-2.23.1/lib");
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
.note.gnu.build-id : { *(.note.gnu.build-id) }
.hash : { *(.hash) }
.gnu.hash : { *(.gnu.hash) }
.dynsym : { *(.dynsym) }
.dynstr : { *(.dynstr) }
.gnu.version : { *(.gnu.version) }
.gnu.version_d : { *(.gnu.version_d) }
.gnu.version_r : { *(.gnu.version_r) }
(snip)
In case you missed it: __executable_start = SEGMENT_START("text-segment", 0x400000))
.
And for me, sure enough, when I link a simple .o file into a binary, the entry point address is very close to 0x400000.
The entry point address in the ELF metadata is this value, plus the offset from the beginning of the .text
section to the _start
symbol. Note also that the _start
symbol can be configured. Again from my default linker script example: ENTRY(_start)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With