I have generated code for a simple compiler I'm writing, and I was wondering how to place that code into an ELF file?
I've tried using libelf, but I can't seem to wrap my head around how to organize the tables.
I'm not using any data, so I assume all I need is a .text section.
If I had a buffer of generated x86 code, how would I create an ELF file with just a simple .text section that could be runnable?
You can't!
The functionality you are looking for is actually the job of a build tool called the linker. Apart from the unresolved symbol errors it throws from time to time, its presence often goes unnoticed, yet it is one of the most important components of any build chain.
Here are some ideas on how to proceed to get your binary to run anyway.
Any of the methods described below will only work if:
- The machine code doesn't contain any jumps to absolute addresses, as patching them to the right destinations would require relocation info.
- The program starts at the very beginning of the binary file used as input. This is easy to work around, either by adding an additional (relative) jump instruction to the right spot at the start of the file or by using an offset into the binary data.
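The relative-jump workaround can be sketched in a few lines of C. This is only an illustration under stated assumptions: the helper name prepend_jump and its parameters are hypothetical, and it uses the 5-byte x86 "jmp rel32" encoding (opcode 0xE9 followed by a little-endian 32-bit displacement).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new buffer holding "jmp rel32" followed by
 * the original code, so that execution starting at byte 0 continues at
 * code[entry_offset]. The caller frees the result. */
unsigned char *prepend_jump(const unsigned char *code, size_t code_size,
                            uint32_t entry_offset, size_t *out_size)
{
    const size_t jmp_len = 5;   /* opcode 0xE9 + 32-bit displacement */
    unsigned char *out = malloc(jmp_len + code_size);
    if (out == NULL)
        return NULL;

    /* The displacement is relative to the end of the jump instruction,
     * which is exactly where the original code now begins, so it equals
     * the entry offset within the original buffer. */
    out[0] = 0xE9;
    out[1] = (unsigned char)(entry_offset & 0xFF);
    out[2] = (unsigned char)((entry_offset >> 8) & 0xFF);
    out[3] = (unsigned char)((entry_offset >> 16) & 0xFF);
    out[4] = (unsigned char)((entry_offset >> 24) & 0xFF);

    memcpy(out + jmp_len, code, code_size);
    *out_size = jmp_len + code_size;
    return out;
}
```

Because the original code only uses relative jumps (the first condition above), shifting it by five bytes does not change its behaviour.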
For a very simple, self-contained binary that consists only of raw machine instructions, has no external dependencies and contains no (!) absolute jump instructions, it might be easier to use an existing linker than to build the file by hand.
Given a file consisting of the raw machine instructions (main.bin in the following example), the first step is to generate a relocatable object file (main.o in the example) from it:
objcopy -I binary -B i386 -O elf32-i386 --rename-section .data=.text main.bin main.o
Taking a look at the generated object's symbol table with readelf -s main.o:
Symbol table '.symtab' contains 5 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
     2: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _binary_main_bin_start
     3: 0000000c     0 NOTYPE  GLOBAL DEFAULT    1 _binary_main_bin_end
     4: 0000000c     0 NOTYPE  GLOBAL DEFAULT  ABS _binary_main_bin_size
You'll notice that the symbols _binary_..._start, _binary_..._end and _binary_..._size, corresponding to the start, end and size of the input file, were added.
These can be used to hand the entry point of the executable down to the linker:
ld --entry=_binary_main_bin_start main.o -o main
should produce the executable you are looking for. On a 64-bit host, ld may additionally need -m elf_i386 to accept the 32-bit object.
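To try the objcopy and ld steps, some raw input file is needed. The original input is not shown, so the following is only an assumed example: a hypothetical helper that writes a 12-byte i386 Linux program calling exit(42) via int 0x80. Its length happens to match the 0xc size in the symbol listing, but that is a coincidence of this sketch, not a claim about the original main.bin.

```c
#include <stdio.h>

/* Hypothetical helper: writes a tiny position-independent i386 Linux
 * program to `path`. Returns the number of bytes written, or -1 on error. */
long write_exit_program(const char *path)
{
    static const unsigned char code[] = {
        0xB8, 0x01, 0x00, 0x00, 0x00,   /* mov eax, 1   (sys_exit on i386) */
        0xBB, 0x2A, 0x00, 0x00, 0x00,   /* mov ebx, 42  (exit status)      */
        0xCD, 0x80                      /* int 0x80     (system call)      */
    };
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    size_t written = fwrite(code, 1, sizeof code, f);
    if (fclose(f) != 0 || written != sizeof code)
        return -1;
    return (long)written;
}
```

Calling write_exit_program("main.bin") produces a file that contains no absolute jumps and starts executing at its first byte, so it satisfies both conditions listed earlier.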
Alternatively, you might want to manually create an ELF file that contains just the information necessary to get a running executable.
If you're not too familiar with the ELF format, you might want to take a look at the specs (available at http://refspecs.linuxfoundation.org/). The manual page (man elf) is also very thorough, so it is a good source of information too.
To keep things as simple as possible, the goal will be to use only what is absolutely necessary.
Looking at the specs, you'll see that the only component required under all circumstances is the ELF header. A section header table is only required for files used in linking (relocatable objects); a program header table is only required for files used to build a process image (executables and shared objects).
As we want to create an executable, we'll only use the program header table, with a single entry of type PT_LOAD describing the whole memory layout of the executable.
To meet the alignment constraints, that segment will map the whole contents of the file, headers included, into the process image (source: man elf):
... Loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size.
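The congruence rule quoted from the manual page can be checked with a one-line predicate. This is a minimal sketch; the function name segment_is_congruent is hypothetical, and the values in the usage note (load address 0x8048000, page size 0x1000) are the ones used in the program further down.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if p_vaddr and p_offset leave the same remainder modulo the page
 * size, which is what the loader needs in order to map the file pages
 * directly into memory. `page_size` must be non-zero. */
bool segment_is_congruent(uint32_t p_vaddr, uint32_t p_offset,
                          uint32_t page_size)
{
    return (p_vaddr % page_size) == (p_offset % page_size);
}
```

With p_vaddr = 0x8048000 and p_offset = 0 the check passes, which is why mapping the file from its very first byte is the simplest choice. Mapping only the code, at file offset 0x54, to the same page-aligned address would fail the check.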
This being said, it should be clear why the final layout of the ELF file will look like this:
struct Binary {
    Elf32_Ehdr ehdr;
    Elf32_Phdr phdr;
    char code[];
};
Most fields of Elf32_Ehdr and Elf32_Phdr are fixed, so they can already be set in the initializer. The only fields that require later adjustment are the ones describing the sizes (.p_filesz and .p_memsz) of the loaded segment in the program header table entry.
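To make the fixed and the variable values concrete, here is a small sketch that derives them from the layout. The helper names entry_address and segment_size are hypothetical; LOAD_ADDRESS uses the same value as the program that follows, and the header sizes come from the Linux <elf.h> definitions.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define LOAD_ADDRESS 0x8048000u

struct Binary {
    Elf32_Ehdr ehdr;    /* 52 bytes */
    Elf32_Phdr phdr;    /* 32 bytes */
    char code[];        /* machine code follows at file offset 84 (0x54) */
};

/* Fixed: the entry point is the load address plus the size of both headers. */
uint32_t entry_address(void)
{
    return LOAD_ADDRESS + (uint32_t)offsetof(struct Binary, code);
}

/* Variable: p_filesz and p_memsz cover the headers plus the code,
 * so they are only known once the code size is. */
uint32_t segment_size(uint32_t code_size)
{
    return (uint32_t)sizeof(struct Binary) + code_size;
}
```

With these definitions entry_address() evaluates to 0x8048054, and a 12-byte code buffer gives a segment size of 96 bytes.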
Taking input from stdin and writing to stdout (thus used like ./a.out <main.bin >executable), this is how the described setup could be implemented:
#include <stdio.h>
#include <stddef.h>
#include <elf.h>
#include <string.h>
#include <stdlib.h>

#define BUFFER_SIZE 1024

char buffer[BUFFER_SIZE];

/* Read all of stdin into a heap buffer; returns NULL if the input is empty. */
void *read_all (int *filesize) {
    char *data = NULL;
    int offset = 0;
    size_t size;

    while ((size = fread (buffer, 1, sizeof (buffer), stdin)) > 0) {
        char *grown = realloc (data, offset + size);
        if (grown == NULL)
            exit (EXIT_FAILURE);
        data = grown;
        /* use char * here: arithmetic on void * is a GNU extension */
        memcpy (data + offset, buffer, size);
        offset += (int) size;
    }
    *filesize = offset;
    return data;
}

#define LOAD_ADDRESS 0x8048000

/* File layout: ELF header, one program header, then the raw machine code. */
struct Binary {
    Elf32_Ehdr ehdr;
    Elf32_Phdr phdr;
    char code[];
};

int main (int argc, char *argv[]) {
    void *code;
    int code_size;
    struct Binary binary = {
        /* ELF HEADER */
        .ehdr = {
            /* general */
            .e_ident = {
                ELFMAG0, ELFMAG1, ELFMAG2, ELFMAG3,
                ELFCLASS32,
                ELFDATA2LSB,
                EV_CURRENT,
                ELFOSABI_LINUX,
            },
            .e_type = ET_EXEC,
            .e_machine = EM_386,
            .e_version = EV_CURRENT,
            /* execution starts at the first byte of the code */
            .e_entry = LOAD_ADDRESS + (offsetof (struct Binary, code)),
            .e_phoff = offsetof (struct Binary, phdr),
            .e_shoff = 0,
            .e_flags = 0,
            .e_ehsize = sizeof (Elf32_Ehdr),
            /* program header */
            .e_phentsize = sizeof (Elf32_Phdr),
            .e_phnum = 1,
            /* section header (none present) */
            .e_shentsize = sizeof (Elf32_Shdr),
            .e_shnum = 0,
            .e_shstrndx = 0
        },
        /* PROGRAM HEADER */
        .phdr = {
            .p_type = PT_LOAD,
            .p_offset = 0,              /* map the file from its first byte */
            .p_vaddr = LOAD_ADDRESS,
            .p_paddr = LOAD_ADDRESS,
            .p_filesz = 0,              /* fixed up below */
            .p_memsz = 0,               /* fixed up below */
            .p_flags = PF_R | PF_X,
            .p_align = 0x1000
        }
    };

    if ((code = read_all (&code_size)) == NULL)
        return -1;

    /* fix program header: the segment covers both headers plus the code */
    binary.phdr.p_filesz = sizeof (struct Binary) + code_size;
    binary.phdr.p_memsz = sizeof (struct Binary) + code_size;

    /* write binary */
    fwrite (&binary, sizeof (struct Binary), 1, stdout);
    fwrite (code, 1, code_size, stdout);

    free (code);
    return 0;
}
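As a quick sanity check of a file produced this way, the headers can be read back and compared against the layout described above. The following is a minimal sketch, not a full validator: the function name looks_like_minimal_elf is hypothetical, it only inspects the first program header, and it assumes a little-endian host so that the header fields can be copied out directly.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if `image` looks like a 32-bit ELF executable whose entry point
 * lies inside the file-backed range of its first program header, else 0. */
int looks_like_minimal_elf(const unsigned char *image, size_t size)
{
    Elf32_Ehdr ehdr;
    Elf32_Phdr phdr;

    if (size < sizeof ehdr)
        return 0;
    memcpy(&ehdr, image, sizeof ehdr);

    /* magic bytes "\x7fELF", 32-bit class, executable type */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
        return 0;
    if (ehdr.e_ident[EI_CLASS] != ELFCLASS32 || ehdr.e_type != ET_EXEC)
        return 0;

    /* the first program header must fit inside the image */
    if (ehdr.e_phnum < 1 || ehdr.e_phoff > size
        || size - ehdr.e_phoff < sizeof phdr)
        return 0;
    memcpy(&phdr, image + ehdr.e_phoff, sizeof phdr);

    return phdr.p_type == PT_LOAD
        && ehdr.e_entry >= phdr.p_vaddr
        && ehdr.e_entry - phdr.p_vaddr < phdr.p_filesz;
}
```

Loading the generated file into memory and passing it to this function should return 1. Running readelf -h and readelf -l on the file gives the same information in more detail.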