
How to use libelf to generate an ELF file for my own compiler?

I have generated code for a simpler compiler I'm writing, and I was wondering how to place that code into an ELF file?

I've tried using libelf, but I can't seem to wrap my head around how to organize the tables.

I'm not using any data, so I assume all I need is a .text section.

If I had a buffer of generated x86 code, how would I create an ELF file with just a simple .text section that could be runnable?

asked Nov 10 '12 06:11 by user1814062



1 Answer

Short Answer

You can't!

The functionality you are looking for is actually part of a build tool called the "linker". Apart from the unresolved-symbol errors it throws from time to time, its presence often goes unnoticed, yet it is one of the most important components of any build chain.

Long Answer

Here are some ideas on how to get your binary to run.

Restrictions

Any of the methods described below will only work if

  • The machine code doesn't contain any jumps to absolute addresses, as patching them to the right destinations would require relocation info.

  • The program starts at the very beginning of the binary file used as input.

    This should be easy to circumvent, either by adding an additional (relative) jump instruction to the "right" spot at the start of the file or by using an offset into the binary data.
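The relative jump mentioned in the second restriction could be emitted roughly as follows. This is only a sketch for 32-bit x86: emit_entry_jump and JMP_REL32_SIZE are hypothetical names, and it assumes the five-byte jump is prepended to the code, so that the real entry point sits entry_offset bytes into the original (unshifted) code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_SIZE 5u   /* opcode 0xE9 followed by a 32-bit displacement */

/* Hypothetical helper: writes "jmp rel32" into out[0..4].
 * The displacement is counted from the end of the jump instruction.
 * With the jump at file offset 0, the next instruction is at offset 5 and
 * the target at offset 5 + entry_offset, so the displacement is simply
 * entry_offset. Returns the number of bytes written. */
size_t emit_entry_jump (unsigned char *out, uint32_t entry_offset) {
  uint32_t disp = entry_offset;

  assert (out != NULL);
  out[0] = 0xE9;                    /* jmp rel32 */
  out[1] = (unsigned char) (disp & 0xff);          /* little-endian displacement */
  out[2] = (unsigned char) ((disp >> 8) & 0xff);
  out[3] = (unsigned char) ((disp >> 16) & 0xff);
  out[4] = (unsigned char) ((disp >> 24) & 0xff);
  return JMP_REL32_SIZE;
}
```

Because the jump is relative, it keeps working no matter where the code ends up in memory, which is why it does not violate the first restriction.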

Suggested Workaround

For a very simple, self-contained binary that consists of nothing but raw machine instructions, has no external dependencies and contains no (!) absolute jump instructions, it might be easier to use an already existing linker than to build the file by hand.

Given a file consisting of the raw machine instructions (main.bin in the following example), the first step is to generate a relocatable object file (main.o in the example) from it:

objcopy -I binary -B i386 -O elf32-i386 --rename-section .data=.text main.bin main.o

Taking a look at the generated object's symbol table with readelf -s main.o:

Symbol table '.symtab' contains 5 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 SECTION LOCAL  DEFAULT    1 
     2: 00000000     0 NOTYPE  GLOBAL DEFAULT    1 _binary_main_bin_start
     3: 0000000c     0 NOTYPE  GLOBAL DEFAULT    1 _binary_main_bin_end
     4: 0000000c     0 NOTYPE  GLOBAL DEFAULT  ABS _binary_main_bin_size

You'll notice that the symbols _binary_..._start, _binary_..._end and _binary_..._size, corresponding to the start, end and size of the input file, were added. These can be used to pass the executable's entry point to the linker.

ld --entry=_binary_main_bin_start main.o -o main

should produce the executable you are looking for.

Manual Generation

Alternatively, you might want to manually create an ELF file containing just the information necessary to get a running executable.

If you're not too familiar with the ELF format, you might want to take a look at the specs (available on: http://refspecs.linuxfoundation.org/). The manual page (man elf) is also very exhaustive, so it might be a good source of information too.

To keep things as simple as possible, the goal is to use only what's absolutely necessary.

Taking a look into the specs, you'll see that the only component required under all circumstances is the ELF header. A section header table is only required for files that take part in linking (relocatable object files), a program header table only for files that are loaded to build a process image, such as executables.

As we want to create an executable we'll only use the program header table with one single entry of type PT_LOAD describing the whole memory layout of the executable.

To meet the alignment constraints, the process image will contain the whole contents of the file, headers included (source: man elf).

... Loadable process segments must have congruent values for p_vaddr and p_offset, modulo the page size.

This being said it should be clear why the final layout of the elf file will look like this:

struct Binary {
  Elf32_Ehdr ehdr;
  Elf32_Phdr phdr;
  char code[];
};
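As a small worked check of this layout, the sketch below computes where the code lands in memory and tests the congruence rule quoted above. It assumes a load address of 0x8048000 (the value used by the program further down) and a page size of 0x1000; is_congruent and entry_address are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define LOAD_ADDRESS 0x8048000u   /* assumed load address */

struct Binary {
  Elf32_Ehdr ehdr;
  Elf32_Phdr phdr;
  char code[];
};

/* Hypothetical helper: nonzero if vaddr and offset are congruent modulo
 * page_size, as required for loadable segments. */
int is_congruent (uint32_t vaddr, uint32_t offset, uint32_t page_size) {
  assert (page_size != 0);
  return vaddr % page_size == offset % page_size;
}

/* Hypothetical helper: virtual address of the first code byte when the
 * whole file (headers included) is mapped at LOAD_ADDRESS. */
uint32_t entry_address (void) {
  return LOAD_ADDRESS + (uint32_t) offsetof (struct Binary, code);
}
```

With the 52-byte Elf32_Ehdr and the 32-byte Elf32_Phdr, the code starts at file offset 84 (0x54), so the entry point is 0x8048054. Mapping the file from offset 0 at 0x8048000 satisfies the rule, whereas mapping only the code (offset 84) at that page-aligned address would not.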

Most fields of Elf32_Ehdr and Elf32_Phdr are fixed, so they can already be set in the initializer. The only fields that require later adjustment are those describing the sizes (.p_filesz and .p_memsz) of the loaded segment in the program header table entry.

Taking input from stdin and writing to stdout (thus used like ./a.out <main.bin >executable) this is the way the described setup could be implemented:

#include <stdio.h>
#include <stddef.h>
#include <elf.h>
#include <string.h>
#include <stdlib.h>

#define BUFFER_SIZE 1024
char buffer[BUFFER_SIZE];

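/* Read all of stdin into a malloc'd buffer; the byte count is stored in
   *filesize. Returns NULL if the input is empty. */
void *read_all (int *filesize) {
  char *data = NULL;   /* char *, as arithmetic on void * is a GNU extension */
  int offset = 0;
  size_t size;

  while ((size = fread (buffer, 1, sizeof (buffer), stdin)) > 0) {
    char *grown = realloc (data, offset + size);
    if (grown == NULL)
      exit (EXIT_FAILURE);
    data = grown;
    memcpy (data + offset, buffer, size);
    offset += size;
  }
  *filesize = offset;
  return data;
}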


#define LOAD_ADDRESS 0x8048000

struct Binary {
  Elf32_Ehdr ehdr;
  Elf32_Phdr phdr;
  char code[];
};

int main (int argc, char *argv[]) {

  void *code;
  int code_size;

  struct Binary binary = {
    /* ELF HEADER */
    .ehdr = {
      /* general */
      .e_ident   = {
        ELFMAG0, ELFMAG1, ELFMAG2, ELFMAG3,
        ELFCLASS32, 
        ELFDATA2LSB,
        EV_CURRENT,
        ELFOSABI_LINUX,
      },
      .e_type    = ET_EXEC,
      .e_machine = EM_386,
      .e_version = EV_CURRENT,
      .e_entry   = LOAD_ADDRESS + (offsetof (struct Binary, code)),
      .e_phoff   = offsetof (struct Binary, phdr),
      .e_shoff   = 0,
      .e_flags   = 0,
      .e_ehsize   = sizeof (Elf32_Ehdr),
      /* program header */
      .e_phentsize = sizeof (Elf32_Phdr),
      .e_phnum     = 1,
      /* section header */
      .e_shentsize = sizeof (Elf32_Shdr),
      .e_shnum     = 0,
      .e_shstrndx  = 0
    },

    /* PROGRAM HEADER */
    .phdr = {
      .p_type   = PT_LOAD,
      .p_offset = 0,
      .p_vaddr = LOAD_ADDRESS,
      .p_paddr = LOAD_ADDRESS,
      .p_filesz = 0,
      .p_memsz = 0,
      .p_flags = PF_R | PF_X,
      .p_align = 0x1000
    }
  };

  if ((code = read_all (&code_size)) == NULL)
    return -1;

  /* fix program header */
  binary.phdr.p_filesz = sizeof (struct Binary) + code_size;
  binary.phdr.p_memsz = sizeof (struct Binary) + code_size;

  /* write binary */
  fwrite (&binary, sizeof (struct Binary), 1, stdout);
  fwrite (code, 1, code_size, stdout);

  free (code);

  return 0;
}
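Since the question starts from a buffer of generated code rather than from a file, the same layout could also be produced in memory. The following is only a sketch under the assumptions of the program above (32-bit x86, one PT_LOAD segment, load address 0x8048000) plus one more: it copies the header structs as they are laid out on the host, so it assumes a little-endian host. build_elf32_image is a hypothetical name.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LOAD_ADDRESS 0x8048000u   /* same load address as in the program above */

/* Hypothetical helper: returns a malloc'd ELF image holding both headers
 * followed by the code, or NULL if allocation fails. The total size is
 * stored in *image_size; the caller frees the image. */
unsigned char *build_elf32_image (const void *code, size_t code_size,
                                  size_t *image_size) {
  const size_t headers = sizeof (Elf32_Ehdr) + sizeof (Elf32_Phdr);
  Elf32_Ehdr ehdr;
  Elf32_Phdr phdr;
  unsigned char *image;

  assert (code != NULL && image_size != NULL);
  if ((image = malloc (headers + code_size)) == NULL)
    return NULL;

  /* ELF header: all fields not set explicitly stay zero */
  memset (&ehdr, 0, sizeof ehdr);
  memcpy (ehdr.e_ident, ELFMAG, SELFMAG);
  ehdr.e_ident[EI_CLASS] = ELFCLASS32;
  ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
  ehdr.e_ident[EI_VERSION] = EV_CURRENT;
  ehdr.e_ident[EI_OSABI] = ELFOSABI_LINUX;
  ehdr.e_type = ET_EXEC;
  ehdr.e_machine = EM_386;
  ehdr.e_version = EV_CURRENT;
  ehdr.e_entry = (Elf32_Addr) (LOAD_ADDRESS + headers);  /* first code byte */
  ehdr.e_phoff = sizeof (Elf32_Ehdr);   /* program header follows directly */
  ehdr.e_ehsize = sizeof (Elf32_Ehdr);
  ehdr.e_phentsize = sizeof (Elf32_Phdr);
  ehdr.e_phnum = 1;
  ehdr.e_shentsize = sizeof (Elf32_Shdr);

  /* single loadable segment covering the whole file, headers included */
  memset (&phdr, 0, sizeof phdr);
  phdr.p_type = PT_LOAD;
  phdr.p_offset = 0;
  phdr.p_vaddr = LOAD_ADDRESS;
  phdr.p_paddr = LOAD_ADDRESS;
  phdr.p_filesz = (Elf32_Word) (headers + code_size);
  phdr.p_memsz = (Elf32_Word) (headers + code_size);
  phdr.p_flags = PF_R | PF_X;
  phdr.p_align = 0x1000;

  memcpy (image, &ehdr, sizeof ehdr);
  memcpy (image + sizeof ehdr, &phdr, sizeof phdr);
  memcpy (image + headers, code, code_size);
  *image_size = headers + code_size;
  return image;
}
```

A compiler back end would call this with its code buffer, write the returned bytes to a file with fwrite, and mark that file executable. The restrictions listed earlier still apply to the code passed in.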
answered Sep 28 '22 10:09 by mikyra