
Object file to binary code

Tags: c, gcc, ld

Suppose I have a C file with no external dependencies and only a const data section. I would like to compile this file and get a binary blob that I can load into another program, where the function would be called through a function pointer.

As an example, here is a fictional binary module, f1.c:

static const unsigned char mylut[256] = {
    [0 ... 127] = 0,
    [128 ... 255] = 1,
};

void f1(unsigned char * src, unsigned char * dst, int len)
{
    while(len) {
        *dst++ = mylut[*src++];
        len--;
    }
}

I would like to compile it to f1.o, then f1.bin, and use it like this in prog.c

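typedef void (*f1_type_ptr)(unsigned char *src, unsigned char *dst, int len);

int somefunc(unsigned char *src, unsigned char *dst, int len)
{
    unsigned char *codedata;
    f1_type_ptr f1_ptr;

    /* open f1.bin, and read it into codedata */

    /* set function pointer to beginning of loaded data */
    f1_ptr = (f1_type_ptr)codedata;

    /* call! */
    f1_ptr(src, dst, len);
    return 0;
}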

I suppose going from f1.c to f1.o involves -fPIC to get position independence. What flags or linker script can I use to go from f1.o to f1.bin?

Clarification:

I know about dynamic linking, but it is not possible in this case. The "linking" step has to consist of casting a function pointer to the loaded data, if that is possible.

Please assume there is no OS support. If I could, I would, for example, write f1 in assembly with PC-relative addressing.

asked Aug 27 '12 08:08 by shodanex



2 Answers

First of all, as others have said, you should consider using a DLL or SO.

That said, if you really want to do this, you need to replace the linker script. Something like this (not very well tested, but I think it works):

ENTRY(_dummy_start)
SECTIONS
{
    _dummy_start = 0;
    _GLOBAL_OFFSET_TABLE_ = 0;
    .all : { 
        _all = .;
        LONG(f1 - _all);
        *( .text .text.* .data .data.* .rodata .rodata.* ) 
    }
}

Then compile with:

$ gcc -c -fPIC test.c

Link with:

$ ld -T script.ld test.o -o test.elf

And extract the binary blob with:

$ objcopy -j .all -O binary test.elf test.bin

Some explanation of the script is probably welcome:

  • ENTRY(_dummy_start) This just avoids the warning about the program not having an entry point.
  • _dummy_start = 0; This defines the symbol used in the previous line. The value is not used.
  • _GLOBAL_OFFSET_TABLE_ = 0; This prevents another linker error. I don't think you really need this symbol, so it can be defined as 0.
  • .all This is the name of the section that will collect all the bytes of your blob. In this sample it holds all the .text, .data and .rodata sections together. You may need more input sections if you have complicated functions; in that case objdump -x test.o is your friend.
  • LONG(f1 - _all) Not strictly needed, but you want to know the offset of your function within the blob, don't you? You cannot assume that it will be at offset 0. With this line, the very first 4 bytes of the blob hold the offset of the symbol f1 (your function). Replace LONG with QUAD if you want the offset stored as a 64-bit value (the loader must then read 8 bytes instead of 4).
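
To make the role of that 4-byte header concrete, here is a minimal sketch of how a loader might read and sanity-check it. The helper name blob_entry_offset is hypothetical (it is not part of the answer), and the sketch assumes the blob was built for the same byte order as the machine reading it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the offset of f1 stored in the first 4 bytes
 * of the blob (emitted by LONG(f1 - _all) in the linker script), or -1 if
 * the blob is too short or the offset cannot be valid. */
static long blob_entry_offset(const unsigned char *blob, size_t size)
{
    uint32_t offs;

    assert(blob != NULL);
    if (size < sizeof offs)
        return -1;

    /* memcpy avoids unaligned reads; assumes blob and host share byte order */
    memcpy(&offs, blob, sizeof offs);

    /* the function cannot start inside the header or past the end */
    if (offs < sizeof offs || offs >= size)
        return -1;
    return (long)offs;
}
```

The entry point is then blob + offset, cast to the function pointer type. Rejecting out-of-range offsets is cheap and catches a truncated or mismatched blob before anything is called.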

UPDATE: And now a quick'n'dirty test (it works!):

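#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

typedef void (*f1_t)(unsigned char *src, unsigned char *dst, int len);

int main(void)
{
    /* valloc (obsolete, but simple) returns a page-aligned buffer,
       which mprotect requires */
    unsigned char *blob = valloc(4096);
    FILE *f = fopen("test.bin", "rb");
    if (!blob || !f) {
        perror("test.bin");
        return 1;
    }
    fread(blob, 1, 4096, f);
    fclose(f);

    /* the first 4 bytes of the blob hold the offset of f1 */
    unsigned offs = *(unsigned *)blob;
    f1_t f1 = (f1_t)(blob + offs);
    if (mprotect(blob, 4096, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        perror("mprotect");
        return 1;
    }
    char txt[] = "¡hello world!";
    char txt2[sizeof(txt)] = "";
    f1((unsigned char *)txt, (unsigned char *)txt2, sizeof(txt) - 1);
    printf("%s\n%s\n", txt, txt2);
    return 0;
}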
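
As a variation on the test above, the blob could be loaded with mmap and the memory switched from writable to executable once the bytes are in place. This is only a sketch under assumptions: the helper name load_blob_rx is hypothetical, it targets a POSIX system, and it assumes the blob has no writable data (true for f1, which only reads a const table).

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: copies the whole file at `path` into a fresh
 * anonymous mapping, then changes it from read/write to read/execute.
 * Returns the mapping (size in *size_out), or NULL on failure. */
static void *load_blob_rx(const char *path, size_t *size_out)
{
    FILE *f;
    long size;
    void *mem;

    assert(path != NULL && size_out != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    if (fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) <= 0) {
        fclose(f);
        return NULL;
    }
    rewind(f);

    /* mmap returns page-aligned memory, as mprotect requires */
    mem = mmap(NULL, (size_t)size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        fclose(f);
        return NULL;
    }
    if (fread(mem, 1, (size_t)size, f) != (size_t)size ||
        mprotect(mem, (size_t)size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)size);
        fclose(f);
        return NULL;
    }
    fclose(f);
    *size_out = (size_t)size;
    return mem;
}
```

The caller would then read the offset from the first 4 bytes and cast mem plus that offset to f1_t, exactly as in the test program. Never having the pages writable and executable at the same time is a design choice, not something the original test requires.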
answered Sep 28 '22 00:09 by rodrigo


You should consider building a shared library (.dll on Windows, or .so on Linux).

Build the library like this:

gcc -c -fPIC test.c
gcc -shared test.o -o libtest.so

If you want to load the library dynamically from your code, have a look at the functions dlopen(3) and dlsym(3).
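
A minimal sketch of that dlopen/dlsym route, assuming libtest.so was built as above and exports f1 with the signature from the question. The helper name load_f1 is hypothetical; on older glibc versions the program also needs to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

typedef void (*f1_t)(unsigned char *src, unsigned char *dst, int len);

/* Hypothetical helper: opens the shared library at `lib_path` and looks up
 * the symbol "f1". Returns NULL on failure; on success the library handle
 * is stored in *handle_out so the caller can dlclose() it later. */
static f1_t load_f1(const char *lib_path, void **handle_out)
{
    void *handle, *sym;
    f1_t fn;

    assert(lib_path != NULL && handle_out != NULL);
    handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }
    sym = dlsym(handle, "f1");
    if (sym == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }
    *handle_out = handle;
    /* copy through void ** to avoid a direct object-to-function cast */
    *(void **)&fn = sym;
    return fn;
}
```

A caller would do something like f1_t f1 = load_f1("./libtest.so", &handle); and, if the result is not NULL, call f1(src, dst, len) and later dlclose(handle).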

Or, if you want to link against the library at build time, build the program with the following (-L. tells the linker to look for libtest.so in the current directory):

gcc -c main.c
gcc main.o -o <binary name> -L. -ltest

EDIT:

I'm really not sure about what follows, but it could give you a clue for your research...

If you don't want to use dlopen and dlsym, you can try to read the symbol table from the .o file in order to find the function's address, and then mmap the object file into memory with read and execute rights. You should then be able to execute the loaded code at the address you found. But be careful with any other dependencies this code may have.

You can check the elf(5) man page.
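
As a rough illustration of the symbol table lookup, here is a sketch that finds the file offset of a named symbol in a relocatable object already read into memory. The helper names are hypothetical, and it assumes a 64-bit ELF object, a 64-bit host, and the Linux <elf.h> header. It only locates the symbol: it does not apply relocations, and f1's reference to mylut in another section typically needs one, which is the kind of dependency the warning above is about.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* True when the byte range [off, off + len) fits in a buffer of `size` bytes. */
static int in_bounds(size_t off, size_t len, size_t size)
{
    return off <= size && len <= size - off;
}

/* Hypothetical helper: looks up `name` in the symbol table of a 64-bit
 * relocatable ELF image. On success stores the file offset of the symbol's
 * first byte in *file_off and returns 0; returns -1 otherwise. */
static int elf64_symbol_file_offset(const unsigned char *img, size_t size,
                                    const char *name, size_t *file_off)
{
    Elf64_Ehdr eh;
    Elf64_Shdr symtab, strtab, sec;
    Elf64_Sym sym;
    size_t name_len, i, pos;

    assert(img != NULL && name != NULL && file_off != NULL);
    name_len = strlen(name);
    if (size < sizeof eh)
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 || eh.e_type != ET_REL ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        !in_bounds(eh.e_shoff, (size_t)eh.e_shnum * sizeof(Elf64_Shdr), size))
        return -1;

    for (i = 0; i < eh.e_shnum; i++) {
        memcpy(&symtab, img + eh.e_shoff + i * sizeof symtab, sizeof symtab);
        if (symtab.sh_type != SHT_SYMTAB || symtab.sh_link >= eh.e_shnum ||
            !in_bounds(symtab.sh_offset, symtab.sh_size, size))
            continue;
        /* sh_link of a symbol table names its string table section */
        memcpy(&strtab, img + eh.e_shoff + symtab.sh_link * sizeof strtab,
               sizeof strtab);
        if (!in_bounds(strtab.sh_offset, strtab.sh_size, size))
            continue;

        for (pos = 0; pos + sizeof sym <= symtab.sh_size; pos += sizeof sym) {
            memcpy(&sym, img + symtab.sh_offset + pos, sizeof sym);
            /* skip undefined/special symbols and names that do not match */
            if (sym.st_shndx == SHN_UNDEF || sym.st_shndx >= eh.e_shnum ||
                sym.st_name >= strtab.sh_size ||
                name_len >= strtab.sh_size - sym.st_name ||
                memcmp(img + strtab.sh_offset + sym.st_name, name,
                       name_len + 1) != 0)
                continue;
            /* in a .o file, st_value is an offset within the section */
            memcpy(&sec, img + eh.e_shoff + sym.st_shndx * sizeof sec,
                   sizeof sec);
            if (sym.st_value >= sec.sh_size ||
                !in_bounds(sec.sh_offset, sec.sh_size, size))
                return -1;
            *file_off = sec.sh_offset + sym.st_value;
            return 0;
        }
    }
    return -1;
}
```

For the f1.o from the question, a call with the name "f1" would give the position of the function's machine code inside the file, which is the address to use relative to wherever the file is mapped.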

answered Sep 27 '22 23:09 by phsym