C: Overwrite another function byte by byte

Tags: c, function, byte
Let's suppose I have a function:

int f1(int x){
 // some more or less complicated operations on x
 return x;
}

And that I have another function

int f2(int x){
 // we simply return x
 return x;
}

I would like to be able to do something like the following:

char* _f1 = (char*)f1;
char* _f2 = (char*)f2;
int i;
for (i=0; i<FUN_LENGTH; ++i){
 _f1[i] = _f2[i];
}

I.e. I would like to interpret f1 and f2 as raw byte arrays and "overwrite f1 byte by byte" and thus, replace it by f2.

I know that usually callable code is write-protected, however, in my particular situation, you can simply overwrite the memory location where f1 is located. That is, I can copy the bytes over onto f1, but afterwards, if I call f1, the whole thing crashes.

So, is my approach possible in principle? Or are there some machine/implementation/whatsoever-dependent issues I have to take into consideration?

phimuemue asked Jan 25 '12 21:01


3 Answers

It would be easier to replace the first few bytes of f1 with a machine jump instruction to the beginning of f2. That way, you won't have to deal with any possible code relocation issues.

Also, the information about how many bytes a function occupies (FUN_LENGTH in your question) is normally not available at runtime. Using a jump would avoid that problem too.

For x86, the relative jump opcode you need is E9. It takes a 32-bit displacement, which means you need to calculate the relative offset between f2 and f1 (and the two functions must be close enough that the offset fits in 32 bits). This code might do it:

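/* needs <stdint.h>; assumes f1 and f2 are within 2 GiB of each other */
int32_t offset = (int32_t)((intptr_t)f2 - ((intptr_t)f1 + 5)); // 5 bytes for size of instruction
unsigned char *pf1 = (unsigned char *)f1; // unsigned, so 0xe9 is stored as-is
pf1[0] = 0xe9;
pf1[1] = offset & 0xff;
pf1[2] = (offset >> 8) & 0xff;
pf1[3] = (offset >> 16) & 0xff;
pf1[4] = (offset >> 24) & 0xff;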

The offset is measured from the end of the JMP instruction, which is why 5 is added to the address of f1 in the offset calculation.
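To check the arithmetic without touching live code, here is a minimal sketch that only builds the five bytes into a buffer. The helper name encode_jmp_rel32 is made up for illustration, and it assumes addresses small enough to be represented as int64_t (true for ordinary user-space addresses).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode a 5-byte x86 "JMP rel32" (opcode E9) that,
 * when placed at address `from`, transfers control to address `to`.
 * Returns 0 on success, -1 if the distance does not fit in 32 bits. */
static int encode_jmp_rel32(unsigned char out[5], uintptr_t from, uintptr_t to)
{
    assert(out != NULL);

    /* The displacement is measured from the end of the 5-byte instruction. */
    int64_t disp = (int64_t)to - ((int64_t)from + 5);
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;
    out[1] = (unsigned char)(d & 0xFF);          /* little-endian: low byte first */
    out[2] = (unsigned char)((d >> 8) & 0xFF);
    out[3] = (unsigned char)((d >> 16) & 0xFF);
    out[4] = (unsigned char)((d >> 24) & 0xFF);
    return 0;
}
```

For example, a jump placed at 0x1000 that targets 0x2000 has displacement 0x2000 - 0x1005 = 0xFFB, so the bytes are E9 FB 0F 00 00. A backward jump yields a negative displacement stored in two's complement.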

It's a good idea to step through the result with an assembly level debugger to make sure you're poking the correct bytes. Of course, this is all not standards compliant so if it breaks you get to keep both pieces.
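If no debugger is at hand, a rough alternative is to print the bytes before and after patching and compare them by eye. The sketch below is only an illustration; format_hex is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the first n bytes at p into out as
 * space-separated lowercase hex pairs, e.g. "e9 fb 0f 00 00".
 * Returns the number of characters written (excluding the terminator),
 * or -1 if out is too small. */
static int format_hex(const unsigned char *p, size_t n, char *out, size_t outsz)
{
    assert(p != NULL && out != NULL);
    if (outsz == 0)
        return -1;

    size_t pos = 0;
    out[0] = '\0';
    for (size_t i = 0; i < n; ++i) {
        /* Two hex digits per byte, with a separating space after the first. */
        int w = snprintf(out + pos, outsz - pos, "%s%02x",
                         i ? " " : "", (unsigned)p[i]);
        if (w < 0 || (size_t)w >= outsz - pos)
            return -1;  /* output would have been truncated */
        pos += (size_t)w;
    }
    return (int)pos;
}
```

To inspect a function you would pass something like (const unsigned char *)f1. Converting a function pointer to an object pointer is not standard C, but it works on the common x86 platforms this answer is about.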

Greg Hewgill answered Oct 15 '22 05:10


Your approach is undefined behavior according to the C standard.

And on many operating systems (e.g. Linux), your example will crash: the function code is inside the read only .text segment (and section) of the ELF executable, and that segment is (sort-of) mmap-ed read-only by execve (or by dlopen or by the dynamic linker), so you cannot write inside it.
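On POSIX systems the usual way around the read-only mapping is mprotect, which changes the protection of whole pages. The following is a rough sketch under that assumption; set_protection and page_floor are hypothetical helper names, and a hardened system may still refuse a writable and executable mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of its page; page must be a power of two. */
static uintptr_t page_floor(uintptr_t addr, uintptr_t page)
{
    return addr & ~(page - 1);
}

/* Hypothetical helper: apply prot to every page overlapping [addr, addr+len).
 * Returns 0 on success, -1 on failure (errno is set by sysconf/mprotect). */
static int set_protection(void *addr, size_t len, int prot)
{
    assert(len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    uintptr_t start = page_floor((uintptr_t)addr, (uintptr_t)page);
    uintptr_t end = (uintptr_t)addr + len;

    /* mprotect needs a page-aligned start; the length is rounded up to whole pages. */
    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

Before patching you might call set_protection((void *)f1, 5, PROT_READ | PROT_WRITE | PROT_EXEC) and check the return value. This only removes the write protection; it does not make the overwrite itself any more portable.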

Basile Starynkevitch answered Oct 15 '22 07:10


Instead of trying to overwrite the function (which you've already found is fragile at best), I'd consider using a pointer to a function:

int complex_implementation(int x) { 
    // do complex stuff with x
    return x;
}

int simple_implementation(int x) {
    return x;
}

int (*f1)(int) = complex_implementation;

You'd use this something like:

for (int i=0; i<limit; i++) {
    a = f1(a);
    if (whatever_condition)
        f1 = simple_implementation;
}

...and after the assignment, calling f1 would just return the input value.
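Put together as something you can compile, it might look like the sketch below. The body of complex_implementation and the run_loop driver (with its switch_at parameter) are placeholders invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for the "complex" version; the arithmetic is an assumption. */
static int complex_implementation(int x) { return x * 2 + 1; }

static int simple_implementation(int x) { return x; }

/* Callers go through this pointer instead of naming a function directly. */
static int (*f1)(int) = complex_implementation;

/* Hypothetical driver: apply f1 `limit` times, switching to the simple
 * version once iteration `switch_at` has completed. */
static int run_loop(int a, int limit, int switch_at)
{
    f1 = complex_implementation;  /* start from the complex version each run */
    for (int i = 0; i < limit; i++) {
        assert(f1 != NULL);
        a = f1(a);
        if (i == switch_at)
            f1 = simple_implementation;
    }
    return a;
}
```

With these placeholder definitions, run_loop(1, 5, 2) goes 3, 7, 15 and then stays at 15, because the last two iterations call simple_implementation.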

Calling a function via a pointer does impose some overhead, but (thanks to that being common in OO languages) most compilers and CPUs do a pretty good job of minimizing that overhead.

Jerry Coffin answered Oct 15 '22 05:10