Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Detouring and GCC inline assembly (Linux)

I'm programming extensions for a game which offers an API for (us) modders. This API offers a wide variety of things, but it has one limitation. The API is for the 'engine' only, which means that all modifications (mods) that has been released based on the engine, does not offer/have any sort of (mod specific) API. I have created a 'signature scanner' (note: my plugin is loaded as a shared library, compiled with -share & -fPIC) which finds the functions of interest (which is easy since I'm on linux). So to explain, I'll take a specific case: I have found the address to a function of interest, its function header is very simpleint * InstallRules(void);. It takes a nothing (void) and returns an integer pointer (to an object of my interest). Now, what I want to do, is to create a detour (and remember that I have the start address of the function), to my own function, which I would like to behave something like this:

void MyInstallRules(void)
{
    if(PreHook() == block) // <-- First a 'pre' hook which can block the function
        return;
    int * val = InstallRules(); // <-- Call original function
    PostHook(val); // <-- Call post hook, if interest of original functions return value
}

Now here's the deal; I have no experience what so ever about function hooking, and I only have a thin knowledge of inline assembly (AT&T only). The pre-made detour packages on the Internet is only for windows or is using a whole other method (i.e preloads a dll to override the orignal one). So basically; what should I do to get on track? Should I read about call conventions (cdecl in this case) and learn about inline assembly, or what to do? The best would probably be a already functional wrapper class for linux detouring. In the end, I would like something as simple as this:

void * addressToFunction = SigScanner.FindBySig("Signature_ASfs&43"); // I've already done this part
void * original = PatchFunc(addressToFunction, addressToNewFunction); // This replaces the original function with a hook to mine, but returns a pointer to the original function (relocated ofcourse)
// I might wait for my hook to be called or whatever
// ....

// And then unpatch the patched function (optional)
UnpatchFunc(addressToFunction, addressToNewFunction);

I understand that I won't be able to get a completely satisfying answer here, but I would more than appreciate some help with the directions to take, because I am on thin ice here... I have read about detouring but there is barely any documentation at all (specifically for linux), and I guess I want to implement what's known as a 'trampoline' but I can't seem to find a way how to acquire this knowledge.

NOTE: I'm also interested in _thiscall, but from what I've read that isn't so hard to call with GNU calling convention(?)

like image 708
Elliott Darfink Avatar asked May 02 '12 19:05

Elliott Darfink


1 Answers

Is this project to develop a "framework" that will allow others to hook different functions in different binaries? Or is it just that you need to hook this specific program that you have?

First, let's suppose you want the second thing, you just have a function in a binary that you want to hook, programmatically and reliably. The main problem with doing this universally is that doing this reliably is a very tough game, but if you are willing to make some compromises, then it's definitely doable. Also let's assume this is x86 thing.

If you want to hook a function, there are several options how to do it. What Detours does is inline patching. They have a nice overview of how it works in a Research PDF document. The basic idea is that you have a function, e.g.

00E32BCE  /$ 8BFF           MOV EDI,EDI
00E32BD0  |. 55             PUSH EBP
00E32BD1  |. 8BEC           MOV EBP,ESP
00E32BD3  |. 83EC 10        SUB ESP,10
00E32BD6  |. A1 9849E300    MOV EAX,DWORD PTR DS:[E34998]
...
...

Now you replace the beginning of the function with a CALL or JMP to your function and save the original bytes that you overwrote with the patch somewhere:

00E32BCE  /$ E9 XXXXXXXX    JMP MyHook
00E32BD3  |. 83EC 10        SUB ESP,10
00E32BD6  |. A1 9849E300    MOV EAX,DWORD PTR DS:[E34998]

(Note that I overwrote 5 bytes.) Now your function gets called with the same parameters and same calling convention as the original function. If your function wants to call the original one (but it doesn't have to), you create a "trampoline", that 1) runs the original instructions that were overwritten 2) jmps to the rest of the original function:

Trampoline:
    MOV EDI,EDI
    PUSH EBP
    MOV EBP,ESP
    JMP 00E32BD3

And that's it, you just need to construct the trampoline function in runtime by emitting processor instructions. The hard part of this process is to get it working reliably, for any function, for any calling convention and for different OS/platforms. One of the issues is that if the 5 bytes that you want to overwrite ends in a middle of an instruction. To detect "ends of instructions" you would basically need to include a disassembler, because there can be any instruction at the beginning of the function. Or when the function is itself shorter than 5 bytes (a function that always returns 0 can be written as XOR EAX,EAX; RETN which is just 3 bytes).

Most current compilers/assemblers produce a 5-byte long function prolog, exactly for this purpose, hooking. See that MOV EDI, EDI? If you wonder, "why the hell do they move edi to edi? that doesn't do anything!?" you are absolutely correct, but this is the purpose of the prolog, to be exactly 5-bytes long (not ending in a middle of an instruction). Note that the disassembly example is not something I made up, it's calc.exe on Windows Vista.

The rest of the hook implementation is just technical details, but they can bring you many hours of pain, because that's the hardest part. Also the behaviour you described in your question:

void MyInstallRules(void)
{
    if(PreHook() == block) // <-- First a 'pre' hook which can block the function
        return;
    int * val = InstallRules(); // <-- Call original function
    PostHook(val); // <-- Call post hook, if interest of original functions return value
}

seems worse than what I described (and what Detours does), for example you might want to "not call the original" but return some different value. Or call the original function twice. Instead, let your hook handler decide whether and where it will call the original function. Also then you don't need two handler functions for a hook.

If you don't have enough knowledge about the technologies you need for this (mostly assembly), or don't know how to do the hooking, I suggest you study what Detours does. Hook your own binary and take a debugger (OllyDbg for example) to see at assembly level what it exactly did, what instructions were placed and where. Also this tutorial might come in handy.

Anyway, if your task is to hook some functions in a specific program, then this is doable and if you have any trouble, just ask here again. Basically you can do a lot of assumptions (like the function prologs or used conventions) that will make your task much easier.

If you want to create some reliable hooking framework, then still is a completely different story and you should first begin by creating simple hooks for some simple apps.

Also note that this technique is not OS specific, it's the same on all x86 platforms, it will work on both Linux and Windows. What is OS specific is that you will probably have to change memory protection of the code ("unlock" it, so you can write to it), which is done with mprotect on Linux and with VirtualProtect on Windows. Also the calling conventions are different, that that's what you can solve by using the correct syntax in your compiler.

Another trouble is "DLL injection" (on Linux it will probably be called "shared library injection" but the term DLL injection is widely known). You need to put your code (that performs the hook) into the program. My suggestion is that if it's possible, just use LD_PRELOAD environment variable, in which you can specify a library that will be loaded into the program just before it's run. This has been described in SO many times, like here: What is the LD_PRELOAD trick?. If you must do this in runtime, I'm afraid you will need to get with gdb or ptrace, which in my opinion is quite hard (at least the ptrace thing) to do. However you can read for example this article on codeproject or this ptrace tutorial.

I also found some nice resources:

  • SourceHook project, but it seems it's only for virtual functions in C++, but you can always take a look at its source code
  • this forum thread giving a simple 10-line function to do this "inline hook" that I described
  • this a little more complex code in a forum
  • here on SO is some example

Also one other point: This "inline patching" is not the only way to do this. There are even simpler ways, e.g. if the function is virtual or if it's a library exported function, you can skip all the assembly/disassembly/JMP thing and simply replace the pointer to that function (either in the table of virtual functions or in the exported symbols table).

like image 140
kuba Avatar answered Sep 25 '22 02:09

kuba