A valid pattern in assembly for variadic arguments

I think my question might seem a bit odd, but here it goes: I'm trying to generate code dynamically from C++ (mostly for the fun of it, but also for a practical reason), and it is not as hard as it might sound. To do this, you write machine code into memory at runtime, like this:

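byte * buffer = new byte[5];
*buffer = 0xE9; // Opcode for 'jmp rel32'
// 'destination' stands for the address to jump to; the operand is
// relative to the end of this 5-byte instruction
*(uint*)(buffer + 1) = (uint)destination - ((uint)buffer + 5);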

This is much easier than it might seem, because I target only one platform and compiler: GCC on 32-bit Linux (and only one calling convention, cdecl). I'm trying to create a dynamic assembly function that redirects calls from triggers, so that I can use class methods as callbacks, even with C API libraries (as long as they use cdecl). I only need this to support pointers and native types (char, int, short, etc.).

ANYTHING MyRedirect(ANY AMOUNT ARGUMENTS)
{
    return MyClassFunc('this', ANY AMOUNT ARGUMENTS);
}

The function above is the one I want to create in pure machine code (in memory, from C++). Since the function is very simple, its assembly is simple as well (depending on the arguments).

55                      push   %ebp
89 e5                   mov    %esp,%ebp
83 ec 04                sub    $0x4,%esp
8b 45 08                mov    0x8(%ebp),%eax
89 04 24                mov    %eax,(%esp)
e8 00 00 00 00          call   <address>
c9                      leave
c3                      ret  

So in my program, I have created an ASM pattern generator (since I don't know assembly especially well, I search for patterns). This function generates machine code (as bytes, for the exact case above, i.e. a function that redirects and returns) given the number of arguments the function needs. This is a snippet from my C++ code:

std::vector<byte> detourFunc;
detourFunc.reserve(10 + stackSize); // Base is 10 bytes + argument size

// This becomes 'push %ebp; mov %esp, %ebp'
detourFunc.push_back(0x55);     // push %ebp
detourFunc.push_back(0x89);     // mov
detourFunc.push_back(0xE5);     // %esp, %ebp

// Check for arguments
if(stackSize != 0)
{
    detourFunc.push_back(0x83);     // sub
    detourFunc.push_back(0xEC);     // %esp
    detourFunc.push_back(stackSize);    // stack size required

    // If there are arguments, we want to push them
    // in the opposite direction (cdecl convention)
    for(int i = (argumentCount - 1); i >= 0; i--)
    {
        // This is what I'm trying to implement
        // ...
    }

    // Check if we need to add 'this'
    if(m_callbackClassPtr)
    {

    }
}

// This is our call operator
detourFunc.push_back(0xE8);     // call

// Four nop placeholders; these will be replaced by the call's relative offset
detourFunc.push_back(0x90);     // nop
detourFunc.push_back(0x90);     // nop
detourFunc.push_back(0x90);     // nop
detourFunc.push_back(0x90);     // nop

if(stackSize == 0)
{
    // In case of no arguments, just 'pop'
    detourFunc.push_back(0x5D); // pop %ebp
}

else 
{
    // Use 'leave' if we have arguments
    detourFunc.push_back(0xC9); // leave    
}

// Return function
detourFunc.push_back(0xC3);     // ret
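
One way the elided argument loop and the 'this' slot could be filled in is sketched below. It is an illustration under stated assumptions, not the asker's actual generator. The name `buildRedirectStub` and its parameters are hypothetical. Every argument is assumed to occupy one 4-byte slot. `this` is assumed to be passed as the first stack argument, which is how GCC calls member functions on 32-bit Linux. The object address is modelled as a 32-bit integer so the sketch compiles on any host. The call displacement is left as zero bytes to be patched later.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical generator for the redirect stub (not the asker's own code).
// argCount: number of 4-byte arguments the stub receives.
// thisPtr:  object address as a 32-bit value, or 0 to forward without 'this'.
std::vector<std::uint8_t> buildRedirectStub(unsigned argCount, std::uint32_t thisPtr)
{
    using u8 = std::uint8_t;

    // 'sub imm8' and the disp8 addressing forms hold a signed byte, so the
    // largest offset used below (4 * argCount + 4) must stay within 0x7F.
    assert(argCount <= 30);

    const unsigned extra = thisPtr != 0 ? 1 : 0;      // one extra slot for 'this'
    const unsigned stackSize = (argCount + extra) * 4;

    std::vector<u8> code;
    auto emit = [&code](std::initializer_list<u8> bytes) {
        code.insert(code.end(), bytes);
    };

    emit({0x55, 0x89, 0xE5});                         // push %ebp; mov %esp,%ebp

    if (stackSize != 0) {
        emit({0x83, 0xEC, static_cast<u8>(stackSize)});   // sub $stackSize,%esp

        // Copy incoming arguments, last one first. Argument i (1-based) sits at
        // 4 + 4*i from %ebp and moves up by one slot when 'this' is inserted.
        for (unsigned i = argCount; i > 0; --i) {
            const u8 src = static_cast<u8>(4 + 4 * i);
            const u8 dst = static_cast<u8>(4 * (i - 1 + extra));
            emit({0x8B, 0x45, src});                  // mov src(%ebp),%eax
            if (dst == 0)
                emit({0x89, 0x04, 0x24});             // mov %eax,(%esp)
            else
                emit({0x89, 0x44, 0x24, dst});        // mov %eax,dst(%esp)
        }

        if (extra) {
            emit({0xC7, 0x04, 0x24,                   // movl $thisPtr,(%esp)
                  static_cast<u8>(thisPtr),
                  static_cast<u8>(thisPtr >> 8),
                  static_cast<u8>(thisPtr >> 16),
                  static_cast<u8>(thisPtr >> 24)});
        }
    }

    emit({0xE8, 0x00, 0x00, 0x00, 0x00});             // call rel32 (to be patched)
    emit({static_cast<u8>(stackSize != 0 ? 0xC9 : 0x5D)}); // leave, or pop %ebp
    emit({0xC3});                                     // ret
    return code;
}
```

With one argument and no 'this', the bytes match the objdump listing near the top of the question. The `argCount <= 30` limit comes from using only the one-byte immediate and displacement encodings.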

If I specify zero as the stackSize, this will be the output:

55                      push   %ebp
89 e5                   mov    %esp,%ebp
e8 90 90 90 90          call   <address>
5d                      pop    %ebp
c3                      ret   
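
The four 0x90 placeholder bytes after 0xE8 have to be overwritten before the stub can run. `call rel32` takes a displacement measured from the address of the instruction that follows it, so the value depends on where the bytes finally live. A minimal sketch of that patching step follows. `patchCallRel32` and its parameters are hypothetical names, and addresses are modelled as 32-bit integers so the example does not depend on the host pointer size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: overwrite the 4 placeholder bytes of the 'call rel32'
// whose 0xE8 opcode sits at code[callOffset]. 'base' is the 32-bit address the
// first byte of 'code' will have once it is copied to executable memory.
void patchCallRel32(std::vector<std::uint8_t>& code, std::size_t callOffset,
                    std::uint32_t base, std::uint32_t target)
{
    assert(callOffset + 5 <= code.size() && code[callOffset] == 0xE8);

    // Address of the instruction following the 5-byte call.
    const std::uint32_t next = base + static_cast<std::uint32_t>(callOffset) + 5;

    // Unsigned wrap-around yields the correct two's-complement displacement
    // for targets that lie before the call as well as after it.
    const std::uint32_t rel = target - next;

    // Store the displacement in little-endian byte order.
    for (int i = 0; i < 4; ++i)
        code[callOffset + 1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
}
```

For the zero-argument stub above, the call opcode is at offset 3, so `patchCallRel32(code, 3, base, target)` would fill in bytes 4 through 7.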

As you can see, this is completely valid 32-bit assembly, and it will act as 'MyRedirect' would if it had zero arguments and no need for a 'this' pointer. The problem is that I want to implement the part that generates the code depending on the number of arguments the 'redirect' function will receive. I have successfully worked out the pattern in a little C++ program of mine:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[])
{
    // Default to zero arguments when none is given on the command line
    int val = (argc > 1) ? atoi(argv[1]) : 0;

    printf("\tpush %%ebp\n");
    printf("\tmov %%esp,%%ebp\n");

    if(val == 0)
    {
        printf("\tcall <address>\n");
        printf("\tpop %%ebp\n");
    }

    else
    {
        printf("\tsub $0x%zx,%%esp\n", val * sizeof(int));

        for(int i = val; i > 0; i--)
        {
            printf("\tmov 0x%zx(%%ebp),%%eax\n", i * sizeof(int) + sizeof(int));
            printf("\tmov %%eax,0x%zx(%%esp)\n", i * sizeof(int) - sizeof(int));
        }
        }

        printf("\tcall <address>\n");
        printf("\tleave\n");
    }

    printf("\tret\n");
    return 0;
}

This program prints exactly the same pattern as the ASM code shown by 'objdump'. So my question is: will this be valid in all cases if I only want a redirect function like the one above, no matter the arguments, as long as it only runs under 32-bit Linux, or are there any pitfalls I need to know about? For example, would the generated ASM be different with 'short' or 'char' arguments, or will this work as is (I've only tested with integers)? And what if I call a function that returns 'void'; how would that affect the ASM?
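
On the 'short' and 'char' part of the question, one relevant fact is that under the i386 System V cdecl convention every stack argument occupies a whole number of 4-byte slots. A `char` or `short` therefore takes a full slot and is copied by the same dword `mov` pair, while 8-byte types such as `long long` or `double` take two slots. The sketch below computes the stack size under that rule. `stackBytesFor` is a hypothetical helper name, and the rounding rule is the only assumption it encodes.

```cpp
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: bytes of stack needed to forward arguments whose
// sizeof values are listed in 'sizes', rounding each up to a 4-byte slot
// (assumes the i386 cdecl layout described above).
std::size_t stackBytesFor(std::initializer_list<std::size_t> sizes)
{
    std::size_t total = 0;
    for (std::size_t s : sizes)
        total += (s + 3) & ~static_cast<std::size_t>(3);  // round up to a multiple of 4
    return total;
}
```

For example, `stackBytesFor({sizeof(char), sizeof(short), sizeof(int)})` gives 12, the same as three `int` arguments. A `void`, integer, or pointer return needs no extra code in the stub, because nothing between the `call` and the `ret` touches %eax.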

I might have explained everything a bit fuzzily, so please ask if anything is unclear :)

NOTE: I do not want to know about alternatives. I enjoy my current implementation and think it's a very interesting one; I would just highly appreciate your help on the subject.

EDIT: In case of interest, here are some dumps for the above C++ code: link

Elliott Darfink asked May 05 '12 21:05


1 Answer

As Dan suggests, you need to mark the memory as executable. I wrote some code you can use. (It works on GNU/Linux and Windows.) If you never intend to support ARM, x86-64, or other platforms, then I don't see any pitfalls in your code (with the executable part added), and it seems that it should "always work" (assuming everything else is working properly, of course).

#include <sys/mman.h>

...

n = <size of code buffer>;
p = mmap(0, n, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANON|MAP_PRIVATE, -1, 0);
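
A slightly fuller sketch of the same idea follows, with error checking added. It swaps in a different technique from the one-line call above: the mapping is created writable first and only made executable afterwards with `mprotect`, so it is never writable and executable at once. The helper names `allocateCodeBuffer` and `sealExecutable` are hypothetical. Only `mmap`, `mprotect`, and `munmap` are real calls.

```cpp
#include <sys/mman.h>

#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy generated bytes into a fresh anonymous mapping
// that is readable and writable. Returns nullptr on failure. The caller
// releases the mapping with munmap(ptr, code.size()).
void* allocateCodeBuffer(const std::vector<std::uint8_t>& code)
{
    if (code.empty())
        return nullptr;

    void* p = mmap(nullptr, code.size(), PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    if (p == MAP_FAILED)
        return nullptr;

    std::memcpy(p, code.data(), code.size());
    return p;
}

// Hypothetical helper: drop write permission and allow execution once all
// bytes, including any address-dependent ones, are in place.
bool sealExecutable(void* p, std::size_t size)
{
    return mprotect(p, size, PROT_READ | PROT_EXEC) == 0;
}
```

Any bytes that depend on the final address, such as the call displacement, can be written between the two calls, while the mapping is still writable.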

'fish' suggested you use asmjit. I have to agree with that; it's more portable than your method. However, you said you are not interested in alternatives.

You may be interested in a technique called "thunking" (of a sort). It essentially tries to accomplish the same thing: replacing a C callback with a C++ method. This is actually pretty useful, but it is not really a good design for your applications.

Hope that helps.

NotKyon answered Sep 27 '22 18:09
