Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Reverse engineer optimized c code from assembly

The point of this problem is to reverse engineer c code that was made after running the compiler with level 2 optimization. The original c code is as follows (computes the greatest common divisor):

int gcd(int a, int b){
    int returnValue = 0;
    if (a != 0 &&  b != 0){
        int r;
        int flag = 0;
        while (flag == 0){
            r = a % b;
            if (r ==0){
                flag = 1;
            } else {
                a = b;
                b = r;
            }
        }
        returnValue = b;
    }
    return(returnValue);
}

when I ran the optimized compile I ran this from the command line:

gcc -O2 -S Problem04b.c

to get the assembly file for this optimized code

.gcd:
    .LFB12:
        .cfi_startproc
        testl   %esi, %esi
        je  .L2
        testl   %edi, %edi
        je  .L2
    .L7:
        movl    %edi, %edx
        movl    %edi, %eax
        movl    %esi, %edi
        sarl    $31, %edx
        idivl   %esi
        testl   %edx, %edx
        jne .L9
        movl    %esi, %eax
        ret
        .p2align 4,,10
        .p2align 3
    .L2:
        xorl    %esi, %esi
        movl    %esi, %eax
        ret
        .p2align 4,,10
        .p2align 3
    .L9:
        movl    %edx, %esi
        jmp .L7
        .cfi_endproc

I need to convert this assembly code back to c code here is where I am at right now:

int gcd(int a int b){
    /*
       testl %esi %esi
       sets zero flag if a is 0 (ZF) but doesn't store anything
       */
    if (a == 0){
        /*
           xorl %esi %esi
           sets the value of a variable to 0. More compact than movl
           */
        int returnValue = 0;
        /*
           movl %esi %eax
           ret

           return the value just assigned
           */
        return(returnValue);
    }
    /*
       testl %edi %edi
       sets zero flag if b is 0 (ZF) but doesn't store anything
       */
    if (b == 0){
        /*
           xorl %esi %esi
           sets the value of a variable to 0. More compact than movl
           */
        int returnValue = 0;
        /*
           movl %esi %eax
           ret

           return the value just assigned
           */
        return(returnValue);
    }

    do{
        int r = b;
        int returnValue = b;

    }while();


}

Can anyone help me write this back in to c code? I'm pretty much lost.

like image 718
scottybobby Avatar asked Feb 19 '15 16:02

scottybobby


1 Answers

First of all, you have the values mixed in your code. %esi begins with the value b and %edi begins with the value a.

You can infer from the testl %edx, %edx line that %edx is used as the condition variable for the loop beginning with .L7 (if %edx is different from 0 then control is transferred to the .L9 block and then returned to .L7). We'll refer to %edx as remainder in our reverse-engineered code.


Let's begin reverse-engineering the main loop:

movl    %edi, %edx

Since %edi stores a, this is equivalent to initializing the value of remainder with a: int remainder = a;.

movl    %edi, %eax

Store int temp = a;

movl    %esi, %edi

Perform int a = b; (remember that %edi is a and %esi is b).

sarl $31, %edx

This arithmetic shift instruction shifts our remainder variable 31 bits to the right whilst maintaining the sign of the number. By shifting 31 bits you're setting remainder to 0 if it's positive (or zero) and to -1 if it's negative. So it's equivalent to remainder = (remainder < 0) ? -1 : 0.

idivl %esi

Divide %edx:%eax by %esi, or in our case, divide remainder * temp by b (the variable). The remainder will be stored in %edx, or in our code, remainder. When combining this with the previous instruction: if remainder < 0 then remainder = -1 * temp % b, and otherwise remainder = temp % b.

testl   %edx, %edx
jne .L9

Check to see if remainder is equal to 0 - if it's not, jump to .L9. The code there simply sets b = remainder; before returning to .L7. In order to implement this in C, we'll keep a count variable that will store the amount of times the loop has iterated. We'll perform b = remainder at the beginning of the loop but only after the first iteration, meaning when count != 0.

We're now ready to build our full C loop:

int count = 0;
do {
    if (count != 0)
        b = remainder;
    remainder = a;
    temp = a;
    a = b;
    if (remainder < 0){
        remainder = -1 * temp % b;
    } else {
        remainder = temp % b;
    }

    count++;
} while (remainder != 0)

And after the loop terminates,

movl    %esi, %eax
ret

Will return the GCD that the program computed (in our code it'll be stored in the b variable).

like image 175
Daniel Kleinstein Avatar answered Oct 05 '22 01:10

Daniel Kleinstein