Trying to reverse engineer a function

I'm trying to understand x86 assembly better. I have a mystery function that I know takes an int argument and returns an int, so its prototype looks like int mystery(int n). However, I can't work out what the function is in C. The assembly is:

mov  %edi, %eax
lea  0x0(,%rdi, 8), %edi
sub  %eax, %edi
add  $0x4, %edi
callq <mystery_util>
repz retq

<mystery_util>:
mov  %edi, %eax
shr  %eax
and  $0x1, %edi
and  %edi, %eax
retq

I don't understand what the lea does here and what kind of function it could be.

user9473612 asked Dec 11 '25 08:12



1 Answer

The assembly code appears to be compiler generated, probably by GCC, since there is a repz retq immediately after an unconditional branch (call). Because mystery reaches mystery_util with a call rather than a tail call (jmp), and because mystery_util was not inlined, the code was likely compiled with -O1; higher optimization levels would probably have inlined the function. The lack of frame pointers and of extra loads and stores indicates that it was not compiled with -O0.

Multiplying x by 7 is the same as multiplying x by 8 and subtracting x. That is what the following code is doing:

lea  0x0(,%rdi, 8), %edi
sub  %eax, %edi

LEA computes addresses, but it can be used for simple arithmetic as well. The syntax for a memory operand is displacement(base, index, scale), where scale can be 1, 2, 4, or 8. The computation is displacement + base + index * scale. In your case lea 0x0(,%rdi, 8), %edi is effectively EDI = 0x0 + RDI * 8, or EDI = RDI * 8. The sub that follows leaves n * 7 in EDI, and add $0x4, %edi then adds 4, so the full calculation is n * 7 + 4.
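To make the arithmetic concrete, here is a minimal C sketch that mirrors the mov/lea/sub/add sequence one instruction at a time. The function name lea_sub_add is made up for illustration, and unsigned 32-bit arithmetic is assumed so that wraparound is well defined:

```c
#include <stdint.h>

/* Hypothetical helper (not part of the original program): mirrors the
   argument computation in mystery, one instruction per statement. */
uint32_t lea_sub_add(uint32_t n)
{
    uint32_t eax = n;        /* mov  %edi, %eax                     */
    uint32_t edi = n * 8u;   /* lea  0x0(,%rdi,8), %edi  -> 8 * n   */
    edi -= eax;              /* sub  %eax, %edi          -> 7 * n   */
    edi += 4u;               /* add  $0x4, %edi          -> 7*n + 4 */
    return edi;
}
```

For example, lea_sub_add(1) goes through 8, then 7, then 11, which is 7 * 1 + 4.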

The calculation for mystery_util appears to simply be

n &= (n>>1) & 1;
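As a rough check of that reading, the sketch below writes mystery_util twice: once as a literal instruction-by-instruction translation and once in the compact form. Both function names are hypothetical and exist only for this comparison:

```c
#include <stdint.h>

/* Hypothetical: literal translation of the four instructions. */
uint32_t util_by_instruction(uint32_t n)
{
    uint32_t edi = n;
    uint32_t eax = edi;   /* mov  %edi, %eax               */
    eax >>= 1;            /* shr  %eax      (logical shift) */
    edi &= 1u;            /* and  $0x1, %edi               */
    eax &= edi;           /* and  %edi, %eax               */
    return eax;
}

/* Hypothetical: the compact C form of the same computation. */
uint32_t util_compact(uint32_t n)
{
    n &= (n >> 1) & 1u;
    return n;
}
```

Both versions return 1 only when bit 0 and bit 1 of n are both set (for example n = 3 or n = 7), and 0 otherwise.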

Putting this together, we have a function mystery that passes n * 7 + 4 to a function called mystery_util, which returns n &= (n>>1) & 1.

Since mystery_util returns a single bit value (0 or 1) it is reasonable that bool is the return type.

I was curious if I could get a particular version of GCC with optimization level 1 (-O1) to reproduce this assembly code. I discovered that GCC 4.9.x will yield this exact assembly code for this given C program:

#include<stdbool.h>

bool mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

bool mystery(unsigned int n)
{
    return mystery_util (7*n+4);
}

The assembly output is:

mystery_util:
        movl    %edi, %eax
        shrl    %eax
        andl    $1, %edi
        andl    %edi, %eax
        ret
mystery:
        movl    %edi, %eax
        leal    0(,%rdi,8), %edi
        subl    %eax, %edi
        addl    $4, %edi
        call    mystery_util
        rep ret

You can play with this code on godbolt.


Important Update - Version without bool

I apparently erred in interpreting the question. I assumed that the person asking had worked out for themselves that the prototype for mystery was int mystery(int n), and that I was therefore free to change it. According to a related question asked on Stack Overflow a day later, it seems int mystery(int n) is given as the prototype as part of the assignment. This is important because it means that a modification has to be made.

The change that needs to be made is related to mystery_util. In the code to be reverse engineered are these lines:

mov  %edi, %eax
shr  %eax

EDI is the first parameter. SHR is a logical shift right. Compilers would only generate this if EDI was an unsigned int (or equivalent). int is a signed type and would generate SAR (arithmetic shift right). This means that the parameter for mystery_util has to be unsigned int, and it follows that the return value is likely unsigned int as well. That means the code would look like this:

unsigned int mystery_util(unsigned int n)
{
    n &= (n>>1) & 1;
    return n;
}

int mystery(int n)
{
    return mystery_util (7*n+4);
}

mystery now has the prototype given by your professor (bool is removed) and we use unsigned int for the parameter and return type of mystery_util. In order to generate this code with GCC 4.9.x I found you need to use -O1 -fno-inline. This code can be found on godbolt. The assembly output is the same as the version using bool.
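One way to sanity-check the reconstruction is to compare it against a closed form. Since 7*n + 4 has the same low two bits as 3*n, those bits are both set exactly when the low two bits of n are 01. The sketch below repeats the two reconstructed functions and adds mystery_closed_form, a hypothetical name used only for this comparison. It is meant for small values of n, because 7*n + 4 overflows int for large magnitudes:

```c
/* Reconstructed functions, as given above. */
unsigned int mystery_util(unsigned int n)
{
    n &= (n >> 1) & 1;
    return n;
}

int mystery(int n)
{
    return mystery_util(7 * n + 4);
}

/* Hypothetical closed form (an assumption derived here, not part of
   the assignment): 1 when the low two bits of n are 01, else 0. */
int mystery_closed_form(int n)
{
    return ((unsigned int)n & 3u) == 1u;
}
```

For instance, mystery(1) passes 11 (binary 1011) to mystery_util and gets 1, while mystery(2) passes 18 (binary 10010) and gets 0.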

If you use unsigned int mystery_util(int n) you would discover that it doesn't quite output what we want:

mystery_util:
        movl    %edi, %eax
        sarl    %eax          # <------- SAR (arithmetic shift right) is not SHR
        andl    $1, %edi
        andl    %edi, %eax
        ret
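The difference between the two shifts can be seen in a small C sketch. The function names are hypothetical. Note that right-shifting a negative signed value is implementation-defined in C; the comment on shift_arithmetic assumes GCC's documented behavior, which is sign extension:

```c
#include <stdint.h>

/* Unsigned operand: zeros are shifted in from the left (SHR). */
uint32_t shift_logical(uint32_t x)
{
    return x >> 1;
}

/* Signed operand: assuming GCC, the sign bit is copied in from the
   left (SAR). Other compilers are allowed to behave differently. */
int32_t shift_arithmetic(int32_t x)
{
    return x >> 1;
}
```

With GCC, shift_logical(0x80000000u) gives 0x40000000, while shift_arithmetic(-2) gives -1. In mystery_util the shifted value is masked down to a single bit, so both shifts would produce the same result there; it is the choice of instruction, not the output, that points to the operand being unsigned.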
Michael Petch answered Dec 14 '25 07:12
