
I need to recover my source code from the executable

Tags: c, disassembly

It's the middle of the night, and I've accidentally overwritten all my work by typing

gcc source.c -o source.c

I still have the original binary, and my only hope is to disassemble it, but I don't know how to do that or which tool gives the most readable result. I know this probably isn't the right place to post, but I'm stressing out. Can someone help me out, please?

aytee17 asked Dec 07 '22 19:12



1 Answer

Thanks for uploading the file. As I suspected, it was unstripped so the function names remained. Besides standard boilerplate code I could identify functions main, register_broker, connect_exchange (unused and empty) and handle_requests.

I spent a bit of time in IDA Pro and it wasn't too hard to recover the main() function. First, here's the original, unmodified listing of main() from IDA: http://pastebin.com/sBxhRJMM

To proceed, you need to familiarize yourself with the AMD64 (System V) calling convention used on Linux. To summarize, the first six integer or pointer arguments are passed in RDI(EDI), RSI(ESI), RDX(EDX), RCX(ECX), R8 and R9. Any further arguments are passed on the stack, but all calls in main() use at most four arguments, so only the first four registers matter here.
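To make the register assignment concrete, here is a small sketch. The function sum8 and its self-check are made up for illustration and do not come from the recovered binary; the comments assume the System V AMD64 convention described above.

```c
#include <assert.h>

/* Hypothetical example, not taken from the recovered binary.
 * Under the System V AMD64 convention, integer arguments are assigned
 * left to right:
 *   a -> EDI, b -> ESI, c -> EDX, d -> ECX, e -> R8D, f -> R9D
 * g and h no longer fit in registers and are passed on the stack. */
long sum8(int a, int b, int c, int d, int e, int f, int g, int h)
{
    return (long)a + b + c + d + e + f + g + h;
}

/* Small self-check; call it from your own main(). */
void sum8_selfcheck(void)
{
    assert(sum8(1, 2, 3, 4, 5, 6, 7, 8) == 36);
}
```

Compiling something like this and looking at the call site in a disassembler is a quick way to get used to spotting which mov feeds which argument.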

IDA has helpfully labeled the arguments of the standard C functions and even renamed some local variables. However, the listing can be improved and commented further. For example, since we're in main(), we know that argc (the first argument) comes from EDI (it's an int, which is 32 bits here, so it uses only the low half of RDI) and argv comes from RSI (it's a pointer, so it uses the full 8 bytes of the register). So, we can rename the local variables into which EDI and RSI are copied:

mov     [rbp+argc], edi
mov     [rbp+argv], rsi
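The width argument can be checked directly. This is only a sketch that assumes an x86-64 Linux build; the helper names argc_bits and argv_bits are invented for the example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT */

/* Compile-time checks for the assumption made in the text
 * (x86-64 Linux). The build fails if the widths differ. */
static_assert(sizeof(int) * CHAR_BIT == 32,
              "argc fits in EDI, the low 32 bits of RDI");
static_assert(sizeof(char **) * CHAR_BIT == 64,
              "argv needs all 64 bits of RSI");

/* Run-time view of the same widths, in bits. */
unsigned argc_bits(void) { return (unsigned)(sizeof(int) * CHAR_BIT); }
unsigned argv_bits(void) { return (unsigned)(sizeof(char **) * CHAR_BIT); }
```

That is why the store of argc uses the 32-bit register name (edi) while the store of argv uses the 64-bit one (rsi).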

Next is a simple conditional block:

cmp     [rbp+argc], 2
jz      short loc_400EB3            
mov     rax, cs:stderr@@GLIBC_2_2_5 
mov     rdx, rax                    
mov     eax, offset aUsage ; "Usage"
mov     rcx, rdx        ; s         
mov     edx, 5          ; n         
mov     esi, 1          ; size      
mov     rdi, rax        ; ptr       
call    _fwrite                     
mov     edi, 1          ; status    
call    _exit                       

Here we compare argc with 2 and, if it is equal, jump further in the code. If it is not equal, we call fwrite(). Its first argument is in rdi, which is loaded from rax, which holds the address of the constant string "Usage". The second argument is in esi and is 1; the third is in edx and is 5; the fourth is in rcx, loaded from rdx, which holds the value of stderr@@GLIBC_2_2_5 (essentially a versioned reference to the stderr variable from libc). Putting it all together, we get:

fwrite("Usage", 1, 5, stderr);

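From my experience, this is most likely an fprintf call that the compiler converted to fwrite: GCC can do this when the format string is a constant with no conversion specifiers, and 5 is exactly the length of the string. I.e. the original code probably was: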

fprintf(stderr, "Usage");
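If you want to convince yourself that the two calls are interchangeable here, a sketch like the following compares their output byte for byte. It writes to a temporary file instead of stderr so the result can be read back; the helper names emit_usage and usage_calls_match are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "Usage" to a temporary file with either fwrite or fprintf,
 * then reads back what was written into out.
 * Returns the number of bytes read, or -1 if no temp file could be made. */
static long emit_usage(int use_fwrite, char *out, size_t outsz)
{
    assert(outsz > 0);
    FILE *f = tmpfile();
    if (!f)
        return -1;
    if (use_fwrite)
        fwrite("Usage", 1, 5, f);   /* the call seen in the disassembly */
    else
        fprintf(f, "Usage");        /* the call the source most likely had */
    rewind(f);                      /* switch from writing to reading */
    size_t n = fread(out, 1, outsz - 1, f);
    out[n] = '\0';
    fclose(f);
    return (long)n;
}

/* Returns 1 if both calls produce the same 5 bytes, 0 otherwise. */
int usage_calls_match(void)
{
    char a[16], b[16];
    long na = emit_usage(0, a, sizeof a);
    long nb = emit_usage(1, b, sizeof b);
    return na == 5 && nb == 5 && memcmp(a, b, 5) == 0;
}
```

Note that neither version appends a newline, which matches the length of 5 passed to fwrite in the listing.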

The next call is a simple exit(1). Combining both calls with the comparison, we get:

if ( argc != 2 )
{
  fprintf(stderr, "Usage");
  exit(1);
}

Continuing in this vein, we can identify other calls and variables they use. It's somewhat tedious to describe it all, so I uploaded a commented version of the disassembly, where I tried to show the equivalent C code for each call. You can see it here: http://pastebin.com/p5sRSwgQ

From that commented version it's not very hard to imagine a possible version of main():

int main(int argc, char **argv)
{
  if ( argc != 2 )
  {
    fprintf(stderr, "Usage");
    exit(1);
  }
  char name[256];
  gethostname(name, sizeof(name));
  struct hostent* _hostent = gethostbyname(name);
  struct in_addr *_addr0 = (struct in_addr *)(_hostent->h_addr_list[0]);
  struct sockaddr_in addr;
  addr.sin_family = AF_INET;
  addr.sin_port = htons(0);
  addr.sin_addr.s_addr = _addr0->s_addr;
  char *tmp = (char *)malloc(6);
  sprintf(tmp, "%d", addr.sin_port);
  char *ip_str = inet_ntoa(*_addr0);
  char *newbuf = (char *)malloc(strlen(argv[1]) + strlen(ip_str) + strlen(tmp) + 5);
  strcpy(newbuf, "r");
  strcat(newbuf, " ");
  strcat(newbuf, argv[1]);
  strcat(newbuf, " ");
  strcat(newbuf, ip_str);
  strcat(newbuf, " ");
  strcat(newbuf, tmp);
  register_broker(newbuf);
  int fd = socket(PF_INET, SOCK_STREAM, 0);
  if ( fd < 0 )
  {
    perror("Error creating socket");
    exit(1);
  }
  if ( bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0 )
  {
    perror("Error binding socket");
    exit(1);
  }
  if ( listen(fd, 0x80) != 0 )
  {
    perror("Error listening on socket");
    exit(1);
  }
  handle_requests(fd);
}
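The string handed to register_broker() appears to have the form "r <argv[1]> <ip> <port>". As a sketch only, and not what the binary actually does, the same string could be built with a single bounded snprintf call instead of the strcpy/strcat chain. The helper name build_register_msg and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the recovered program.
 * Builds "r <name> <ip> <port>" in a freshly malloc'd buffer.
 * Returns NULL on failure; the caller frees the result. */
char *build_register_msg(const char *name, const char *ip, unsigned port)
{
    assert(name != NULL && ip != NULL);

    /* First pass: ask snprintf how many characters are needed. */
    int need = snprintf(NULL, 0, "r %s %s %u", name, ip, port);
    if (need < 0)
        return NULL;

    char *buf = malloc((size_t)need + 1);   /* +1 for the terminating NUL */
    if (!buf)
        return NULL;

    /* Second pass: format into the exactly sized buffer. */
    snprintf(buf, (size_t)need + 1, "r %s %s %u", name, ip, port);
    return buf;
}
```

Sizing the buffer from snprintf's return value avoids having to keep the "+ 5" in the malloc call in sync with the number of separators by hand.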

Recovering the other two functions is left as an exercise for the reader :)

Igor Skochinsky answered Dec 09 '22 08:12
