How can I explain the behavior of the following shellcode exploit?

This is shellcode for exploiting a buffer overflow vulnerability. It calls setuid(0) and then spawns a shell using execve(). Below is how I have interpreted it:

xor    %ebx,%ebx       ; Xoring to make ebx value 0
lea    0x17(%ebx),%eax ; adds 23 to 0 and loads effective addr to eax. for setuid()
int    $0x80           ; interrupt
push   %ebx            ; push ebx
push   $0x68732f6e     ; push address // why this address only????
push   $0x69622f2f     ; push address // same question
mov    %esp,%ebx
push   %eax
push   %ebx
mov    %esp,%ecx
cltd                   ; mov execve sys call into al
mov    $0xb,%al
int    $0x80           ; interrupt

Can anyone explain each step clearly?

Vinod K asked Nov 08 '10 18:11



1 Answer

int is the instruction that triggers a software interrupt. Software interrupts are numbered (from 0 to 255) and handled by the kernel. On Linux systems, interrupt 128 (0x80) is the conventional entry point for system calls. The kernel expects the system call arguments in registers; in particular, the %eax register identifies which system call is being requested.
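To make the register convention concrete, here is a minimal Python sketch (not part of the original answer) that renders a register snapshot the way the 32-bit Linux kernel would interpret it at int $0x80. The SYSCALLS table contains only the two i386 entries relevant to this shellcode, and describe_syscall is a hypothetical helper name chosen for illustration.

```python
# i386 Linux system call numbers (only the two used by this shellcode)
# together with the number of arguments each one takes.
SYSCALLS = {
    11: ("execve", 3),
    23: ("setuid", 1),
}

# With int $0x80 on i386, arguments are read from these registers, in order.
ARG_REGISTERS = ["ebx", "ecx", "edx", "esi", "edi", "ebp"]


def describe_syscall(regs):
    """Render a register snapshot as the system call the kernel would see."""
    name, argc = SYSCALLS[regs["eax"]]
    args = [regs[r] for r in ARG_REGISTERS[:argc]]
    return "{}({})".format(name, ", ".join(hex(a) for a in args))


# Register state at the first int $0x80 in the question: eax = 23, ebx = 0.
print(describe_syscall({"eax": 23, "ebx": 0}))
```

Only %eax and as many argument registers as the call needs are consulted; whatever the remaining registers hold is ignored.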

  1. Set %ebx to 0
  2. Compute %ebx+23 and store the result in %eax (the mnemonic lea stands for "load effective address", but no memory access is involved; this is just a devious way of performing an addition).
  3. System call. %eax contains 23, which means that the system call is setuid. That system call uses one argument (the target UID), to be found in %ebx, which conveniently contains 0 at that point (it was set in the first instruction). Note: upon return, registers are unmodified, except for %eax which contains the returned value of the system call, normally 0 (if the call was a success).
  4. Push %ebx on the stack (which is still 0).
  5. Push $0x68732f6e on the stack.
  6. Push $0x69622f2f on the stack. Since the stack grows "down" and since the x86 processors use little endian encoding, the effect of instructions 4 to 6 is that %esp (the stack pointer) now points at a sequence of twelve bytes, of values 2f 2f 62 69 6e 2f 73 68 00 00 00 00 (in hexadecimal). That's the encoding of the "//bin/sh" string (with a terminating zero, and three extra zeros afterwards).
  7. Move %esp to %ebx. Now %ebx contains a pointer to the "//bin/sh" string which was built above.
  8. Push %eax on the stack (%eax is 0 at that point, it is the returned status from setuid).
  9. Push %ebx on the stack (pointer to "//bin/sh"). Instructions 8 and 9 build on the stack an array of two pointers, the first being the pointer to "//bin/sh" and the second a NULL pointer. That array is what the execve system call will use as second argument.
  10. Move %esp to %ecx. Now %ecx points to the array built with instructions 8 and 9.
  11. Sign-extend %eax into %edx:%eax. cltd is the AT&T syntax for what the Intel documentation calls cdq. Since %eax is zero at that point, this sets %edx to zero too.
  12. Set %al (the least significant byte of %eax) to 11. Since %eax was zero, the whole value of %eax is now 11.
  13. System call. The value of %eax (11) identifies the system call as execve. execve expects three arguments, in %ebx (pointer to a string naming the file to execute), %ecx (pointer to an array of pointers to strings, which are the program arguments, the first one being a copy of the program name, to be used by the invoked program itself) and %edx (pointer to an array of pointers to strings, which are the environment variables; Linux tolerates that value to be NULL, for an empty environment), respectively.
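The byte layout described in step 6 can be checked with a short Python sketch. This is only an illustration, not part of the original answer: the variable names are made up, and the struct format "<I" stands for one little-endian 32-bit value.

```python
import struct

# The three values pushed in steps 4 to 6, in push order.
pushes = [0x00000000, 0x68732F6E, 0x69622F2F]

# The stack grows down, so the last value pushed sits at the lowest address.
# Each 32-bit value is stored least significant byte first ("<I").
memory = b"".join(struct.pack("<I", value) for value in reversed(pushes))

# Bytes in increasing address order, starting at %esp.
print(" ".join("{:02x}".format(b) for b in memory))

# Read it back as a C string: everything up to the first zero byte.
print(memory.split(b"\x00")[0].decode("ascii"))
```

This also answers the "why this address only" comments in the question: the two constants are not addresses at all, they are the ASCII text "//bi" and "n/sh" packed into 32-bit immediates. The doubled slash pads the path to exactly eight characters, so it fills two pushes without needing a zero byte inside either constant.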

So the code first calls setuid(0), then calls execve("//bin/sh", x, 0) where x points to an array of two pointers, first one being a pointer to "//bin/sh", while the other is NULL.
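That summary can be double-checked by stepping through the instructions in a toy model. The following Python sketch is an assumption-laden illustration rather than a real emulator: the Machine class, the starting stack address 0x1000 and the 0xDEADBEEF junk values are all invented for the example, and setuid is simply assumed to succeed and return 0.

```python
import struct


class Machine:
    """Toy model: a few registers plus a little-endian, downward-growing stack."""

    def __init__(self, esp=0x1000):
        # Registers start with junk to show the code does not rely on them.
        self.reg = {"eax": 0xDEADBEEF, "ebx": 0xDEADBEEF,
                    "ecx": 0xDEADBEEF, "edx": 0xDEADBEEF, "esp": esp}
        self.mem = {}  # address -> byte value

    def push(self, value):
        self.reg["esp"] -= 4
        for i, b in enumerate(struct.pack("<I", value)):
            self.mem[self.reg["esp"] + i] = b

    def read_u32(self, addr):
        raw = bytes(self.mem[addr + i] for i in range(4))
        return struct.unpack("<I", raw)[0]

    def read_cstring(self, addr):
        out = bytearray()
        while self.mem[addr] != 0:
            out.append(self.mem[addr])
            addr += 1
        return out.decode("ascii")


m = Machine()
r = m.reg
r["ebx"] = r["ebx"] ^ r["ebx"]      # xor %ebx,%ebx
r["eax"] = r["ebx"] + 0x17          # lea 0x17(%ebx),%eax
first_call = (r["eax"], r["ebx"])   # int $0x80: system call 23 with argument 0
r["eax"] = 0                        # assumption: setuid succeeded and returned 0
m.push(r["ebx"])                    # push %ebx
m.push(0x68732F6E)                  # push $0x68732f6e
m.push(0x69622F2F)                  # push $0x69622f2f
r["ebx"] = r["esp"]                 # mov %esp,%ebx
m.push(r["eax"])                    # push %eax
m.push(r["ebx"])                    # push %ebx
r["ecx"] = r["esp"]                 # mov %esp,%ecx
r["edx"] = 0xFFFFFFFF if r["eax"] & 0x80000000 else 0   # cltd
r["eax"] = (r["eax"] & 0xFFFFFF00) | 0x0B               # mov $0xb,%al

# State seen by the kernel at the second int $0x80.
path = m.read_cstring(r["ebx"])
argv = [m.read_u32(r["ecx"]), m.read_u32(r["ecx"] + 4)]
print(first_call, r["eax"], path, argv == [r["ebx"], 0], r["edx"])
```

The model also makes one fragile point visible: the sequence depends on setuid returning 0. If the call failed, %eax would hold a negative error code, so the value pushed as the array terminator would not be NULL, cltd would set %edx to 0xFFFFFFFF, and writing 11 into %al alone would not yield a valid system call number.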

This code is quite convoluted because it wants to avoid zeros: when assembled into machine code, the sequence of instructions uses only non-zero bytes. For instance, if the 12th instruction had been movl $0xb,%eax (setting the whole of %eax to 11), then the binary encoding of that instruction would have contained three bytes of value 0. The absence of zero bytes makes the sequence usable as the contents of a zero-terminated C string. This is meant for attacking buggy programs through buffer overflows, of course.
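The difference between the two encodings can be seen directly. In this small Python sketch (an illustration, not part of the original answer), the byte values are the standard IA-32 encodings of the two instructions, and copied_by_strcpy is a hypothetical helper that mimics how a C string copy stops at the first zero byte.

```python
# Machine-code encodings of the two ways to put 11 in the register.
mov_al = bytes([0xB0, 0x0B])                      # mov $0xb,%al
mov_eax = bytes([0xB8, 0x0B, 0x00, 0x00, 0x00])   # movl $0xb,%eax


def copied_by_strcpy(code):
    """Return the prefix of `code` that a strcpy-style copy would transfer."""
    return code.split(b"\x00")[0]


# Number of zero bytes in each encoding.
print(mov_al.count(0), mov_eax.count(0))

# The short form survives a string copy intact; the long form is cut off.
print(copied_by_strcpy(mov_al) == mov_al)
print(copied_by_strcpy(mov_eax).hex())
```

The same constraint explains the other tricks in the sequence: xor zeroes %ebx without an immediate 0, lea with an 8-bit displacement loads 23 without a 32-bit immediate, and cltd zeroes %edx in a single byte.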

Thomas Pornin answered Sep 28 '22 16:09