I am working on a school assignment and I am completely stumped. The professor and TA have been of no help as every answer they provide to any student is some variation of "keep looking, the answer is there." I am trying to create a shell using this code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char code[] =
"\x31\xc0"
"\x50"
"\x68""//sh"
"\x68""/bin"
"\x89\xe3"
"\x50"
"\x53"
"\x89\xe1"
"\x99"
"\xb0\x0b"
"\xcd\x80"
;
int main(int argc, char **argv)
{
    printf("running...\n");
    char buf[sizeof(code)];
    strcpy(buf, code);
    ((void(*)( ))buf)( );
}
I have tried to replace code[]
with some other examples found online (including this site) as well as an example from an additional pdf the prof provided. None of these were useful. I used gdb to disassemble and attempted to construct my own code[]
and that too failed. For what it's worth, when I run it as a normal user the application segfaults on the ((void(*)( ))buf)( );
line; when I run it as root it simply quits on the same line, with no segfault notice.
I have no idea where else to take this assignment, and I cannot work on any of the later buffer overflow tasks until I understand this simple first step. Any help would be greatly appreciated.
EDIT: I forgot to mention that I have tried this on both OSX 10.8.2 and an Ubuntu VM in VirtualBox. I assume it won't work on OSX, but I was desperate. For Ubuntu, we were asked to run:
sudo sysctl -w kernel.randomize_va_space=0
sudo apt-get install zsh
cd /bin
sudo rm sh
sudo ln -s /bin/zsh /bin/sh
Those commands should disable address space randomization, install zsh, and make /bin/sh a symlink to it. I completed all of those tasks in the VM with no errors.
Your code disassembles to something like this:
00000000 31C0 xor eax,eax
00000002 50 push eax
00000003 682F2F7368 push dword 0x68732f2f
00000008 682F62696E push dword 0x6e69622f
0000000D 89E3 mov ebx,esp
0000000F 50 push eax
00000010 53 push ebx
00000011 89E1 mov ecx,esp
00000013 99 cdq
00000014 B00B mov al,0xb
00000016 CD80 int 0x80
Courtesy of ndisasm
. Let's go through these instructions step by step and analyse the stack frame on the way.
xor eax,eax
zeroes out the eax
register, since the XOR operation of an operand with itself will always yield zero as the result. push eax
then pushes the value on the stack. Therefore, the stack currently looks more or less like this (offsets are shown relative to the value of esp
at the start of the code, esp
marks the stack cell that esp
currently points to, and xxxxxxxx is whatever was already on the stack):
+----------+
0 | xxxxxxxx |
esp -4 | 00000000 |
+----------+
Next, we have two push dword
instructions, which push immediate values to the stack. After they execute, the stack looks like this:
+----------+
0 | xxxxxxxx |
-4 | 00000000 |
-8 | 68732f2f |
esp -12| 6e69622f |
+----------+
esp
currently points at the first (lowest-addressed) byte of the second immediate value that was pushed to the stack. Let's try interpreting the pushed values as ASCII, in the order that they will be read from memory if we start sequentially from the current value of esp
. We get the byte sequence of 2f62696e2f2f7368
, which in ASCII is equal to /bin//sh
. The zero dword that was pushed first follows it in memory, so the sequence is NUL-terminated and is a valid C-string.
This is the main reason why the current value of esp
is saved into the register ebx
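. It contains the path to the executable that will be run. The extra slash pads the path to eight bytes, so it fits exactly into two 32-bit pushes without a zero byte. The double slash is not a problem for the OS, since path resolution treats a run of consecutive slashes inside a path as a single slash.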
Next, we have the current values of eax
and ebx
pushed into the stack. We know that eax
contains zero, and ebx
contains a pointer to the C-string "/bin//sh"
. The stack now looks like this:
+----------+
0 | xxxxxxxx |
-4 | 00000000 |
-8 | 68732f2f |
ebx -12| 6e69622f |
-16| 00000000 |
ecx esp -20| (ebxVal) |
+----------+
After pushing the values of the registers to the stack, the current value of esp
is saved in ecx
.
cdq
is an instruction that performs a very neat trick in this case: it sign-extends the current value of eax
into the edx:eax
register pair. Therefore, in this case, it zeroes out the value in edx
, since the sign extension of zero is zero. We could, of course, clear the value in edx
with xor edx, edx
, but that instruction is encoded with two bytes - and cdq
only takes up one.
The next instruction puts the value 0xb
(11) into al, the low byte of eax
. As in the previous case, we could just do mov eax, 0xb
, but that would lead to a 5-byte instruction, as the immediate must be encoded as a full 32-bit value. Three of those bytes would be zero, and a zero byte would cut the strcpy in the test program short.
int 0x80
triggers the system call interrupt on 32-bit Linux. It expects the number of the system call in eax
(which now equals 0xb
, so the sys_execve
function will be called), and additional arguments in ebx
, ecx
, edx
, esi
, edi
, and ebp
.
Now, let's have a look at the prototype for that system call:
int execve(const char *filename, char *const argv[], char *const envp[]);
Therefore, the filename
argument is placed in ebx
- it points to /bin//sh
. argv
, placed in ecx
, is the array of arguments for the program to be executed and must be terminated with a NULL
value. On the Intel architecture, NULL
is equal to 0
, and ecx
points to just that: a pointer to /bin//sh
, followed by a NULL
value. envp
, passed in edx, is NULL
here; it would normally point to an array of environment strings, expressed as char*
values of the form key=value
.
The successful execution of execve
results in the current process image being replaced with the image of the pointed-to executable, executed with the arguments provided. In this case, /bin/sh
will be executed (if it exists) with /bin//sh as its only argument, argv[0]
.
Michael was probably right as to why this doesn't work: recent Linux kernels mark the data pages as non-executable, and trying to execute them will result in a segmentation fault. Here buf is a local array, so the call jumps into the stack, which is one of those pages.