First I am sorry about the length of this post, but I wanted to explain the problem clearly.
I try to write a kind of small self modifying program in C but I have some troubles and I don't know exactly why.
Plateform is : Ubuntu/Linux 2.6.32-40 x86_64, prog is build on x86 arch, gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3, GNU ld (GNU Binutils for Ubuntu) 2.20.1-system.20100303
The purpose of the program is to create a read/write/execute chunk of memory (with memalign(3) and mprotect(2)), copy a small function called p()
(defined in the .text
segment) in this chunk of memory and then execute the copied function through a pointer. The p()
function just displays a message using printf(puts)
.
In order to get the starting and ending address of the code of p()
(to copy it), I use the address of the function itself and the address of a dummy()
function create just after p()
in .text
.
int p() { ... } <- address where the copy starts
int dummy() { ... } <- address where the copy stops
Chunk memory creation and copy are successfully done but when the code in the chunk is run a segfault occurs. By using gdb
it is clear that we enter in the code of the chunk (the body of the copied function) but the call to printf failed. When disassembling the p()
function and the code in the chunk I see that the address use in the 'call' is not the same.
And I don't known why the address is incorrect, when the code is copied it is displayed and it is the same that objdump (or gdb) gave me when I disassemble the p()
function.
The binary is create with -static
to avoid potential problem with got/plt
or with the relocation process of ld.so
. It doesn't seem to be a problem to run code on the heap
because the beginning of the copied function is executed (check under gdb
).
The simplified src of the program :
<... skip include/checks ...>
#define newline() putchar('\n')
/* - function copied in the chunk */
int p()
{
printf("hello world\n");
return 0;
}
/* - dummy function to get last address of p() */
int dummy() { return 0; }
int main()
{
char *s, *code, *chunk;
unsigned int pagesz, sz;
int (*ptr)(void);
pagesz = sysconf(_SC_PAGE_SIZE);
chunk = (char*)memalign(pagesz, 4 * pagesz);
mprotect(chunk, 4 * pagesz, PROT_WRITE|PROT_EXEC|PROT_READ);
/* - get size, display addr */
sz = (char *)dummy - (char *)p;
printf("Copy between : %p - %p\n", (char *)p, (char *)dummy);
printf("Copy in chunk : %p - %p\n", chunk, chunk + sz, sz);
/* - copy code (1 byte in supp) */
printf("Copied code : ");
for(s = (char *)p, code = chunk; \
s <= (char *)dummy; s++, code++) {
*code = *s;
/* - write in console -- to check if code of p() after disas
* it with objdump(1) is the same, RESULT : ok */
printf("%02x ", *(unsigned char *)code);
}
newline();
/* - run once orginal function */
ptr = p;
ptr();
/* - run copied function (in chunk) */
ptr = (int (*)(void))chunk;
ptr();
newline();
free(chunk);
return 0;
}
The p()
function disassembled by objdump(1)
:
080483c3 <p>:
80482c0: 55 push %ebp
80482c1: 89 e5 mov %esp,%ebp
80482c3: 83 ec 18 sub $0x18,%esp
80482c6: c7 04 24 a8 d9 0a 08 movl $0x80ad9a8,(%esp)
80482cd: e8 7e 0c 00 00 call 8048f50 <_IO_puts>
80482d2: b8 00 00 00 00 mov $0x0,%eax
80482d7: c9 leave
80482d8: c3 ret
080483dc <dummy>:
....
When the program is run under gdb(1) the copied code is the same (hex value) than objdump(1) provide above :
# gcc -m32 -o selfmodif_light selfmodif_light.c -static -g -O0
# gdb -q ./selfmodif_light
Reading symbols from /path/.../selfmodif_light...done.
(gdb) list 55
50 /* - run once orginal function */
51 ptr = p;
52 ptr();
53
54 /* - run copied function (in chunk) */
55 ptr = (int (*)(void))chunk;
<<< The problem is here >>>
56 ptr();
57
58 newline();
59 free(chunk);
(gdb) br 56
Breakpoint 1 at 0x8048413: file tmp.c, line 56.
(gdb) run
Starting program: /path/.../selfmodif_light
Copy between : 0x80482c0 - 0x80482d9
Copy in chunk : 0x80d2000 - 0x80d2019
Copied code : 55 89 e5 83 ec 18 c7 04 24 a8 d9 0a 08 e8 7e 0c 00 00 b8 00 00 00 00 c9 c3 55
hello world
Breakpoint 1, main () at tmp.c:56
56 ptr();
If we look in main we next go into the chunk :
(gdb) disas main
Dump of assembler code for function main:
0x080482e3 <+0>: push %ebp
... <skip> ...
=> 0x08048413 <+304>: mov 0x18(%esp),%eax
0x08048417 <+308>: call *%eax
... <skip> ...
0x08048437 <+340>: ret
End of assembler dump.
But when p() and chunk are disassembled, we have a call 0x80d2c90
in the memory chunk instead of a call 0x8048f50 <puts>
like in the p() function ? For which reason displayed address is not the same.
(gdb) disas p
Dump of assembler code for function p:
0x080482c0 <+0>: push %ebp
0x080482c1 <+1>: mov %esp,%ebp
0x080482c3 <+3>: sub $0x18,%esp
0x080482c6 <+6>: movl $0x80ad9a8,(%esp)
0x080482cd <+13>: call 0x8048f50 <puts> <<= it is not the same address
0x080482d2 <+18>: mov $0x0,%eax
0x080482d7 <+23>: leave
0x080482d8 <+24>: ret
End of assembler dump.
(gdb) disas 0x80d2000,0x80d2019
Dump of assembler code from 0x80d2000 to 0x80d2019:
0x080d2000: push %ebp
0x080d2001: mov %esp,%ebp
0x080d2003: sub $0x18,%esp
0x080d2006: movl $0x80ad9a8,(%esp)
0x080d200d: call 0x80d2c90 <<= than here (but it should be ??)
0x080d2012: mov $0x0,%eax
0x080d2017: leave
0x080d2018: ret
End of assembler dump.
When memory is checked, codes seem to be identical. At this point I don't understand what is happening, what is the problem ? gdb's interpretation failed, copy of code or what ?
(gdb) x/25bx p // code of p in .text
0x80482c0 <p>: 0x55 0x89 0xe5 0x83 0xec 0x18 0xc7 0x04
0x80482c8 <p+8>: 0x24 0xa8 0xd9 0x0a 0x08 0xe8 0x7e 0x0c
0x80482d0 <p+16>: 0x00 0x00 0xb8 0x00 0x00 0x00 0x00 0xc9
0x80482d8 <p+24>: 0xc3
(gdb) x/25bx 0x80d2000 // code of copy in the chunk
0x80d2000: 0x55 0x89 0xe5 0x83 0xec 0x18 0xc7 0x04
0x80d2008: 0x24 0xa8 0xd9 0x0a 0x08 0xe8 0x7e 0x0c
0x80d2010: 0x00 0x00 0xb8 0x00 0x00 0x00 0x00 0xc9
0x80d2018: 0xc3
If a breakpoint is set then execution continue in the memory chunk :
(gdb) br *0x080d200d
Breakpoint 2 at 0x80d200d
(gdb) cont
Continuing.
Breakpoint 2, 0x080d200d in ?? ()
(gdb) disas 0x80d2000,0x80d2019
Dump of assembler code from 0x80d2000 to 0x80d2019:
0x080d2000: push %ebp
0x080d2001: mov %esp,%ebp
0x080d2003: sub $0x18,%esp
0x080d2006: movl $0x80ad9a8,(%esp)
=> 0x080d200d: call 0x80d2c90
0x080d2012: mov $0x0,%eax
0x080d2017: leave
0x080d2018: ret
End of assembler dump.
(gdb) info reg eip
eip 0x80d200d 0x80d200d
(gdb) nexti
0x080d2c90 in ?? ()
(gdb) info reg eip
eip 0x80d2c90 0x80d2c90
(gdb) bt
#0 0x080d2c90 in ?? ()
#1 0x08048419 in main () at selfmodif_light.c:56
So at this point either the program runs like that and a segfault occurs or $eip is changed and the program ends without errors.
(gdb) set $eip = 0x8048f50
(gdb) cont
Continuing.
hello world
Program exited normally.
(gdb)
I do not understand what is happening, what failed. The copy of the code seems to be ok, the jump into the memory chunk too, so why the address (of the call) isn't the good ?
Thanks for your answers and your time
80482cd: e8 7e 0c 00 00 call 8048f50
That's a relative CALL (to +0xC7E). When you move that instruction to a different EIP, you need to modify the offset.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With