I'm trying to call inlined machine code from pure Python code on Linux. To this end, I embed the code in a bytes literal
code = b"\x55\x89\xe5\x5d\xc3"
and then call mprotect() via ctypes to allow execution of the page containing the code. Finally, I try to use ctypes to call the code. Here is my full code:
#!/usr/bin/python3
from ctypes import *
# Initialise ctypes prototype for mprotect().
# According to the manpage:
# int mprotect(const void *addr, size_t len, int prot);
libc = CDLL("libc.so.6")
mprotect = libc.mprotect
mprotect.restype = c_int
mprotect.argtypes = [c_void_p, c_size_t, c_int]
# PROT_xxxx constants
# Output of gcc -E -dM -x c /usr/include/sys/mman.h | grep PROT_
# #define PROT_NONE 0x0
# #define PROT_READ 0x1
# #define PROT_WRITE 0x2
# #define PROT_EXEC 0x4
# #define PROT_GROWSDOWN 0x01000000
# #define PROT_GROWSUP 0x02000000
PROT_NONE = 0x0
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4
# Machine code of an empty C function, generated with gcc
# Disassembly:
# 55 push %ebp
# 89 e5 mov %esp,%ebp
# 5d pop %ebp
# c3 ret
code = b"\x55\x89\xe5\x5d\xc3"
# Get the address of the code
addr = addressof(c_char_p(code))
# Get the start of the page containing the code and set the permissions
pagesize = 0x1000
pagestart = addr & ~(pagesize - 1)
if mprotect(pagestart, pagesize, PROT_READ|PROT_WRITE|PROT_EXEC):
    raise RuntimeError("Failed to set permissions using mprotect()")
# Generate ctypes function object from code
functype = CFUNCTYPE(None)
f = functype(addr)
# Call the function
print("Calling f()")
f()
This code segfaults on the last line.
Why do I get a segfault? The mprotect() call signals success, so I should be permitted to execute code in the page.
Is there a way to fix the code? Can I actually call the machine code in pure Python and inside the current process?
(Some further remarks: I'm not really trying to achieve a goal -- I'm trying to understand how things work. I also tried to use 2*pagesize instead of pagesize in the mprotect() call to rule out the case that my 5 bytes of code fall on a page boundary -- which should be impossible anyway. I used Python 3.1.3 for testing. My machine is a 32-bit i386 box. I know one possible solution would be to create an ELF shared object from pure Python code and load it via ctypes, but that's not the answer I'm looking for :)
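As a side note on the page-boundary remark: instead of guessing with 2*pagesize, the region passed to mprotect() can be rounded to cover every page the code touches. A minimal sketch, where page_span is a hypothetical helper name (not part of ctypes or mmap) and the addresses are made-up values for illustration:

```python
import mmap

def page_span(addr, length, pagesize=mmap.PAGESIZE):
    """Return (start, size) of the page-aligned region covering [addr, addr + length)."""
    start = addr & ~(pagesize - 1)                           # round down to a page start
    end = (addr + length + pagesize - 1) & ~(pagesize - 1)   # round up to a page end
    return start, end - start

# 5 bytes well inside a page: one page is enough
print(page_span(0x1234, 5, 0x1000))   # (4096, 4096)
# 5 bytes straddling a page boundary: two pages are needed
print(page_span(0x1FFE, 5, 0x1000))   # (4096, 8192)
```

The returned pair could then be passed as the first two arguments of mprotect(). Using mmap.PAGESIZE as the default avoids hard-coding 0x1000.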
Edit: The following C version of the code is working fine:
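#include <stdint.h>
#include <sys/mman.h>
char code[] = "\x55\x89\xe5\x5d\xc3";
const int pagesize = 0x1000;
int main()
{
    /* Round the address of code down to the start of its page */
    mprotect((void *)((uintptr_t)code & ~(uintptr_t)(pagesize - 1)), pagesize,
             PROT_READ|PROT_WRITE|PROT_EXEC);
    ((void(*)())code)();
    return 0;
}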
Edit 2: I found the error in my code. The line
addr = addressof(c_char_p(code))
first creates a ctypes char* pointing to the beginning of the bytes instance code. addressof() applied to this pointer does not return the address this pointer is pointing to, but rather the address of the pointer itself.
The simplest way I managed to figure out to actually get the address of the beginning of the code is
addr = addressof(cast(c_char_p(code), POINTER(c_char)).contents)
Hints for a simpler solution would be appreciated :)
Fixing this line makes the above code "work" (meaning it does nothing instead of segfaulting...).
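To make the difference between the two addresses concrete, here is a small sketch that only inspects addresses and does not execute anything. The variable names are mine, and casting to c_void_p and reading .value is just one possible shorter spelling, not necessarily the best one:

```python
from ctypes import POINTER, addressof, c_char, c_char_p, c_void_p, cast, string_at

code = b"\x55\x89\xe5\x5d\xc3"
ptr = c_char_p(code)

# Address of the pointer object's own storage (what the original line computed)
addr_of_pointer = addressof(ptr)

# Address the pointer points to, i.e. the first byte of the bytes data
addr_of_data = cast(ptr, c_void_p).value

# The longer expression from the edit above yields the same address
addr_long_form = addressof(cast(ptr, POINTER(c_char)).contents)

print(addr_of_data == addr_long_form)                    # True
print(addr_of_pointer == addr_of_data)                   # False
print(string_at(addr_of_data, len(code)) == code)        # True
```

Note that code must stay alive for as long as the address is used, since the address refers to memory owned by the bytes object.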
I did a quick debug on this and it turns out the pointer to the code is not being correctly constructed, and somewhere internally ctypes is munging things up before passing the function pointer to ffi_call(), which invokes the code.
Here is the line in ffi_call_unix64() (I'm on 64-bit) where the function pointer is saved into %r11:
movq %r8, %r11 /* Save a copy of the target fn. */
When I execute your code, here is the value loaded into %r11 just before it attempts the call:
(gdb) x/5b $r11
0x7ffff7f186d0: -108 24 -122 0 0
Here is the fix to construct the pointer and call the function:
raw = b"\x55\x89\xe5\x5d\xc3"
code = create_string_buffer(raw)
addr = addressof(code)
Now when I run it I see the correct bytes at that address, and the function executes fine:
(gdb) x/5b $r11
0x7ffff7f186d0: 0x55 0x89 0xe5 0x5d 0xc3
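The same check can be done from Python without gdb. This is only a sketch that reads the bytes back through the raw address. It does not call mprotect() or execute the buffer, and first_bytes is just an illustrative name:

```python
from ctypes import addressof, create_string_buffer, sizeof, string_at

raw = b"\x55\x89\xe5\x5d\xc3"
code = create_string_buffer(raw)   # mutable copy of raw plus a trailing NUL byte
addr = addressof(code)             # address of the buffer's first byte

# Read the bytes back through the raw address, much like gdb's x/5b
first_bytes = string_at(addr, len(raw))
print(" ".join(f"0x{b:02x}" for b in first_bytes))   # 0x55 0x89 0xe5 0x5d 0xc3
print(sizeof(code))                                  # 6
```

Here addressof() works as expected because code is the buffer itself rather than a pointer to it, so its address is the address of the machine code bytes.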
You might have to flush the instruction cache.
It is unclear (to me, anyway) whether mprotect() automatically does this.
[update]
Of course, had I read the documentation for cacheflush(), I would have seen that it only applies on MIPS (according to the man page).
Assuming this is x86, you might have to invoke the WBINVD (or CLFLUSH) instruction.
In general, self-modifying code needs to flush the i-cache, but as far as I can tell there is no remotely portable way to do so.