This might be an exact duplicate of Is it possible to execute 32-bit code in 64-bit process by doing mode-switching?, but that question is from a year ago and only has one answer that doesn't give any source code. I'm hoping for more detailed answers.
I'm running 64-bit Linux (Ubuntu 12.04, if it matters). Here's some code that allocates a page, writes some 64-bit code into it, and executes that code.
#include <assert.h>
#include <malloc.h>
#include <stdio.h>
#include <sys/mman.h> // mprotect
#include <unistd.h> // sysconf
unsigned char test_function[] = { 0xC3 }; // RET
int main()
{
int pagesize = sysconf(_SC_PAGE_SIZE);
unsigned char *buffer = memalign(pagesize, pagesize);
void (*func)() = (void (*)())buffer;
memcpy(buffer, test_function, sizeof test_function);
// func(); // will segfault
mprotect(buffer, pagesize, PROT_EXEC);
func(); // works fine
}
Now, purely for entertainment value, I'd like to do the same thing but with buffer
containing arbitrary 32-bit (ia32) code, instead of 64-bit code. This page implies that you can execute 32-bit code on a 64-bit processor by entering "long compatibility sub-mode", by setting the bits of the CS segment descriptor as LMA=1, L=0, D=1
. I am willing to wrap my 32-bit code in a prologue/epilogue that performs this setup.
But can I do this setup, in Linux, in usermode? (BSD/Darwin answers will also be accepted.) This is where I start to get really hazy on the concepts. I think the solution involves adding a new segment descriptor to the GDT (or is it the LDT?), and then switching to that segment via an lcall
instruction. But can all that be done in usermode?
Here's a sample function that should return 4 when successfully run in compatibility sub-mode, and 8 when run in long mode. My goal is to get the instruction pointer to take this codepath and come out the other side with %rax=4
, without ever dropping into kernel mode (or doing so only via documented system calls).
unsigned char behave_differently_depending_on_processor_mode[] = {
0x89, 0xE0, // movl %esp, %eax
0x56, // push %{e,r}si
0x29, 0xE0, // subl %esp, %eax
0x5E, // pop %{e,r}si
0xC3 // ret
};
Yes, you can. It's even doable using fully supported interfaces. Use modify_ldt to install a 32-bit code segment into the LDT, then set up a far pointer to your 32-bit code, then do an indirect jump to it using ljumpl *(%eax)
in AT&T notation.
You'll face all kinds of snafus, though. The high bits of your stack pointer are likely to get destroyed. You probably need a data segment if you actually want to run real code. And you'll need to do another far jump to get back to 64-bit mode.
A fully worked-out example is in my linux-clock-tests in test_vsyscall.cc
. (It's a little broken on any released kernel: int cc
will crash. You should change that to something else more clever, like "nop". Look in intcc32
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With