How to generate and run native code dynamically?

Question

I'd like to write a very small proof-of-concept JIT compiler for a toy language processor I've written (purely academic), but I'm having some trouble in the middle-altitudes of design. Conceptually, I'm familiar with how JIT works - you compile bytecode into (machine or assembly?) code to run. At the nuts-and-bolts level however, I'm not quite gripping how you actually go about doing that.

My (very "newb") knee-jerk reaction, since I haven't the first clue where to start, would be to try something like the following:

1. mmap() a block of memory, setting access to PROT_EXEC
2. write the native code into the block
3. store the current registers (stack pointer, et al.) someplace cozy
4. modify the current registers to point into the native code block in the mapped region
5. the native code would now get executed by the machine
6. restore the previous registers

Is that even close to a/the correct algorithm? I've tried perusing different projects that I know have JIT compilers to study (such as V8) but these codebases turn out to be difficult to consume because of their size, and I've little idea where to start looking.
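
For reference, the mmap()-based outline in the question can be sketched on Linux roughly as follows. This is a minimal sketch, assuming an x86 or x86-64 Linux host; the function name `run_generated_code` and the hard-coded bytes are illustrative and not taken from any of the projects mentioned. Calling the block through a function pointer means the registers do not have to be saved or redirected by hand: the compiler emits an ordinary call, and the generated code ends with ret.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Illustrative helper (not from the original post): copies a few bytes of
// machine code into a fresh mapping, runs them, and returns the result.
// Assumes an x86 or x86-64 Linux host. Returns -1 if the mapping fails.
int run_generated_code() {
    // mov eax, 42 ; ret  (same encoding in 32-bit and 64-bit mode)
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    const std::size_t size = 4096;

    // Map a writable block (not yet executable).
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // Write the native code into the block.
    std::memcpy(mem, code, sizeof code);

    // Switch the block from writable to read + execute.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }

    // Call it like any other function; the call/ret pair saves and
    // restores the instruction pointer.
    using fn_t = int (*)();
    fn_t fn = reinterpret_cast<fn_t>(mem);
    int result = fn();

    munmap(mem, size);
    return result;
}
```

Mapping the block writable first and flipping it to executable afterwards is a design choice in this sketch; a single mapping with PROT_READ | PROT_WRITE | PROT_EXEC would be closer to the outline, but some systems refuse such mappings.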

Shelwien · Accepted Answer

Not sure about Linux, but this works on x86/Windows.
Update: http://codepad.org/sQoF6kR8

#include <stdio.h>
#include <windows.h>

typedef unsigned char byte;

int arg1;
int arg2;
int res1;

typedef void (*pfunc)(void);

union funcptr {
  pfunc x;
  byte* y;
};

int main( void ) {

  byte* buf = (byte*)VirtualAllocEx( GetCurrentProcess(), 0, 1<<16, MEM_COMMIT, PAGE_EXECUTE_READWRITE );

  if( buf==0 ) return 0;

  byte* p = buf;

  *p++ = 0x50; // push eax
  *p++ = 0x52; // push edx

  *p++ = 0xA1; // mov eax, [arg2]
  (int*&)p[0] = &arg2; p+=sizeof(int*);

  *p++ = 0x92; // xchg edx,eax

  *p++ = 0xA1; // mov eax, [arg1]
  (int*&)p[0] = &arg1; p+=sizeof(int*);

  *p++ = 0xF7; *p++ = 0xEA; // imul edx

  *p++ = 0xA3; // mov [res1],eax
  (int*&)p[0] = &res1; p+=sizeof(int*);

  *p++ = 0x5A; // pop edx
  *p++ = 0x58; // pop eax
  *p++ = 0xC3; // ret

  funcptr func;
  func.y = buf;

  arg1 = 123; arg2 = 321; res1 = 0;

  func.x(); // call generated code

  printf( "arg1=%i arg2=%i arg1*arg2=%i func(arg1,arg2)=%i\n", arg1,arg2,arg1*arg2,res1 );

}
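
The pointer-bumping pattern in the listing above (`*p++ = opcode`, followed by an address written through a cast) can be wrapped in a small helper. This is a minimal sketch, assuming a little-endian x86 target; `CodeBuffer`, `emit8`, `emit32`, and `encode_return_constant` are hypothetical names, and `memcpy` is swapped in for the `(int*&)p[0]` cast to avoid unaligned, type-punned writes.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper for assembling machine code byte by byte.
class CodeBuffer {
public:
    // Appends a single opcode or operand byte.
    void emit8(std::uint8_t b) { bytes_.push_back(b); }

    // Appends a 32-bit value in host byte order (little-endian on x86).
    void emit32(std::uint32_t v) {
        std::uint8_t tmp[sizeof v];
        std::memcpy(tmp, &v, sizeof v);
        bytes_.insert(bytes_.end(), tmp, tmp + sizeof v);
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    std::vector<std::uint8_t> bytes_;
};

// Encodes "mov eax, imm32 ; ret" for the given constant.
std::vector<std::uint8_t> encode_return_constant(std::uint32_t value) {
    CodeBuffer cb;
    cb.emit8(0xB8);    // mov eax, imm32
    cb.emit32(value);  // the immediate operand
    cb.emit8(0xC3);    // ret
    return cb.bytes();
}
```

The resulting bytes would then be copied into an executable buffer, such as the one returned by VirtualAllocEx above, before being called.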

datenwolf · Answer

You may want to have a look at libjit, which provides exactly the infrastructure you're looking for:

The libjit library implements just-in-time compilation functionality. Unlike other JITs, this one is designed to be independent of any particular virtual machine bytecode format or language.

http://freshmeat.net/projects/libjit

sstn · Answer

The Android Dalvik JIT compiler might also be worth looking at. It is supposed to be fairly small and lean (not sure if this helps understanding it or makes things more complicated). It targets Linux as well.

If things are getting more serious, looking at LLVM might be a good choice as well.

The function pointer approach suggested by Jeremiah sounds good. You may want to use the caller's stack anyway and there will probably only be a few registers left (on x86) which you need to preserve or not touch. In this case, it is probably easiest if your compiled code (or the entry stub) saves them on the stack before proceeding. In the end, it all boils down to writing an assembler function and interfacing to it from C.
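
To illustrate the point about preserving registers, here is a minimal sketch, assuming x86-64 Linux and the System V calling convention (integer arguments in edi and esi, result in eax, rbx callee-saved). The generated function uses rbx as scratch space, so it pushes it on entry and pops it before returning. `make_adder` and `free_code` are hypothetical names, and the byte encodings are hand-assembled for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

using binary_fn = int (*)(int, int);
static const std::size_t kCodeSize = 4096;

// Hypothetical helper: builds the equivalent of
//   int f(int a, int b) { return a + b; }
// at run time. Returns nullptr on failure.
binary_fn make_adder() {
    static const unsigned char code[] = {
        0x53,        // push rbx       (save the callee-saved register)
        0x89, 0xFB,  // mov  ebx, edi  (first argument)
        0x01, 0xF3,  // add  ebx, esi  (second argument)
        0x89, 0xD8,  // mov  eax, ebx  (return value)
        0x5B,        // pop  rbx       (restore it before returning)
        0xC3         // ret
    };

    void* mem = mmap(nullptr, kCodeSize, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;

    std::memcpy(mem, code, sizeof code);

    if (mprotect(mem, kCodeSize, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, kCodeSize);
        return nullptr;
    }
    return reinterpret_cast<binary_fn>(mem);
}

// Releases a block previously returned by make_adder().
void free_code(binary_fn fn) {
    munmap(reinterpret_cast<void*>(fn), kCodeSize);
}
```

From the C++ side the result is used like any function pointer, for example `binary_fn add = make_adder(); int sum = add(2, 3); free_code(add);`. A function that only touches caller-saved registers such as eax, ecx, and edx could skip the push/pop pair entirely.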

Tags:

c++

linux

x86

jit

compiler-construction

Chris Tonkinson
