I am trying to create an assembler which is able to encode instructions at runtime (for a JIT compiler). Sorry for the long code snippet, but this is the shortest compilable example which shows my problem.
#include <stdint.h>
#include <iostream>
#include <windows.h>

typedef void (*function)();

uint8_t* instructionBuffer;
uint32_t pos;

/**
 * Creates the instruction buffer;
 */
void assembler_initialize() {
    instructionBuffer = (uint8_t*) VirtualAllocEx(GetCurrentProcess(), 0, 1024,
        MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    pos = 0;
}

/**
 * Writes a call to the given address to the instruction buffer
 */
void assembler_emit_call(uint32_t value) {
    // CALL opcode
    instructionBuffer[pos++] = 0xFF;
    // opcode extension 2, read a 32bit address
    instructionBuffer[pos++] = 0x15;
    // Address as little endian
    instructionBuffer[pos++] = (value >> 0) & 0xFF;
    instructionBuffer[pos++] = (value >> 8) & 0xFF;
    instructionBuffer[pos++] = (value >> 16) & 0xFF;
    instructionBuffer[pos++] = (value >> 24) & 0xFF;
}

/**
 * Writes a RET to the instruction buffer
 */
void assembler_emit_ret() {
    instructionBuffer[pos++] = 0xC3;
}

/**
 * The function to call
 */
void __cdecl myFunction() {
    std::cout << "Hello world!" << std::endl;
}

/**
 *
 */
int main(int argc, char **argv) {
    assembler_initialize();
    assembler_emit_call((uint32_t) &myFunction);
    assembler_emit_ret();
    // Output the address
    std::cout << std::hex << (uint32_t) &myFunction << std::endl;
    // Output the opcodes
    for (uint32_t i = 0; i < 100; i++) {
        std::cout << std::hex << (uint32_t) instructionBuffer[i] << " ";
    }
    std::cout << std::endl;
    // Call the function
    function f = (function) instructionBuffer;
    f();
    return 0;
}
The output tells me that the address of myFunction is 0x4017c5, and that these opcodes were written:
CALL ModRM Addr (le) RET Zeros
ff 15 c5 17 40 0 c3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
Still, my program crashes when trying to execute the code. Did I miss something when encoding the CALL instruction?
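It doesn't work because the call instruction is encoded incorrectly. x86 has no near CALL that takes an absolute address as an immediate operand: the direct near call (opcode E8) takes a 32-bit displacement relative to the next instruction, and the FF /2 form takes its target from a register or from memory.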
In your example you generate the following x86 code:
FF 15 xx xx xx xx
which is an indirect call through the address xx xx xx xx. The CPU reads the 32-bit value stored at xx xx xx xx and calls the function at that value.
Example
FF 15 10 20 30 00
This will look at the address 0x00302010:
00302010: 11 22 33 00 xx xx xx xx
where it finds the value 0x00332211, and then calls the function at that address.
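The example above can be checked with a small model of the decoding. This is only a sketch: read_le32 and indirect_call_target are hypothetical helper names, and the second argument stands in for the four bytes stored at the pointer address rather than real process memory.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit little-endian value from four consecutive bytes.
std::uint32_t read_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Models "FF 15 disp32": the four bytes after the opcode name a memory
// location, and the call target is the 32-bit value stored there.
// `bytes_at_disp` is a stand-in for the memory found at that location.
std::uint32_t indirect_call_target(const std::uint8_t* insn,
                                   const std::uint8_t* bytes_at_disp) {
    assert(insn[0] == 0xFF && insn[1] == 0x15);  // CALL r/m32 with ModRM 15
    // read_le32(insn + 2) would give the pointer address (0x00302010 above);
    // the call itself goes to whatever is stored at that address.
    return read_le32(bytes_at_disp);
}
```

In the question, the bytes after FF 15 are the address of myFunction itself, so the CPU treats the first four bytes of myFunction's machine code as a pointer and jumps to wherever they happen to point.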
With the following modifications in assembler_emit_call the program works fine.
void assembler_emit_call(uint32_t value) {
    // mov eax, imm32 (opcode B8 followed by the absolute address)
    instructionBuffer[pos++] = 0xb8;
    // Address as little endian
    instructionBuffer[pos++] = (value >> 0) & 0xFF;
    instructionBuffer[pos++] = (value >> 8) & 0xFF;
    instructionBuffer[pos++] = (value >> 16) & 0xFF;
    instructionBuffer[pos++] = (value >> 24) & 0xFF;
    // call eax (FF /2 with ModRM D0)
    instructionBuffer[pos++] = 0xff;
    instructionBuffer[pos++] = 0xd0;
    // No RET here: main() already emits it through assembler_emit_ret()
}
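If clobbering EAX is not wanted, the direct near call E8 rel32 can reach the same target. Its operand is the distance from the end of the 5-byte instruction to the target, not the target itself. A minimal sketch, assuming 32-bit code: emit_call_rel32 and the base parameter (the address the buffer will execute from) are hypothetical names, not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends "E8 rel32" (direct near CALL) to `code`.
// `base` is the assumed address at which code[0] will execute.
void emit_call_rel32(std::vector<std::uint8_t>& code,
                     std::uint32_t base, std::uint32_t target) {
    const std::size_t start = code.size();
    // rel32 is measured from the address of the instruction after the CALL.
    const std::uint32_t next = base + static_cast<std::uint32_t>(start) + 5u;
    // Unsigned subtraction wraps modulo 2^32, which matches how the CPU
    // adds rel32 to EIP in 32-bit mode, so backward calls work too.
    const std::uint32_t rel = target - next;
    code.push_back(0xE8);
    for (int shift = 0; shift < 32; shift += 8) {
        code.push_back(static_cast<std::uint8_t>((rel >> shift) & 0xFFu));
    }
    assert(code.size() == start + 5);  // one opcode byte plus four operand bytes
}
```

With the question's buffer, base would be (uint32_t) instructionBuffer and target would be (uint32_t) &myFunction. The encoding is position dependent, so the bytes have to be regenerated if the code is copied to another address.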
By the way, these four lines
instructionBuffer[pos++] = (value >> 0) & 0xFF;
instructionBuffer[pos++] = (value >> 8) & 0xFF;
instructionBuffer[pos++] = (value >> 16) & 0xFF;
instructionBuffer[pos++] = (value >> 24) & 0xFF;
can be replaced by a single 32-bit store, which works here because x86 is little-endian and tolerates unaligned writes:
*(DWORD*)(instructionBuffer + pos) = value;
pos += 4;
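The pointer cast relies on the compiler accepting a store through a differently typed, possibly unaligned pointer. A more conservative variant uses memcpy, which compilers typically turn into the same single store. This is a sketch only: emit_u32 and its capacity parameter are hypothetical, and the byte order written is the host's, which on x86 is the little-endian order the instruction needs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `value` into buffer[pos..pos+3] in host byte order and
// returns the new write position.
std::size_t emit_u32(std::uint8_t* buffer, std::size_t capacity,
                     std::size_t pos, std::uint32_t value) {
    assert(pos + sizeof value <= capacity);  // stay inside the buffer
    std::memcpy(buffer + pos, &value, sizeof value);
    return pos + sizeof value;
}
```

In the assembler above this would be used as pos = emit_u32(instructionBuffer, 1024, pos, value); with pos widened to std::size_t or the result cast back to uint32_t.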