I need a function which can calculate the length of an x86-64 instruction.
For example, it would be usable like so:
char ret[] = { 0xc3 };
size_t length = instructionLength(ret);
length would be set to 1 in this example.
I do not want to include an entire disassembly library, since the only information I require is the length of the instruction.
I am looking for a minimalist approach, written in C, and ideally as small as possible.
Complete coverage of the x86-64 instruction set is not strictly necessary (very obscure instructions, such as the vector register set instructions, can be omitted).
A similar answer to what I am looking for (but for the wrong architecture):
Get size of assembly instructions
x86 instructions can be anywhere between 1 and 15 bytes long. The length is defined separately for each instruction, depending on the available modes of operation of the instruction, the number of required operands and more.
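To make the "defined separately for each instruction" point concrete, here is a rough sketch of a hand-rolled length decoder for 64-bit mode. It is only an illustration under stated assumptions: the names subset_insn_length, modrm_len and is_legacy_prefix are made up for this example, only a handful of opcodes are recognised (everything else returns 0), no bounds checking is done on the input buffer, and branch displacements are assumed to be rel32 even with a 0x66 prefix. Anything meant to cover the real instruction set needs full opcode tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes taken by ModRM plus any SIB byte and displacement
   (32/64-bit addressing forms, as used in 64-bit mode). */
static size_t modrm_len(const uint8_t *p)
{
    uint8_t mod = p[0] >> 6;
    uint8_t rm  = p[0] & 7;
    size_t len = 1;                        /* the ModRM byte itself */

    if (mod == 3)
        return len;                        /* register operand, nothing follows */
    if (rm == 4) {
        len++;                             /* SIB byte follows */
        if (mod == 0 && (p[1] & 7) == 5)
            len += 4;                      /* SIB without base: disp32 */
    }
    if (mod == 0 && rm == 5)
        len += 4;                          /* RIP-relative disp32 */
    else if (mod == 1)
        len += 1;                          /* disp8 */
    else if (mod == 2)
        len += 4;                          /* disp32 */
    return len;
}

static int is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:               /* lock, repne, rep */
    case 0x2E: case 0x36: case 0x3E: case 0x26:    /* segment overrides */
    case 0x64: case 0x65:
    case 0x66: case 0x67:                          /* operand/address size */
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper: length of one instruction from a small opcode
   subset, or 0 if the opcode is not covered by this sketch. */
size_t subset_insn_length(const uint8_t *code)
{
    const uint8_t *p = code;
    int opsize16 = 0, rex_w = 0;

    while ((size_t)(p - code) < 14 && is_legacy_prefix(*p)) {
        if (*p == 0x66)
            opsize16 = 1;
        p++;
    }
    if ((*p & 0xF0) == 0x40) {             /* REX prefix (64-bit mode only) */
        rex_w = (*p & 0x08) != 0;
        p++;
    }

    uint8_t op = *p++;
    if (op == 0x0F) {                      /* two-byte opcode map */
        op = *p++;
        if (op >= 0x80 && op <= 0x8F)
            p += 4;                        /* Jcc rel32 */
        else
            return 0;
    } else if (op == 0xC3 || op == 0x90 || (op >= 0x50 && op <= 0x5F)) {
        /* ret, nop, push/pop reg: opcode only */
    } else if (op == 0x89 || op == 0x8B) {
        p += modrm_len(p);                 /* mov r/m,r and mov r,r/m */
    } else if (op >= 0xB8 && op <= 0xBF) {
        p += rex_w ? 8 : (opsize16 ? 2 : 4);   /* mov reg, imm */
    } else if (op == 0xEB || (op >= 0x70 && op <= 0x7F)) {
        p += 1;                            /* jmp/Jcc rel8 */
    } else if (op == 0xE8 || op == 0xE9) {
        p += 4;                            /* call/jmp rel32 */
    } else {
        return 0;                          /* not covered by this sketch */
    }

    size_t len = (size_t)(p - code);
    return len <= 15 ? len : 0;            /* architectural limit is 15 bytes */
}
```

Even this tiny subset needs prefix handling, a ModRM/SIB helper and operand-size rules, which is why a table-driven decoder is usually the safer route once more than a few opcodes matter.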
According to [2], the current x86-64 design “contains 981 unique mnemonics and a total of 3,684 instruction variants”.
The x86 architecture contains eight 32-bit General Purpose Registers (GPRs). These registers are mainly used to perform address calculations, arithmetic and logical calculations. Four of the GPRs can be treated as a 32-bit quantity, a 16-bit quantity or as two 8-bit quantities.
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first announced in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.
Intel provides the XED library for working with x86/x86_64 instructions: https://github.com/intelxed/xed. In my view it is the only correct way to work with Intel machine code.
The xed_decode function gives you all the information about an instruction (decoder reference: https://intelxed.github.io/ref-manual/group__DEC.html):
https://intelxed.github.io/ref-manual/group__DEC.html#ga9a27c2bb97caf98a6024567b261d0652
xed_ild_decode does instruction length decoding only:
https://intelxed.github.io/ref-manual/group__DEC.html#ga4bef6152f61997a47c4e0fe4327a3254
XED_DLL_EXPORT xed_error_enum_t xed_ild_decode ( xed_decoded_inst_t * xedd, const xed_uint8_t * itext, const unsigned int bytes )
This function just does instruction length decoding.
It does not return a fully decoded instruction.
Parameters
- xedd the decoded instruction of type xed_decoded_inst_t . Mode/state sent in via xedd; See the xed_state_t .
- itext the pointer to the array of instruction text bytes
- bytes the length of the itext input array. 1 to 15 bytes, anything more is ignored.
Returns:
xed_error_enum_t indicating success (XED_ERROR_NONE) or failure. Only two failure codes are valid for this function: XED_ERROR_BUFFER_TOO_SHORT and XED_ERROR_GENERAL_ERROR. In general this function cannot tell if the instruction is valid or not. For valid instructions, XED can figure out if enough bytes were provided to decode the instruction. If not enough were provided, XED returns XED_ERROR_BUFFER_TOO_SHORT. From this function, the XED_ERROR_GENERAL_ERROR is an indication that XED could not decode the instruction's length because the instruction was so invalid that even its length may vary across implementations.
To get the length from the xedd filled in by xed_ild_decode, use xed_decoded_inst_get_length: https://intelxed.github.io/ref-manual/group__DEC.html#gad1051f7b86c94d5670f684a6ea79fcdf
static XED_INLINE xed_uint_t xed_decoded_inst_get_length ( const xed_decoded_inst_t * p )
Return the length of the decoded instruction in bytes.
Example code ("Apache License, Version 2.0", by Intel 2016): https://github.com/intelxed/xed/blob/master/examples/xed-ex-ild.c
#include "xed/xed-interface.h"
#include <stdio.h>
int main()
{
    xed_bool_t long_mode = 1;
    xed_decoded_inst_t xedd;
    xed_state_t dstate;
    unsigned char itext[15] = { 0xf2, 0x2e, 0x4f, 0x0F, 0x85, 0x99,
                                0x00, 0x00, 0x00 };
    xed_tables_init(); // one time per process
    if (long_mode)
        dstate.mmode=XED_MACHINE_MODE_LONG_64;
    else
        dstate.mmode=XED_MACHINE_MODE_LEGACY_32;
    xed_decoded_inst_zero_set_mode(&xedd, &dstate);
    xed_ild_decode(&xedd, itext, XED_MAX_INSTRUCTION_BYTES);
    printf("length = %u\n",xed_decoded_inst_get_length(&xedd));
    return 0;
}