In the homework for day one of Xeno Kovah's Introduction to x86 Assembly hosted on OpenSecurityTraining, he assigns,
Instructions we now know(24)
NOP PUSH/POP CALL/RET MOV/LEA ADD/SUB JMP/Jcc CMP/TEST AND/OR/XOR/NOT SHR/SHL IMUL/DIV REP STOS, REP MOV LEAVE
Write a program to find an instruction we havenʼt covered, and report the instruction tomorrow.
He further predicates the assignment on,
SAL/SAR, MUL/IDIV variants of IMUL/DIV also don't count
Rather than objdumping random executables, auditing them, and then creating the source, is it possible to find the list of x86 assembly instructions that GCC currently outputs?
The foundation for this question seems to be that there is a very small subset of instructions actually used that one needs to know to reverse engineer (which is the focus of the course). Xeno seems to be trying to find a fun instructive way to make that point,
I think that knowing about 20-30 (not counting variations) is good enough that you will have to check the manual very infrequently
While I welcome everyone to join me in this awesome class at OpenSecurityTraining, the question is about my proposed method of figuring it out from GCC (if possible), not a request for people to actually do Xeno's assignment. ;)
The foundation for this question seems to be that there is a very small subset of instructions actually used that one needs to know to reverse engineer
Yes, that's generally true. There are some instructions gcc will never emit, like enter (because it's much slower than push rbp / mov rbp, rsp / sub rsp, some_constant on modern CPUs).
Other old / obscure stuff like xlat and loop will also be unused because they aren't faster, and gcc's -Os doesn't go all-out optimizing for size without caring about performance. (clang -Oz is more aggressive, but IDK if anyone's bothered to teach it about the loop instruction.)
And of course gcc will never emit privileged instructions like wrmsr. There are intrinsics (__builtin_... functions) for some unprivileged instructions like rdtsc or cpuid which aren't "normal".
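As a minimal sketch of what using those intrinsics looks like in source: GNU C exposes rdtsc through __builtin_ia32_rdtsc() and cpuid through __get_cpuid() from <cpuid.h>. The helper names read_tsc and max_cpuid_leaf are made up for illustration, and the #if guard is only there so the snippet still compiles on non-x86 targets.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/clang header that provides __get_cpuid() */

/* Hypothetical helper: read the time-stamp counter via the builtin. */
uint64_t read_tsc(void) {
    return __builtin_ia32_rdtsc();
}

/* Hypothetical helper: highest basic CPUID leaf (EAX of leaf 0),
   or 0 if __get_cpuid() reports the query as unsupported. */
unsigned max_cpuid_leaf(void) {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    return eax;
}
#endif
```

Compiling this with gcc -O2 and reading the asm should show a rdtsc (plus a shift/OR to merge edx:eax into one 64-bit value) and a cpuid, neither of which you'd normally get from ordinary C.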
is it possible to find the list of x86 assembly instructions that GCC currently outputs?
This would be the gcc machine-definition files. GCC as a portable compiler has its own text-based language for machine-definition files which describe the instruction-set to the compiler. (What each instruction does, what addressing modes it can use, and some kind of "cost" the optimizer can minimize.)
See the gcc-internals documentation for them.
The other approach to this question would be to look at an x86 instruction reference manual (e.g. this HTML extract, and see other links in the x86 tag wiki) and look for ones you haven't seen yet. Then write a function where gcc would find it useful.
e.g. if you haven't seen movsx (sign extension) yet, then write
long long foo(int x) { return x; }
and gcc -O3 will emit (from the Godbolt compiler explorer)
movsx rax, edi
ret
Or to get cdqe (aka cltq in AT&T syntax) for sign-extension within rax, force gcc to do math before sign extending, so it can produce the result in eax first (with a copy-and-add lea).
long long bar(unsigned x) { return (int)(x+1); }
lea eax, [rdi+1]
cdqe
ret
# clang chooses inc edi / movsxd rax, edi
See also Matt Godbolt's CppCon2017 talk: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”, and How to remove "noise" from GCC/clang assembly output?.
Getting gcc to emit rotate instructions is interesting: see Best practices for circular shift (rotate) operations in C++. You write it as shifts/OR that gcc can recognize as a rotate.
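A minimal sketch of that shift/OR idiom, assuming a 32-bit value (rotl32 is just an illustrative name). Masking both shift counts avoids the undefined behaviour of shifting by the full type width when n is 0 or 32.

```c
#include <stdint.h>

/* Rotate x left by n bits. Both counts are masked into 0..31, so neither
   shift is ever as wide as the type, even for n == 0 or n >= 32. */
uint32_t rotl32(uint32_t x, unsigned n) {
    return (x << (n & 31)) | (x >> (-n & 31));
}
```

Recent gcc and clang typically turn this into a single rol at -O2, but that's a property of the compiler version, so check the asm rather than assuming it.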
Because C doesn't provide standard functions for lots of things modern CPUs can do (rotate, popcnt, count leading / trailing zeros), the only portable thing is to write an equivalent function and have the compiler recognize that pattern. gcc and clang can optimize a whole loop into a single popcnt instruction when compiling with -mpopcnt (enabled by -march=haswell, for example), if you're lucky. If not, you get a stupid slow loop. The reliable non-portable way is to use __builtin_popcount(), which compiles to a popcnt instruction if the target supports it, otherwise a table lookup. _mm_popcnt_u64 is popcnt or nothing: it doesn't compile if the target doesn't support the instruction.
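To make the two options concrete, here's a sketch with one portable loop and one GNU C builtin (the function names are hypothetical). Whether the loop version actually collapses to a single popcnt depends on the compiler, its version, and the target options, so treat that part as something to test rather than rely on.

```c
/* Portable version: each iteration clears the lowest set bit, so the loop
   runs once per 1 bit. A compiler may or may not recognize this pattern
   as a population count. */
unsigned popcount_loop(unsigned x) {
    unsigned count = 0;
    while (x) {
        x &= x - 1;
        ++count;
    }
    return count;
}

/* Non-portable GNU C version: uses the builtin directly. */
unsigned popcount_builtin(unsigned x) {
    return (unsigned)__builtin_popcount(x);
}
```

Comparing the asm for both functions with and without -mpopcnt is a quick way to see which case your compiler lands in.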
Of course the catch-22 with this approach is that it only works if you already know the x86 instruction set and when any given instruction is the right choice for an optimizing compiler!
(And what gcc chooses to do, e.g. inline string compares to rep cmpsb in some cases for short strings, although I'm not sure this is optimal. Only rep movs / rep stos have "fast strings" support on modern CPUs. But I don't think gcc will ever use lods, or any of the "string" instructions without a rep prefix.)
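If you want to see for yourself what gcc does with block fills and copies, a fixed-size memset / memcpy is an easy thing to feed it. This is only a sketch: struct big_buf and the 4096-byte size are arbitrary choices for illustration, and whether you get rep stos / rep movs, vector stores, or a plain call to the libc function depends on the gcc version and tuning options.

```c
#include <string.h>

/* Hypothetical fixed-size buffer; the size is known at compile time, so the
   compiler is free to expand the calls below inline. */
struct big_buf { char bytes[4096]; };

/* Zero the whole buffer. */
void clear_buf(struct big_buf *b) {
    memset(b->bytes, 0, sizeof b->bytes);
}

/* Copy one buffer into another. */
void copy_buf(struct big_buf *dst, const struct big_buf *src) {
    memcpy(dst->bytes, src->bytes, sizeof dst->bytes);
}
```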