I've got to learn assembly and I'm very confused as to what the different registers do/point to.
The register 'ESP', referred to as the 'stack pointer', points to the top of the stack, i.e. the most recently pushed item. EBP, aka the 'frame pointer', serves as a fixed reference point for data on the stack while a function is running. This allows the program to locate anything in the current stack frame by its offset from this point.
A frame pointer (the ebp register on 32-bit x86, rbp on x86-64) contains the base address of the function's frame. The code to access local variables within a function is generated in terms of offsets from the frame pointer.
EBP points to a higher memory address, at the base of the current stack frame; ESP points to the top of the stack, at a lower memory address (the stack grows downward). EIP holds the address of the next instruction to be executed.
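To make the ESP/EBP relationship concrete, here is a minimal sketch in C that models a tiny stack. It is not real assembly: the `Cpu` struct, `MEM_SIZE`, and all function names are hypothetical, chosen only for illustration. It assumes the usual `push ebp` / `mov ebp, esp` / `sub esp, N` prologue and a `leave` epilogue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model: a small block of "memory" plus the two stack registers. */
enum { MEM_SIZE = 64 };

typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint32_t esp;  /* address of the most recently pushed value */
    uint32_t ebp;  /* base of the current stack frame */
} Cpu;

void cpu_init(Cpu *c) {
    memset(c->mem, 0, sizeof c->mem);
    c->esp = MEM_SIZE;  /* empty stack: ESP sits just past the highest address */
    c->ebp = 0;
}

/* push: decrement ESP by 4, then store the value at [ESP]. */
void push32(Cpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &value, 4);
}

/* pop: load the value at [ESP], then increment ESP by 4. */
uint32_t pop32(Cpu *c) {
    uint32_t value;
    assert(c->esp + 4 <= MEM_SIZE);
    memcpy(&value, &c->mem[c->esp], 4);
    c->esp += 4;
    return value;
}

/* Store a 32-bit value at [EBP + offset]; locals use negative offsets. */
void store_ebp(Cpu *c, int32_t offset, uint32_t value) {
    uint32_t addr = (uint32_t)((int32_t)c->ebp + offset);
    assert(addr + 4 <= MEM_SIZE);
    memcpy(&c->mem[addr], &value, 4);
}

/* Load a 32-bit value from [EBP + offset]. */
uint32_t load_ebp(const Cpu *c, int32_t offset) {
    uint32_t value;
    uint32_t addr = (uint32_t)((int32_t)c->ebp + offset);
    assert(addr + 4 <= MEM_SIZE);
    memcpy(&value, &c->mem[addr], 4);
    return value;
}

/* Models a function body with two locals; returns the first local. */
uint32_t demo_frame(Cpu *c) {
    push32(c, c->ebp);                 /* push ebp                 */
    c->ebp = c->esp;                   /* mov  ebp, esp            */
    c->esp -= 8;                       /* sub  esp, 8 (two locals) */
    store_ebp(c, -4, 111);             /* mov  dword [ebp-4], 111  */
    store_ebp(c, -8, 222);             /* mov  dword [ebp-8], 222  */
    push32(c, 333);                    /* ESP moves, EBP does not  */
    uint32_t local = load_ebp(c, -4);  /* same offset still works  */
    c->esp = c->ebp;                   /* leave: mov esp, ebp      */
    c->ebp = pop32(c);                 /*        pop ebp           */
    return local;
}
```

The point of the sketch is that the extra push inside `demo_frame` changes ESP but not EBP, so the local at `[ebp-4]` stays at the same offset for the whole function. Compilers often omit the frame pointer and address locals relative to ESP instead, which this model does not cover.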
ESI is the Extended Source Index register. "String" instructions like MOVS use ESI and EDI; here "string" means a block of memory, which is different from a NUL-terminated C string.
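As a rough illustration of how a string instruction uses these registers, the following C sketch models `movsb` and its `rep` prefix. The `StringRegs` struct and function names are hypothetical, and the model assumes the direction flag is clear, so both pointers move upward.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three registers that rep movsb uses implicitly. */
typedef struct {
    const uint8_t *esi;  /* source index: where bytes are read from     */
    uint8_t       *edi;  /* destination index: where bytes are written  */
    uint32_t       ecx;  /* count of bytes still to copy                */
} StringRegs;

/* One movsb (direction flag clear): copy a byte, then advance ESI and EDI. */
void movsb(StringRegs *r) {
    assert(r->esi != NULL && r->edi != NULL);
    *r->edi = *r->esi;
    r->esi++;
    r->edi++;
}

/* rep movsb: repeat movsb until ECX reaches zero, decrementing ECX each time. */
void rep_movsb(StringRegs *r) {
    while (r->ecx != 0) {
        movsb(r);
        r->ecx--;
    }
}
```

Note that the length comes from ECX, not from a terminator byte, which is why these "strings" are not C strings.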
On some architectures, like MIPS, all registers are created equal, and there is really no difference beyond the name of the register (and software conventions). On x86 you can mostly use any registers for general-purpose computing, but some registers are implicitly bound to the instruction set.
Lots of information about special purposes for registers can be found here.
Examples:
- eax, accumulator: many arithmetic instructions implicitly operate on eax. There are also special shorter EAX-specific encodings for many instructions: add eax, 123456 is 1 byte shorter than add ecx, 123456, for example (add eax, imm32 vs. add r/m32, imm32).
- ebx, base: few implicit uses, but xlat is one that matches the "Base" naming. Still relevant: cmpxchg8b. Because it's rarely required for anything specific, some 32-bit calling conventions / ABIs use it as a pointer to the "global offset table" in Position Independent Code (PIC).
- edx, data: some arithmetic operations implicitly operate on the 64-bit value in edx:eax.
- ecx, counter: used for shift counts, and for rep movs. Also, the mostly-obsolete loop instruction implicitly decrements ecx.
- esi, source index: some string operations read a string from the memory pointed to by esi.
- edi, destination index: some string operations write a string to the memory pointed to by edi, e.g. rep movsb copies ECX bytes from [esi] to [edi].
- ebp, base pointer: normally used to point to local variables. Used implicitly by leave.
- esp, stack pointer: points to the top of the stack, used implicitly by push, pop, call and ret.
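The edx:eax pairing mentioned in the list can be sketched in C as follows. This is a model of the unsigned 32-bit `mul` and `div` instructions, not real assembly; the `Regs` struct and the function names are hypothetical. Real hardware raises a divide error on division by zero or when the quotient does not fit in eax; the model uses assertions in place of that.

```c
#include <assert.h>
#include <stdint.h>

/* Only the two registers that mul/div use implicitly. */
typedef struct {
    uint32_t eax;
    uint32_t edx;
} Regs;

/* mul r/m32: EDX:EAX = EAX * operand (unsigned 64-bit product). */
void mul32(Regs *r, uint32_t operand) {
    uint64_t product = (uint64_t)r->eax * operand;
    r->eax = (uint32_t)product;          /* low 32 bits  */
    r->edx = (uint32_t)(product >> 32);  /* high 32 bits */
}

/* div r/m32: EAX = EDX:EAX / operand, EDX = EDX:EAX % operand. */
void div32(Regs *r, uint32_t operand) {
    uint64_t dividend = ((uint64_t)r->edx << 32) | r->eax;
    assert(operand != 0);             /* hardware would fault here */
    uint64_t quotient = dividend / operand;
    assert(quotient <= UINT32_MAX);   /* hardware would fault here too */
    r->edx = (uint32_t)(dividend % operand);
    r->eax = (uint32_t)quotient;
}
```

For example, starting with eax = 0x80000000 and calling `mul32` with 4 leaves edx = 2 and eax = 0, because the product 0x200000000 no longer fits in 32 bits and the high half spills into edx.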
The x86 instruction set is a complex beast, really. Many instructions have shorter forms that implicitly use one register or another. Some registers can be used to do certain addressing while others cannot.
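One way to see a "shorter form" is to write out the machine-code bytes. The C sketch below hand-encodes the two 32-bit forms of `add` with a 32-bit immediate: the EAX-specific opcode `05 id` and the generic `81 /0 id` with a ModRM byte. The function names are hypothetical, and the sketch only covers register destinations in 32-bit mode with no prefixes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append a 32-bit immediate in little-endian byte order; returns 4. */
size_t put_imm32(uint8_t *out, uint32_t imm) {
    out[0] = (uint8_t)(imm);
    out[1] = (uint8_t)(imm >> 8);
    out[2] = (uint8_t)(imm >> 16);
    out[3] = (uint8_t)(imm >> 24);
    return 4;
}

/* add eax, imm32 : opcode 05 followed by the immediate (5 bytes). */
size_t encode_add_eax_imm32(uint8_t out[5], uint32_t imm) {
    out[0] = 0x05;
    return 1 + put_imm32(out + 1, imm);
}

/* add r32, imm32 : opcode 81, ModRM with /0, then the immediate (6 bytes).
 * Register numbers: 0=eax 1=ecx 2=edx 3=ebx 4=esp 5=ebp 6=esi 7=edi. */
size_t encode_add_r32_imm32(uint8_t out[6], unsigned reg, uint32_t imm) {
    assert(reg < 8);
    out[0] = 0x81;
    out[1] = (uint8_t)(0xC0 | reg);  /* mod=11 (register), reg field=0, rm=reg */
    return 2 + put_imm32(out + 2, imm);
}
```

With the immediate 123456 (0x1E240), the first function produces 05 40 E2 01 00 and the second, for ecx, produces 81 C1 40 E2 01 00, which is the one-byte difference mentioned for eax above. Immediates that fit in a signed byte have a still shorter `83 /0 ib` form, which this sketch leaves out.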
The Intel 80386 Programmer's Reference Manual is an irreplaceable resource. It tells you basically everything there is to know about x86 assembly, except for newer extensions and performance on modern hardware.
The PC Assembly (e)book is a great resource for learning assembly.