
Understanding how EIP (RIP) register works?

I'm a complete novice to computer architecture and the low-level stuff that happens at the processor/memory level. I'll start by saying that. What I've done with computers has pretty much always been high-level programming: C++, Java, etc.

That being said, I'm currently reading a book that is starting to delve into low-level programming: assembly, registers, pointers, etc. I'm having a hard time understanding how the EIP register works.

From what is said in the book, each memory address has one byte, and each byte has a memory address.

From what I'm reading about the EIP register, it points to the next instruction for the processor to execute. While using a debugging tool (GDB) to follow along in the book, you can examine memory at a particular location with a command such as:

x/8xb

This allegedly lets you examine the first 8 bytes at that memory address. But if each memory address holds only 1 byte, I don't understand how one address can give me 8 bytes. Can someone help me understand this? I have looked for a thorough explanation of how this register works, but I can't really find anything.

Asked by Aldmeri, Dec 11 '14 at 17:12




1 Answer

Let's begin with a concrete, x86-specific example.

00000000000020b0 <foo>:
    20b0: 89 d1                         movl    %edx, %ecx
    20b2: 89 f8                         movl    %edi, %eax
    20b4: 0f af c6                      imull   %esi, %eax
    20b7: 31 d2                         xorl    %edx, %edx
    20b9: f7 f1                         divl    %ecx
    20bb: c3                            ret

For the sake of simplicity in this example, think of memory as a gigantic "array" of bytes, and think of memory addresses as indexes into that "array" [0]. When something lies at a given memory address, that essentially means that its first byte lies at that address, its second byte (if it is larger than one byte) lies at the next address, and so on. For example, foo starts at address 0x20B0 [1] and spans 12 bytes. This means that each address from 0x20B0 to 0x20BB (inclusive) points to a byte in the function.
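To make the "array of bytes" picture concrete, here is a minimal Python sketch. It is only an illustration, not how GDB is implemented: `FOO_BYTES` holds the 12 bytes copied from the listing above, and `examine` is a hypothetical helper that mimics what a command like `x/8xb` reports, namely one byte from each of 8 consecutive addresses.

```python
# The 12 machine-code bytes of foo, copied from the listing above.
# FOO_BASE is the example address, not a real runtime address.
FOO_BASE = 0x20B0
FOO_BYTES = bytes.fromhex("89 d1 89 f8 0f af c6 31 d2 f7 f1 c3")


def examine(memory, base, address, count):
    """Return (address, byte) pairs for `count` consecutive addresses.

    Hypothetical helper: each address holds exactly one byte, so reading
    8 bytes "at" an address means reading that address and the 7 after it.
    """
    start = address - base  # turn the address into an index into the "array"
    return [(address + i, memory[start + i]) for i in range(count)]


# Roughly what examining 8 bytes starting at foo would show.
for addr, byte in examine(FOO_BYTES, FOO_BASE, 0x20B0, 8):
    print(f"{addr:#06x}: {byte:#04x}")
```

So the "memory address" given to such a command is only a starting point; the 8 bytes come from 8 different addresses, 0x20B0 through 0x20B7.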

The program counter in x86, called RIP (EIP in 32-bit mode), points to the next instruction [2]. For example, if the instruction currently being executed is the one at 0x20B2, which is 2 bytes long, RIP contains the value 0x20B4. Because of the CISC nature of x86, instructions vary in length, so RIP does not necessarily increase by a fixed amount each time.

00000000000020b0 <foo>:
    20b0: 89 d1                         movl    %edx, %ecx
EX->20b2: 89 f8                         movl    %edi, %eax
PC->20b4: 0f af c6                      imull   %esi, %eax
    20b7: 31 d2                         xorl    %edx, %edx
    20b9: f7 f1                         divl    %ecx
    20bb: c3                            ret

In the next "iteration", EX (not a real register, just a way to mark what's being executed) will point to the imull instruction, and PC (RIP) will point to the xorl instruction, and so on until the ret instruction, at which point the return address, which is stored on the stack, will be loaded into RIP so that execution may continue at the caller of foo.
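The stepping described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the instruction lengths are read off the listing rather than decoded from the bytes (a real CPU works out each length while decoding), the code is straight-line with no jumps, and `trace` is a hypothetical name used only for this illustration.

```python
FOO_BASE = 0x20B0

# (length in bytes, text) for each instruction, taken from the listing.
INSTRUCTIONS = [
    (2, "movl %edx, %ecx"),
    (2, "movl %edi, %eax"),
    (3, "imull %esi, %eax"),
    (2, "xorl %edx, %edx"),
    (2, "divl %ecx"),
    (1, "ret"),
]


def trace(base, instructions):
    """Return (executing_address, rip, text) for each instruction in order.

    While an instruction executes, RIP already holds the address of the
    instruction that follows it: its own address plus its own length.
    """
    steps = []
    address = base
    for length, text in instructions:
        rip = address + length
        steps.append((address, rip, text))
        address = rip  # the next instruction starts where RIP points
    return steps


for executing, rip, text in trace(FOO_BASE, INSTRUCTIONS):
    print(f"EX={executing:#06x}  RIP={rip:#06x}  {text}")
```

The last row is the exception the text mentions: while `ret` executes, RIP briefly points just past the function, and then `ret` overwrites it with the return address taken from the stack.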


[0] As mentioned by Peter Cordes, there are some architectures out there where this does not apply. For the sake of the question, this answer is specific to x86.
[1] That's not actually the address at which this function would be found at runtime, but pretend it is for the sake of example.
[2] There are some architectures where the program counter points to the current instruction (AArch64 does this), or even to two instructions ahead (AArch32 does this).

Answered by Mona the Monad, Sep 19 '22 at 20:09