Alright so I have this line in my assembly
MOV EAX, DWORD PTR DS:[ESI]
where ESI is 00402050
(ascii, "123456789012")
After this instruction: EAX = 34333231
What really happened here? How is this value calculated, and why?
Where could I get some good reference on this kind of thing?
A feature of assembly language is that each line in the source code usually contains a single instruction to the processor, for example MOV EAX,EDX will move the content of the EDX register into the EAX register. Here the "MOV" instruction is called a "mnemonic".
Basically, DWORD PTR means "the size of the target operand is 32 bits". For example, an instruction such as AND DWORD PTR [EBP-4], 0 will bitwise-AND the 32-bit value at the address computed by taking the contents of the EBP register and subtracting four, with 0.
DWORD specifies the size of the memory operand used by the move. In your example, you would be moving 0000000Ah (4 bytes) into the memory location ESP+18h. Because 0Ah is an immediate value, the assembler cannot infer the operand size on its own; it has to be given with DWORD, WORD, BYTE, or a similar qualifier.
The size directives BYTE PTR, WORD PTR, and DWORD PTR serve this purpose, indicating sizes of 1, 2, and 4 bytes respectively.
Registers in square brackets such as [ESI] are dereferenced pointers. The instruction you quote moves the DWORD (a 32-bit/4-byte value) in the memory location specified by ESI into register EAX. In your case, memory location 00402050, read as a DWORD, contains 34333231.
Written in pseudo-C:
    DWORD EAX;   /* Declaring the registers as we find them in silico */
    DWORD ESI;

    ESI = 0x00402050;        /* Set up your initial conditions for ESI */
    EAX = *((DWORD *)ESI);   /* mov EAX, DWORD PTR [ESI] */
    /*  ^ ^  ^^^^^^^                                                      */
    /*  | |     |                                                         */
    /*  | |     +----------- From "DWORD PTR" we get "DWORD *" in C.      */
    /*  | |                                                               */
    /*  | +----------------- The C dereferencing operator * replaces [].  */
    /*  |                                                                 */
    /*  +------------------- The C assignment operator = replaces mov opcode. */
In your case, it is not true that 0x00402050 "equals" the string "123456789012"; rather, it points to the memory which contains that string.
The value which you obtain, 0x34333231, is composed of the ASCII values for the digits "1234", which are the first four bytes (i.e., the first DWORD) of the string. In memory those bytes are 31 32 33 34, in that order. They appear in reversed order in EAX because the Intel architecture is "little endian" in the byte representation of a DWORD in memory: the byte at the lowest address (31) becomes the lowest-order byte of the value.
In your example, the mov instruction is loading ASCII characters as if they were the four bytes of an unsigned long value, when they are actually a string of single-byte characters.