I am looking at an Intel-x86 program trace and came across this instruction
REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:
I know that
REP MOVS
causes the MOV instruction to be run a number of times specified by the value in the ECX register, which is 0x2b in my case.
I know that
BYTE PTR
is determining the size of information, in this case just a byte.
I know that
ES:[EDI]
says to move whatever is at BYTE PTR DS: to the address pointed to by EDI.
What I do not know is what the part after the comma does.
BYTE PTR DS:
Questions:
What does the PTR keyword do? Why not just
REP MOVS BYTE ES:[EDI], BYTE DS:
What do ES and DS correspond to?
Thanks
REP MOVS DWORD PTR ES:[EDI], DWORD PTR [ESI]
is a synonym for REP MOVSD
and
REP MOVS BYTE PTR ES:[EDI], BYTE PTR[ESI]
is a synonym of
REP MOVSB
You can write it this way in an attempt to "improve" the readability of the code. The idea might have been the following: "if somebody forgot that MOVSB moves from ESI to EDI, this longer syntax will help to make things clearer". This in no way affects the assembled binary. The difference is only in the textual source code.
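As you know, there are the following MOVS commands, based on data sizes: MOVSB (byte), MOVSW (word), MOVSD (doubleword), and MOVSQ (quadword, 64-bit mode only).
The MOVS command copies data from DS:SI (ESI/RSI) to ES:DI (EDI/RDI); the size of the SI/DI register depends on your current mode: 16-bit, 32-bit or 64-bit. It also increases or decreases the SI and DI registers by the element size, depending on the D (direction) flag: the CLD instruction clears the flag so that the registers increase, and STD sets it so that they decrease.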
The MOVS command cannot use other registers than DS:SI/ES:DI, so it is unnecessary to specify them. In my opinion, it is even redundant to set them, and the readability doesn't improve but worsens.
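To make the behaviour concrete, here is a small Python model of what REP MOVS does. It is only a sketch: memory is a flat bytearray, segments are ignored, and the function name rep_movs and its parameters are made up for this illustration.

```python
def rep_movs(memory, esi, edi, ecx, size=1, direction_flag=False):
    """Model REP MOVS on a flat bytearray; returns the final (esi, edi, ecx)."""
    # DF = 0 (after CLD) walks upward, DF = 1 (after STD) walks downward.
    step = -size if direction_flag else size
    while ecx != 0:
        # One MOVS: copy `size` bytes from [ESI] to [EDI].
        memory[edi:edi + size] = memory[esi:esi + size]
        esi += step
        edi += step
        ecx -= 1  # REP decrements the counter after every element
    return esi, edi, ecx


mem = bytearray(b"hello world" + bytes(11))
print(rep_movs(mem, esi=0, edi=11, ecx=11))  # final register values
print(mem[11:22])                            # the copied bytes
```

With ECX = 0x2b and size=1, as in the question, the loop body runs 43 times and both pointers end up 43 bytes further along.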
DS and ES are "segment" registers. As I wrote before, MOVS only operates with SI/DI as index registers and DS/ES as segment registers. You cannot change the index registers that MOVS works with; the only thing that can be changed is the source segment, by means of a segment-override prefix.
But you should not worry about the segment registers because they are usually already set up correctly, and you should not modify or be concerned about them if your program runs under a standard OS like Linux, Windows, etc. These segment registers may be needed only in the following cases:
In 16-bit mode, on Intel CPUs from 8086 to 80286, there were the following segment registers: CS DS ES SS.
In the real mode, the 16-bit segment register is interpreted as the most significant 16 bits of a linear 20-bit address (so the CPU did essentially multiply the value of the segment register by 16 to get the base address of the segment). For example, if you move 1 to DS, and you move 2 to SI, the "byte ptr DS:[SI]" will mean 1*16+2 = 18 (18th byte from the start of the memory space).
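The real-mode calculation above can be sketched in a couple of lines of Python. The function name is invented for this example, and the 20-bit wrap-around is the behaviour of the original 8086 (later CPUs with the A20 line enabled do not wrap).

```python
def real_mode_linear_address(segment, offset):
    """Real-mode address: segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# The example from the text: DS = 1, SI = 2
print(real_mode_linear_address(1, 2))
```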
In protected mode (80286 and later), the segment registers no longer hold a plain segment number. Instead, each one holds a selector: an index into a table of segment descriptors, and each descriptor contains a 24-bit base address.
In the Intel 80386 and later, 32-bit protected mode retains the segmentation mechanism of 80286 protected mode, but a paging unit has been added as a second layer of address translation between the segmentation unit and the physical bus. The segment base in each segment descriptor is now 32-bit (instead of 24-bit), and two new segment registers were added: FS and GS.
The 64-bit architecture effectively does not use segmentation. For four of the segment registers (CS, SS, DS, and ES) the base is forced to 0 and the limit to 2^64. The segment registers FS and GS can still have a nonzero base address. This allows operating systems to use these segments for particular purposes.
A notable fact is that the 386 and later Intel x86 CPUs still use 16-bit size segment registers because they merely hold an index of the segment descriptor table.
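For illustration, this is how a 16-bit selector value breaks down in protected mode: the upper 13 bits are the descriptor index, bit 2 selects the GDT or LDT, and the low two bits are the requested privilege level. The helper name decode_selector is hypothetical, and 0x23 is just an arbitrary sample value.

```python
def decode_selector(selector):
    """Split a 16-bit protected-mode segment selector into its fields."""
    return {
        "index": selector >> 3,                         # descriptor table index
        "table": "LDT" if selector & 0b100 else "GDT",  # table indicator bit
        "rpl": selector & 0b11,                         # requested privilege level
    }


print(decode_selector(0x23))
```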
Since, as I wrote before, in a standard operating system, be it 32-bit or 64-bit, the DS and ES segment registers are already pre-configured and point to the same memory, you can just ignore them.
You can find more information in Chapter 7.3.9.1 "String Instructions" of the Intel® 64 and IA-32 Architectures Software Developer's Manual (Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4). Quote:
These instructions operate on individual elements in a string, which can be a byte, word, or doubleword. The string elements to be operated on are identified with the ESI (source string element) and EDI (destination string element) registers. Both of these registers contain absolute addresses (offsets into a segment) that point to a string element. By default, the ESI register addresses the segment identified with the DS segment register. A segment-override prefix allows the ESI register to be associated with the CS, SS, ES, FS, or GS segment register. The EDI register addresses the segment identified with the ES segment register; no segment override is allowed for the EDI register. The use of two different segment registers in the string instructions permits operations to be performed on strings located in different segments. Or by associating the ESI register with the ES segment register, both the source and destination strings can be located in the same segment. (This latter condition can also be achieved by loading the DS and ES segment registers with the same segment selector and allowing the ESI register to default to the DS register.) The MOVS instruction moves the string element addressed by the ESI register to the location addressed by the EDI register. The assembler recognizes three "short forms" of this instruction, which specify the size of the string to be moved: MOVSB (move byte string), MOVSW (move word string), and MOVSD (move doubleword string).
After the first Pentium CPU was produced in 1993, Intel began to make simple commands faster and complex commands (like REP MOVS) slower.
So, REP MOVS became very slow, and there was no more practical reason to use it.
In 2013, Intel decided to revisit REP MOVS. If the CPU (produced after 2013) has CPUID ERMSB (Enhanced REP MOVSB) bit, then REP MOVSB and REP STOSB commands are executed differently than on older processors, and are supposed to be fast. In practice, it is only fast for large blocks, 256 bytes and larger, and only when certain conditions are met:
See the Intel Manual on Optimization, section 3.7.6 Enhanced REP MOVSB and STOSB operation (ERMSB) http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
They are very slow on small blocks because of very high startup cost – about 35 cycles.
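If you want to check whether a CPU advertises this feature, the bit is reported in CPUID leaf 7 (subleaf 0), EBX bit 9, and Linux lists it as the "erms" flag in /proc/cpuinfo. The sketch below assumes you already have either the raw EBX value or the cpuinfo text; both function names are made up for this example.

```python
def ebx_has_erms(ebx):
    """Test bit 9 of EBX as returned by CPUID with EAX=7, ECX=0."""
    return bool((ebx >> 9) & 1)


def cpuinfo_has_erms(cpuinfo_text):
    """Look for the 'erms' flag in text formatted like Linux /proc/cpuinfo."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return "erms" in line.split(":", 1)[1].split()
    return False


sample = "processor\t: 0\nflags\t\t: fpu sse2 erms avx2\n"
print(cpuinfo_has_erms(sample), ebx_has_erms(1 << 9))
```

On a Linux machine you could pass open("/proc/cpuinfo").read() instead of the sample string; other operating systems need a different way of querying CPUID.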
It seems the instruction doesn't end there. I came across this instruction in OllyDbg today, and I could resize the instruction column to reveal the rest of it.
00499B3A |. F3:A4 |REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
Now we know what the full instruction is, but I still didn't know what it does. So I pulled out the Intel instruction set reference manual and searched for the opcode F3:A4.
In the manual, it describes this opcode as follows:
Move (E)CX bytes from DS:[(E)SI] to ES:[(E)DI].
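The two bytes break down as F3 (the REP prefix) followed by A4 (MOVS with byte operands). A toy decoder for just this family of encodings might look like the following; it assumes 32-bit code, where A5 means MOVSD and a 66 prefix turns it into MOVSW, and the function name is invented for this sketch.

```python
def decode_movs(code):
    """Decode optional F3/66 prefixes followed by A4 or A5 (32-bit code assumed)."""
    rep = False
    operand16 = False
    pos = 0
    while pos < len(code) and code[pos] in (0xF3, 0x66):
        if code[pos] == 0xF3:
            rep = True         # REP prefix
        else:
            operand16 = True   # operand-size override prefix
        pos += 1
    if pos >= len(code) or code[pos] not in (0xA4, 0xA5):
        raise ValueError("not a MOVS encoding handled by this sketch")
    if code[pos] == 0xA4:
        name = "MOVSB"
    else:
        name = "MOVSW" if operand16 else "MOVSD"
    return ("REP " if rep else "") + name


print(decode_movs(bytes([0xF3, 0xA4])))
```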