While looking through the Gameboy's instruction set, I came across instructions such as:
LD A, A
LD B, B
LD C, C
LD D, D
...
Each of these instructions has its own opcode in this table, which makes me think they are of some importance, given how limited the number of possible opcodes is.
I first thought that it might be dereferencing a pointer in that register and storing the value at that pointer (like in this question), but in an emulator, LD A, A
is implemented as:
Z80._r.a = Z80._r.a
They seem to have no effect on the state of the processor (just set registers to their own value) and take the same number of cycles as a NOP
to execute.
Why are these opcodes included in the instruction set and what purpose do they serve?
They simplify the decoding unit. Compare these opcodes:
7F LD A,A
78 LD A,B
79 LD A,C
vs
47 LD B,A
40 LD B,B
41 LD B,C
vs
4F LD C,A
48 LD C,B
49 LD C,C
Notice that the bottom 3 bits select the source register (values 0-7 map to B, C, D, E, H, L, (HL), A), the next 3 bits select the target register using the same 0-7 mapping (so target 0 with source 0 gives LD B,B), and the top two bits, 01, select LD. I only took a quick glance, so I may not have deciphered it perfectly.
One would then also expect 76 to be LD (HL),(HL), which makes even less sense than LD A,A, so there is special logic to catch that one and do HALT instead.
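The bit layout described above can be sketched as a tiny decoder. This is only an illustration of the idea, not how any real emulator or the hardware does it; the names `REGS` and `decode` are made up for this example, and it only covers the 0x40-0x7F block.

```python
# Register order implied by the 3-bit fields (values 0-7), as listed above.
REGS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]


def decode(opcode):
    """Decode one byte from the 0x40-0x7F block into a mnemonic string."""
    if opcode >> 6 != 0b01:
        raise ValueError("top two bits are not 01, so this is not in the LD block")
    if opcode == 0x76:
        # The slot that would be LD (HL),(HL) is the one special case.
        return "HALT"
    dst = (opcode >> 3) & 0b111  # middle 3 bits: target register
    src = opcode & 0b111         # bottom 3 bits: source register
    return f"LD {REGS[dst]},{REGS[src]}"


for op in (0x7F, 0x78, 0x47, 0x40, 0x49, 0x76):
    print(f"{op:02X} {decode(op)}")
```

Note that nothing in `decode` knows about the same,same case: LD B,B and friends simply fall out of the two 3-bit fields, and only 76 needs an explicit check.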
So it's about keeping the instruction decoder simple: the same bit patterns select the source and target registers, and no extra transistors are spent on catching the same,same cases. The exception is (HL),(HL), which may fail internally anyway because both source and target would require a memory access, so the extra "logic" may be fairly simple in the HW design.
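To see why catching the same,same cases would cost extra rather than save anything, here is a minimal sketch of one generic handler for the whole 0x40-0x7F block. It is an assumption-laden toy model, not the real hardware or any particular emulator: `execute_ld`, the `regs` dict and the flat `memory` bytearray are all hypothetical names chosen for the example.

```python
REGS = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]


def execute_ld(opcode, regs, memory):
    """Run one opcode from the 0x40-0x7F block. Returns False on HALT."""
    if opcode == 0x76:
        return False  # the only special case in the block
    dst = (opcode >> 3) & 7
    src = opcode & 7
    addr = (regs["H"] << 8) | regs["L"]  # address used by the (HL) operand
    # Read the source: either memory at HL or a plain register.
    value = memory[addr] if REGS[src] == "(HL)" else regs[REGS[src]]
    # Write the target the same way.
    if REGS[dst] == "(HL)":
        memory[addr] = value
    else:
        regs[REGS[dst]] = value
    return True


regs = {"A": 0x12, "B": 0x34, "C": 0, "D": 0, "E": 0, "H": 0xC0, "L": 0x00}
memory = bytearray(0x10000)
before = dict(regs)

execute_ld(0x40, regs, memory)  # LD B,B
print(regs == before)           # state is unchanged, with no special-case code

execute_ld(0x78, regs, memory)  # LD A,B
print(hex(regs["A"]))
```

The handler never tests whether `dst == src`. Making LD B,B invalid would mean adding a comparison, which is exactly the kind of extra logic the original designers avoided.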
Keep in mind that early CPUs were often designed by hand, and the total number of transistors had to be kept low, both to fit on the chip and to keep the circuitry manageable to draw and verify by hand.
EDIT: The Z80 has about 8500 transistors; see https://en.wikipedia.org/wiki/Transistor_count and https://en.wikipedia.org/wiki/Zilog_Z80. The GameBoy has a slightly modified Z80, so its total transistor count should be fairly close to the original's, although I didn't search for the exact value. I'm also not sure how far Nintendo extended it; maybe they could already afford something like 20-50k, but I doubt it.
Addendum: I recently read about the Russian Sinclair ZX Spectrum clones, which were heavily modified machines with extra power, memory and capabilities. Some of them use these ld same,same opcodes to control DMA transfers, so code that uses them as nop would probably fail to execute properly on those machines. This is not GameBoy related, but if you have a binary targeting one of the "Sprinter" or similar Russian ZX clones and you find one of these in the disassembly, don't automatically treat it as a nop; it may be part of effective code actually doing something (most probably with DMA).
These curious NOP instructions go all the way back to the original ancestor processor, the Intel 8008. In that chip, they were merely a result of how the register move instruction was implemented. Allowing MOV A,A etc. simplified the instruction decoder and saved silicon space.
From the 8080 through to the Z80 (and beyond), they had to be kept to maintain backwards compatibility. They even survived into the x86 world in the form of MOV AL,AL etc., so most modern desktop machines still support these odd instructions.
Note: I used Intel mnemonics when describing Intel machines. Be assured that these assemble down to the same binary code as the Zilog mnemonics.
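As a quick check of that note, the sketch below encodes a register move from either naming scheme and shows that both give the same byte. It assumes the 01/target/source bit layout for the 8080 and Z80 register-move block, with the 8080 calling the memory operand M where Zilog writes (HL); the `encode` helper and the two name tables are hypothetical and only exist for this example.

```python
# Same register order in both schemes; only the name of the memory operand differs.
ZILOG = ["B", "C", "D", "E", "H", "L", "(HL)", "A"]
INTEL = ["B", "C", "D", "E", "H", "L", "M", "A"]


def encode(dst, src, names):
    """Build the opcode byte: top bits 01, then 3 bits target, then 3 bits source."""
    return 0x40 | (names.index(dst) << 3) | names.index(src)


# Intel "MOV A,A" and Zilog "LD A,A" are the same byte.
print(f"{encode('A', 'A', INTEL):02X} {encode('A', 'A', ZILOG):02X}")

# Intel "MOV B,M" and Zilog "LD B,(HL)" likewise.
print(f"{encode('B', 'M', INTEL):02X} {encode('B', '(HL)', ZILOG):02X}")
```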