
Can an instruction be in two addressing modes at the same time?

I have read the following in the book Programming from the Ground Up:

Processors have a number of different ways of accessing data, known as addressing modes. The simplest mode is immediate mode, in which the data to access is embedded in the instruction itself. For example, if we want to initialize a register to 0, instead of giving the computer an address to read the 0 from, we would specify immediate mode, and give it the number 0.

In the register addressing mode, the instruction contains a register to access, rather than a memory location. The rest of the modes will deal with addresses.

Does that mean that for example the instruction mov eax, 123 is in both immediate mode and register addressing mode?

user8240761 asked Jul 01 '17 13:07



2 Answers

It's not the whole instruction that has a certain addressing mode, it's each operand separately. In your mov eax, 123 example, you'd say that the source is an immediate operand, and the destination is a register operand.

Or you could say that the machine code for that instruction will use the mov r, imm32 encoding of mov, if you want to talk about the form the whole instruction takes. (There's also a mov r/m, imm32 form of mov, but it's longer so a good assembler will only pick that if the destination actually is memory).
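To make the size difference concrete, here is a minimal Python sketch that builds both encodings of mov eax, 123 by hand. The helper names are made up for illustration; the opcode values (B8+rd for mov r32, imm32 and C7 /0 for mov r/m32, imm32) come from the Intel manual's entry for mov.

```python
import struct

# x86 register numbers as used in instruction encodings.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

def mov_r32_imm32_short(reg: str, imm: int) -> bytes:
    """mov r32, imm32: opcode B8+rd, register number folded into the opcode."""
    return bytes([0xB8 + REG32[reg]]) + struct.pack("<I", imm)

def mov_rm32_imm32_long(reg: str, imm: int) -> bytes:
    """mov r/m32, imm32: opcode C7 /0, followed by a ModRM byte."""
    modrm = (0b11 << 6) | (0 << 3) | REG32[reg]  # mod=11 (register), /0, rm=reg
    return bytes([0xC7, modrm]) + struct.pack("<I", imm)

short = mov_r32_imm32_short("eax", 123)
long_ = mov_rm32_imm32_long("eax", 123)
print(short.hex(" "))  # b8 7b 00 00 00
print(long_.hex(" "))  # c7 c0 7b 00 00 00
```

Both byte sequences do the same thing; the second one spends an extra byte on ModRM, which is why an assembler normally prefers the first for a register destination.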

When one operand is a register and the other is in memory, you could for convenience and brevity say "the instruction uses a [base+index] addressing mode" if you want. But really it's the memory operand that you're describing, not the whole instruction, especially if you count register and immediate as "addressing modes" even though no memory address is involved.


Moreover, usually when people say "addressing mode", they're talking about a memory address. In x86, most instructions have one register operand and one register/memory operand, so in 32-bit code the difference between add eax, ecx and add eax, [ecx] is only the 2-bit mod field in the ModRM byte (which follows the opcode).
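As a rough illustration, the following Python sketch assembles the ModRM byte for both forms by hand, assuming 32-bit code and the add r32, r/m32 opcode (03 /r). The modrm helper is a made-up name for this example. XOR-ing the two ModRM bytes shows which bits differ: only the mod field in the top two bits.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields: mod (2 bits), reg (3 bits), rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

EAX, ECX = 0, 1
ADD_R32_RM32 = 0x03  # add r32, r/m32

# mod=11 means "rm is a register"; mod=00 means "rm is [register], no displacement".
add_eax_ecx = bytes([ADD_R32_RM32, modrm(0b11, EAX, ECX)])      # add eax, ecx
add_eax_mem_ecx = bytes([ADD_R32_RM32, modrm(0b00, EAX, ECX)])  # add eax, [ecx]

print(add_eax_ecx.hex(" "))      # 03 c1
print(add_eax_mem_ecx.hex(" "))  # 03 01

diff = add_eax_ecx[1] ^ add_eax_mem_ecx[1]
print(f"{diff:08b}")  # 11000000
```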

Some instructions have two memory operands. For example, push qword [rdi + rax*8] explicitly loads from [rdi + rax*8] and implicitly stores to the stack at [rsp]. Other examples are the string instructions movs and cmps, which use [rdi] and [rsi] implicitly.

But no instruction has two general r/m operands that let you use an arbitrary choice of the normal addressing modes. So an x86 instruction has at most one mod/rm byte.
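The push example can be sketched the same way. This is a hand-built encoding in Python, assuming 64-bit mode and the push r/m64 opcode (FF /6); the helper names are hypothetical. Only the explicit [rdi + rax*8] operand appears in the bytes, as one ModRM byte plus one SIB byte. The implicit store to the stack is not encoded anywhere.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields: mod (2 bits), reg (3 bits), rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

def sib(scale_log2: int, index: int, base: int) -> int:
    """Pack the SIB byte: scale (2 bits, log2 of the factor), index, base."""
    return (scale_log2 << 6) | (index << 3) | base

RAX, RDI = 0, 7
RM_SIB_FOLLOWS = 0b100  # rm=100 with mod != 11 means "a SIB byte follows"

# push qword [rdi + rax*8]: opcode FF, /6 in the reg field, mod=00 (no displacement).
push_mem = bytes([
    0xFF,
    modrm(0b00, 6, RM_SIB_FOLLOWS),
    sib(3, RAX, RDI),  # scale 2**3 = 8, index = rax, base = rdi
])
print(push_mem.hex(" "))  # ff 34 c7
```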


It's debatable whether an immediate operand should be called an "addressing mode", because the data doesn't come from anywhere. It's part of the instruction. Also, the immediate-operand form of an instruction has a different opcode from the reg, reg/mem form.

Also note that most integer instructions that can have a memory source or memory destination have two opcodes: one for op r/m, r and one for op r, r/m. (e.g., see the ref manual entry for and, and more links to docs in the x86 tag wiki.) Anyway, and eax, ecx can be encoded with either of the two opcodes, and it's up to the assembler to pick. The choice makes no difference for performance.
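For illustration, here are the two encodings of and eax, ecx built by hand in Python. The opcodes are 21 /r for and r/m32, r32 and 23 /r for and r32, r/m32; the modrm helper is a made-up name for this sketch. The two operands simply swap between the reg and rm fields.

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the three ModRM fields: mod (2 bits), reg (3 bits), rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

EAX, ECX = 0, 1
REGISTER_DIRECT = 0b11

# and r/m32, r32 (opcode 21): destination eax is in rm, source ecx is in reg.
and_rm_r = bytes([0x21, modrm(REGISTER_DIRECT, ECX, EAX)])
# and r32, r/m32 (opcode 23): destination eax is in reg, source ecx is in rm.
and_r_rm = bytes([0x23, modrm(REGISTER_DIRECT, EAX, ECX)])

print(and_rm_r.hex(" "))  # 21 c8
print(and_r_rm.hex(" "))  # 23 c1
```

Both sequences are two bytes long and perform the same operation, so the assembler's choice between them is arbitrary.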

Peter Cordes answered Oct 17 '22 07:10



Processors have a number of different ways of accessing data, known as addressing modes.

This sentence talks about "processors" in general, not about a certain processor type.

I think this is too generalized, because you'll always find an exception to sentences like this. Indeed, when looking at modern CPUs you'll find more exceptions than CPUs that follow this rule.

For "simple" CPUs like the 6800 or the 6502, the instruction itself has one addressing mode:

lda $3A

... for example uses "zero-page" or "direct" addressing mode.
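On the 6502 this is visible directly in the machine code: each addressing mode of lda has its own opcode byte. The Python sketch below is only an illustration; the table and the assemble_lda helper are made-up names, while the opcode values are the documented 6502 ones.

```python
# One opcode byte per addressing mode of the 6502 "lda" instruction.
LDA_OPCODES = {
    "immediate": 0xA9,         # lda #$3A
    "zero_page": 0xA5,         # lda $3A
    "zero_page_x": 0xB5,       # lda $3A,x
    "absolute": 0xAD,          # lda $0123
    "absolute_x": 0xBD,        # lda $0123,x
    "absolute_y": 0xB9,        # lda $0123,y
    "indexed_indirect": 0xA1,  # lda ($3A,x)
    "indirect_indexed": 0xB1,  # lda ($3A),y
}

def assemble_lda(mode: str, operand: int) -> bytes:
    """Return opcode plus operand; absolute modes take a 16-bit little-endian address."""
    opcode = LDA_OPCODES[mode]
    if mode.startswith("absolute"):
        return bytes([opcode, operand & 0xFF, operand >> 8])
    return bytes([opcode, operand & 0xFF])

print(assemble_lda("zero_page", 0x3A).hex(" "))  # a5 3a
print(assemble_lda("immediate", 0x3A).hex(" "))  # a9 3a
print(assemble_lda("absolute", 0x123).hex(" "))  # ad 23 01
```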

Other CPUs definitely had two different addressing modes in one instruction. The "move" instruction of the 68000, for example:

move.w ($123).w, (a3, $4567)
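The 68000 opcode word really does carry two independent addressing-mode fields. The Python sketch below packs them by hand. It assumes the source is absolute short addressing (mode 111, register 000) and that the destination (a3, $4567) means address register indirect with a 16-bit displacement (mode 101, register 3); the helper name is made up for this example.

```python
import struct

def move_w_opcode(src_mode: int, src_reg: int, dst_mode: int, dst_reg: int) -> int:
    """Pack a 68000 move.w opcode word.

    Layout: 00 | size (11 = word) | dst reg | dst mode | src mode | src reg.
    Note that the destination field stores register before mode.
    """
    size_word = 0b11
    return (size_word << 12) | (dst_reg << 9) | (dst_mode << 6) | (src_mode << 3) | src_reg

opcode = move_w_opcode(src_mode=0b111, src_reg=0b000,  # ($123).w  absolute short
                       dst_mode=0b101, dst_reg=3)      # $4567 displacement off a3

# The 68000 is big-endian; the source extension word precedes the destination's.
encoded = struct.pack(">HHH", opcode, 0x0123, 0x4567)
print(encoded.hex(" "))  # 37 78 01 23 45 67
```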

For the x86 CPU it is even more difficult to say:

On the 6800, the instruction that can be compared to mov al, bl is named tba (with no arguments), while the counterpart of mov al, [0x123] is named lda $123.

So you could argue that mov al, bl is an instruction without arguments (addressing mode "implied", because on other CPUs the equivalent instruction would be written as a single mnemonic such as movblal with no operands) and that mov al, [0x123] is an instruction with one memory address argument (absolute addressing mode, because on other CPUs it would be written as ldal 0x123 with one operand).

(The only instructions of the original 8086 not allowing you to argue like this seem to be the instructions having m8, imm8 and m16, imm16 addressing modes such as mov word ptr [123], 567 or add byte ptr [123], 45.)

Of course you may also argue that the instruction is mov and that al and bl are two arguments of the mov al, bl instruction.

So it depends on your line of argument whether mov al, bl is an instruction with the addressing mode "implied" (= no operands) or "register-to-register".

Martin Rosenau answered Oct 17 '22 07:10
