Why does the ARM PC register point to the instruction after the next one to be executed?

Tags: assembly, arm

According to the ARM Information Center documentation:

In ARM state, the value of the PC is the address of the current instruction plus 8 bytes.

In Thumb state:

  • For B, BL, CBNZ, and CBZ instructions, the value of the PC is the address of the current instruction plus 4 bytes.
  • For all other instructions that use labels, the value of the PC is the address of the current instruction plus 4 bytes, with bit[1] of the result cleared to 0 to make it word-aligned.

Simply put, the value of the PC register points to the instruction after the next one. That is the part I don't get: usually (on x86 in particular), the program counter holds the address of the next instruction to be executed.
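For example (a minimal sketch of what I mean, in GNU assembler syntax, ARM state; the label is just illustrative):

    here:   mov   r0, pc          @ r0 reads back as the address of 'here' plus 8
            sub   r1, r0, #8      @ so subtracting 8 recovers the address of 'here' itself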

So, what is the reasoning behind that? Conditional execution, maybe?

asked Jun 06 '14 by newbie

People also ask

Which register in ARM is used to point to the location of next executing instruction in a program?

The instruction pointer, IP, is also often referred to as the program counter. This register contains the memory address of the next instruction to be executed.

Which register in ARM work as PC?

R15: PC (Program Counter). The Program Counter is automatically incremented by the size of the instruction executed. This size is always 4 bytes in ARM state and 2 bytes in THUMB mode.

What is the PC register in assembly?

The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, is a processor register that indicates where a computer is in its program sequence.

What is the significance of link register in a ARM processor?

A link register is a special-purpose register which holds the address to return to when a function call completes. This is more efficient than the more traditional scheme of storing return addresses on a call stack, sometimes called a machine stack.
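For instance (a minimal sketch in GNU assembler syntax, not from the original page; the labels are illustrative), a call stores the return address in LR and the callee returns through it:

            bl    helper          @ BL writes the return address into LR (r14), then branches
            b     done            @ execution resumes here once helper returns
    helper:
            mov   r0, #42         @ a leaf function can leave LR untouched
            bx    lr              @ return by branching to the address held in LR
    done: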


1 Answer

It's a nasty bit of legacy abstraction leakage.

The original ARM design had a 3-stage pipeline (fetch-decode-execute). To simplify the design they chose to have the PC read as the value currently on the instruction fetch address lines, rather than that of the currently executing instruction from 2 cycles ago. Since most PC-relative addresses are calculated at assembly or link time anyway, it's easier to have the assembler/linker compensate for that 2-instruction offset than to design all the logic to 'correct' the PC register.
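As a concrete illustration (a hypothetical snippet, not from the original answer; the 'value' label is illustrative), this is what that compensation looks like for a PC-relative address calculation in ARM state:

            adr   r0, value        @ assembled as: add r0, pc, #offset
            nop                    @ the assembler chooses offset = value - (address of the adr + 8)
    value:  .word 0x12345678       @ the word whose address ends up in r0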

Of course, that's all firmly on the "things that made sense 30 years ago" pile. Now imagine what it takes to keep a meaningful value in that register on today's 15+ stage, multiple-issue, out-of-order pipelines, and you might appreciate why it's hard to find a CPU designer these days who thinks exposing the PC as a register is a good idea.

Still, on the upside, at least it's not quite as horrible as delay slots. Instead, contrary to what you suppose, having every instruction execute conditionally was really just another optimisation around that prefetch offset. Rather than always having to take pipeline flush delays when branching around conditional code (or still executing whatever's left in the pipe like a crazy person), you can avoid very short branches entirely; the pipeline stays busy, and the decoded instructions can just execute as NOPs when the flags don't match*. Again, these days we have effective branch predictors and it ends up being more of a hindrance than a help, but for 1985 it was cool.

* "...the instruction set with the most NOPs on the planet."
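To make that concrete (a minimal sketch, not part of the original answer; labels are illustrative), here is a short forward branch and its conditionally executed equivalent:

            @ with a branch: taking it costs a pipeline flush on the original 3-stage design
            cmp   r0, #0
            beq   skip
            add   r1, r1, #1
    skip:

            @ with conditional execution: no branch at all; the ADDNE flows down the pipeline
            @ and simply behaves as a NOP when the Z flag is set
            cmp   r0, #0
            addne r1, r1, #1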

answered Oct 22 '22 by Notlikethat