I'm following along with the Baking Pi course from Cambridge University, in which a simple operating system is built in the ARMv6 instruction set, targeting the Raspberry Pi.
We've been using two ways of loading data into registers via the ldr instruction so far, and now that I'm using them together I realize I don't fully understand what they both do.
So I've used things like ldr r0,=0x20200000, which I originally understood as "read the data stored at the memory location 0x20200000 into register r0".
Then I've used things like:
ldr r0,[r1,#4]
Which I've understood as being "read the data stored at the memory address pointed to by r1, at an offset of 4 bytes, into register r0".
Then I encounter this:
ldr r0,=pattern
ldr r0,[r0]
Here pattern is a .int in the .data section (a bitmap representing a sequence of on/off states for an LED). I realize upon reading this that my previous understanding of =foo must be wrong, otherwise both of the above instructions would do the same thing.
Is the =x syntax basically more like a pointer in C, while the [x] syntax is as if the memory that is being pointed to by x is actually read?
Let's say ptr in the C below is an int*; do my comments about the equivalent assembly (conceptually, not literally) make any sense?
r0 = ptr; /* equivalent to: ldr r0,=ptr */
r0 = *ptr; /* equivalent to: ldr r0,[ptr] */
r0 = *(ptr+4); /* equivalent to: ldr r0,[ptr,#4] */
ldr r0,=something
...
something:
means load the address of the label something into the register r0. The assembler places a word holding that address somewhere within reach of the ldr instruction (a literal pool) and replaces the pseudo-instruction with a
ldr r0,[pc,#offset]
instruction.
So this shortcut
ldr r0,=0x12345678
means load 0x12345678 into r0.
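To make the pc-relative rewrite concrete, here is a minimal C sketch of the offset arithmetic, assuming ARM (not Thumb) state, where reading pc yields the address of the current instruction plus 8. The function names and the example addresses are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* In ARM state, reading pc gives the current instruction's address + 8. */
#define ARM_PC_READ_AHEAD 8u

/* Offset that would appear in ldr rd,[pc,#offset] for an ldr located at
   instr_addr whose literal word was placed at literal_addr.
   Both addresses are illustrative values, not taken from a real build. */
int32_t arm_literal_offset(uint32_t instr_addr, uint32_t literal_addr)
{
    return (int32_t)(literal_addr - (instr_addr + ARM_PC_READ_AHEAD));
}

/* The ARM-state literal load has a 12-bit offset magnitude, so the
   literal word has to sit within 4095 bytes of the pc value. */
int arm_literal_in_reach(int32_t offset)
{
    return offset >= -4095 && offset <= 4095;
}
```

For example, an ldr at 0x8000 with its literal at 0x8010 would be encoded with an offset of 8, which is why the assembler needs to emit the literal word "somewhere in reach".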
Because ARM instructions are mostly fixed length, you can't load an arbitrary 32-bit immediate into a register in one instruction; it can take several instructions to fully load a register with a 32-bit number, depending heavily on the number. For example
ldr r0,=0x00010000
will get replaced by the GNU assembler with the single instruction mov r0,#0x00010000 if it is an ARM instruction. For a Thumb instruction, though, it may still have to be ldr r0,[pc,#offset].
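Why some constants fit in a single mov and others do not can be sketched in C. This assumes the classic ARM-state data-processing immediate: an 8-bit value rotated right by an even amount. The function name is hypothetical, and a real assembler may try further tricks (such as mvn with the inverted value) that this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate left by r bits, avoiding the undefined shift by 32. */
static uint32_t rotl32(uint32_t v, unsigned r)
{
    r &= 31u;
    return r ? (v << r) | (v >> (32u - r)) : v;
}

/* True if value can be written as an 8-bit constant rotated right by an
   even amount, i.e. if some even left-rotation brings every set bit
   into the low 8 bits. */
bool arm_immediate_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        if ((rotl32(value, rot) & ~0xFFu) == 0)
            return true;
    }
    return false;
}
```

Under this model 0x00010000 is encodable, while 0x12345678 and 0x20200000 are not, so those two would still need a literal load or several instructions.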
These ldr rd,=things are a shortcut: pseudo-instructions, not real ones.
ldr rd,[rm,#offset]
ldr rd,[rm,rn]
are real instructions and mean: read from memory at address rm+offset or rm+rn and put the value read into the register rd.
The =something is more like &something in C.
unsigned int something;
unsigned int r0;
unsigned int r1;
r0 = (unsigned int)&something; /* the address as an integer (32-bit target) */
r1 = *(unsigned int *)r0;
and in assembly
something:
.word 0
ldr r0,=something
ldr r1,[r0]
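The same idea as a compilable C sketch that also covers the [rm,#offset] form. The array pattern_words and the function names are hypothetical stand-ins; registers are modelled as uintptr_t so the code is not tied to a 32-bit host.

```c
#include <stdint.h>

/* Stand-in for two consecutive .word values in the .data section. */
static uint32_t pattern_words[2] = { 0x11111111u, 0x22222222u };

/* Models: ldr r0,=pattern_words ; ldr r1,[r0] */
uint32_t load_first_word(void)
{
    uintptr_t r0 = (uintptr_t)pattern_words; /* =label: the address itself */
    return *(uint32_t *)r0;                  /* [r0]: read memory at that address */
}

/* Models: ldr r0,=pattern_words ; ldr r1,[r0,#4] */
uint32_t load_word_at_offset_4(void)
{
    uintptr_t r0 = (uintptr_t)pattern_words;
    return *(uint32_t *)(r0 + 4);            /* #4 is a byte offset */
}
```

Note that the #4 in the assembly counts bytes, whereas adding to an int* in C counts elements; with 4-byte ints, [r0,#4] corresponds to *(ptr+1), not *(ptr+4).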
It is worth mentioning that there exist the following alternative approaches for those that want to avoid pseudo instructions / the literal pool for some reason:
adr r0, label
(v7 / v8): a single instruction that stores the full address of the label into r0. It refers to the label by PC-relative addressing; see also: What are the semantics of ADRP and ADRL instructions in ARM assembly? | Example with asserts.
In ARMv7, however, it is not possible to access labels in different sections with adr, e.g. .data from .text, apparently because there is no relocation that takes care of it. ldr = can do this. If you try, GAS fails with:
Error: symbol .data is in a different section
Cross-section access is, however, possible in ARMv8, and generates a relocation of type R_AARCH64_ADR_PRE. Example.
movw and movt (v7) + GNU GAS #:lower16: / #:upper16: operators:
movw r0, #:lower16:myvar
movt r0, #:upper16:myvar
Example with asserts.
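What the movw / movt pair computes can be modelled in a few lines of C. This is only a sketch of the register effect: movw writes a 16-bit immediate and clears the top half, and movt replaces the top half while keeping the bottom. The C function names mirror the mnemonics and operators but are otherwise hypothetical.

```c
#include <stdint.h>

/* What #:lower16: and #:upper16: select from a 32-bit address. */
uint16_t lower16(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }
uint16_t upper16(uint32_t addr) { return (uint16_t)(addr >> 16); }

/* movw rd,#imm16: rd = imm16, upper half cleared. */
uint32_t movw(uint16_t imm) { return imm; }

/* movt rd,#imm16: upper half replaced, lower half preserved. */
uint32_t movt(uint32_t rd, uint16_t imm)
{
    return (rd & 0xFFFFu) | ((uint32_t)imm << 16);
}
```

Applying movw with the lower half and then movt with the upper half rebuilds the full 32-bit value in two instructions, with no literal pool involved.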
movk + shifts (v8) + GNU GAS:
movz x0, #:abs_g2:label // bits 32-47, overflow check
movk x0, #:abs_g1_nc:label // bits 16-31, no overflow check
movk x0, #:abs_g0_nc:label // bits 0-15, no overflow check
Example with asserts.
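The three-instruction sequence above can be modelled the same way. A rough sketch, assuming movz zeroes the register before placing its 16-bit chunk and movk overwrites one chunk while keeping the rest; the function names are hypothetical, and fits_in_48_bits only mimics the kind of overflow check mentioned in the comment on the first instruction.

```c
#include <stdint.h>

/* movz xd,#imm16,lsl #shift: everything outside the chunk becomes zero. */
uint64_t movz(uint16_t imm, unsigned shift)
{
    return (uint64_t)imm << shift;
}

/* movk xd,#imm16,lsl #shift: only the addressed 16-bit chunk changes. */
uint64_t movk(uint64_t xd, uint16_t imm, unsigned shift)
{
    return (xd & ~((uint64_t)0xFFFFu << shift)) | ((uint64_t)imm << shift);
}

/* The sequence covers bits 0-47 only, so larger values do not fit. */
int fits_in_48_bits(uint64_t addr)
{
    return (addr >> 48) == 0;
}

/* Mirrors the g2 / g1 / g0 sequence: bits 32-47, then 16-31, then 0-15. */
uint64_t build_address(uint64_t addr)
{
    uint64_t x0 = movz((uint16_t)(addr >> 32), 32);
    x0 = movk(x0, (uint16_t)(addr >> 16), 16);
    x0 = movk(x0, (uint16_t)addr, 0);
    return x0;
}
```

An address with bits set above bit 47 would come out of build_address truncated, which is the situation the overflow check on the first instruction is there to catch.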