I've been thinking for a while about the following assembly code (NASM IA-32):
ORG 0xFF000 ; This is (1MB - 4KB) 0x100000 - 0x1000=0xFF000.
USE16 ;produce 16bit code
code_size EQU (end -init_16) ; calculates code length
times (4096-code_size) db 0x90 ; fills the rest of the memory with NOP's
init_16:
cli ;disables interrupts (not really necessary, just an example)
jmp init_16 ;infinite loop
align 16
end :
It's just an example. The idea is that we have an IA-32 processor in real mode, and the top 4 KB of the 1 MB address space is NVRAM (non-volatile RAM). The reset vector points to 0xFFFF0, so the code tries to place the cli instruction at address 0xFFFF0 regardless of how many instructions are placed between the init_16 label and the align 16 directive (limited to 16 bytes so that the code still fits within the 1 MB). But I can't understand how it does it.
I'm particularly troubled by the align 16 and times directives. Each seems to depend on the result of the other, so I don't know how NASM resolves this.
First, we have the times directive, which needs the result of the align 16 directive: times needs to know how many bytes align 16 added in order to evaluate code_size and fill the rest of the memory with NOPs.
We also have the align directive, which needs the result of the times directive: it has to know where the jmp instruction ended up in order to calculate how many NOPs it must add to reach the next 16-byte-aligned position.
So it seems to me that both directives depend on the result of the other.
Furthermore, I can't figure out why the cli instruction always ends up at address 0xFFFF0 regardless of whether I add instructions between the cli and the jmp. That is the objective, but I don't know how it works.
I think the two directives form an underdetermined system, so there are many different solutions. For example, for the code I presented above, I think a solution could be:
The cli instruction ends up at 0xFFFF1,
the jmp instruction at 0xFFFF2,
and align 16 fills the addresses 0xFFFF2 to 0xFFFFF with NOPs.
So code_size is now defined, and the times directive fills the addresses 0x0000 to 0xFFFF0 with NOPs.
Why is this not the behavior of the code?
Firstly, I find it strange to see ORG 0xFF000 together with USE16. I think that in real-address mode, ORG is meant to be a 16-bit offset within a 64 KB segment.
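To make the offset remark concrete, here is a minimal Python sketch of real-mode address translation (physical address = segment * 16 + offset). The segment value 0xF000 is an assumption chosen for illustration; the question does not state which segment the code runs in.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the segment left by 4 bits and add the offset."""
    assert 0 <= segment <= 0xFFFF, "segment must fit in 16 bits"
    assert 0 <= offset <= 0xFFFF, "offset must fit in 16 bits"
    return (segment << 4) + offset


# 0xFF000 is larger than 0xFFFF, so it cannot be a 16-bit offset on its own.
fits_in_16_bits = 0xFF000 <= 0xFFFF

# With the assumed (hypothetical) segment 0xF000, the 4 KB block starts at
# offset 0xF000 and the entry point sits at offset 0xFFF0.
block_start = physical_address(0xF000, 0xF000)
entry_point = physical_address(0xF000, 0xFFF0)

print(fits_in_16_bits, hex(block_start), hex(entry_point))
```

Under that assumption, an ORG of 0xF000 would describe the same physical block as the 0xFF000 used in the listing.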
Because on the first pass the assembler does not yet know the end and init_16 labels, it could just skip the times that depends on them. This would leave the current offset ($) at the ORG value. Then come the 3 bytes from encoding cli (1 byte) and the short jump jmp init_16 (2 bytes), followed by the 13 bytes produced by align 16.
At this point, both labels are known and a following pass can start using these offsets. code_size is calculated to be 16 (the difference between the two labels), and so times emits 4080 NOPs (4096 - 16).
Although the 2 labels have now moved up in memory by 4080 bytes, their difference is still the same (16): 4080 is a multiple of 16, so align 16 still pads with 13 bytes. No further passes are needed. The code is resolved.
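The pass-by-pass reasoning can be sketched as a small fixed-point loop in Python. This is only a toy model of the procedure described here, not NASM's actual implementation; the names one_pass and assemble are made up for illustration, and the instruction sizes (1 byte for cli, 2 bytes for the short jmp) are hard-coded.

```python
ORG = 0xFF000   # origin from the listing
BLOCK = 4096    # size of the 4 KB block
BODY = 3        # cli (1 byte) + short jmp (2 bytes)
ALIGN = 16


def one_pass(code_size):
    """Lay out the listing once, using code_size from the previous pass.

    code_size is None on the first pass, when the labels are still unknown;
    the times line is then skipped, as suggested in the text.
    """
    offset = 0                            # offset from ORG
    if code_size is not None:
        offset += BLOCK - code_size       # times (4096-code_size) db 0x90
    init_16 = offset
    offset += BODY                        # cli ; jmp init_16
    # align 16: ORG is itself a multiple of 16, so aligning the offset
    # also aligns the address.
    offset += (-offset) % ALIGN
    end = offset
    return init_16, end


def assemble():
    """Repeat passes until code_size stops changing."""
    code_size = None
    passes = 0
    while True:
        init_16, end = one_pass(code_size)
        passes += 1
        new_size = end - init_16
        if new_size == code_size:         # layout is stable, stop
            return ORG + init_16, ORG + end, code_size, passes
        code_size = new_size


init_addr, end_addr, size, passes = assemble()
print(hex(init_addr), hex(end_addr), size, passes)
```

In this model the second pass already reproduces the code_size found by the first, so the loop stops with init_16 at 0xFFFF0 and end at 0x100000.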
"Furthermore, I can't figure why the cli instruction always ends up in the 0xFFFF0 address independently if add instructions between the cli and jump. It is the objective, but I don't know how it works."
Adding a few instructions right after this cli does not change the procedure outlined above, as long as the difference between the two labels stays 16. You could insert instructions worth up to 13 bytes.
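The 13-byte limit can be checked with a few lines of Python. This is a sketch under the same toy model (cli is 1 byte, the short jmp is 2 bytes, align 16 pads to the next multiple of 16); init_16_address is a hypothetical helper, not something NASM provides.

```python
ORG, BLOCK, ALIGN = 0xFF000, 4096, 16


def init_16_address(extra_bytes: int) -> int:
    """Address of init_16 when extra_bytes of instructions follow cli.

    code_size is the padded body length; it is always a multiple of 16,
    so the times count (4096 - code_size) never disturbs the alignment.
    """
    body = 1 + extra_bytes + 2                 # cli + inserted code + short jmp
    code_size = body + (-body) % ALIGN         # padding added by align 16
    return ORG + (BLOCK - code_size)           # times moves init_16 up by this much


for extra in (0, 5, 13, 14):
    print(extra, hex(init_16_address(extra)))
```

With 0, 5, or 13 extra bytes the model keeps init_16 at 0xFFFF0. With 14 extra bytes the body no longer fits in 16 bytes, code_size becomes 32, and init_16 moves down to 0xFFFE0.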