I'm just starting out with 6502 assembly and am having trouble wrapping my head around loops that need to deal with numbers larger than 8 bits.
Specifically, I want to loop through some memory locations. In pseudo-C, I want to do this:
// address is a pointer to a byte in memory
unsigned char* address = (unsigned char*)0x44AD;
for (x = 0; x < 20; x++) {
// Set memory location to 0x01
*address = 0x01;
// Move pointer forward 40 bytes
address += 0x28;
}
So starting at address $44AD, I want to write $01 into RAM, then jump forward $28, write $01 into that location, then jump forward $28 again, until I've done that 20 times (the last address to write is $47A5).
My current approach is loop unrolling which is tedious to write (even though I guess an Assembler can make that simpler):
ldy #$01
// Start from $44AD for the first row,
// then increase by $28 (40 dec) for the next 20
sty $44AD
sty $44D5
sty $44FD
[...snipped..]
sty $477D
sty $47A5
I know about absolute indexed addressing (using the accumulator instead of the Y register: sta $44AD, x), but the index register only gives me an offset between 0 and 255. What I really think I want is something like this:
lda #$01
ldx #$14 // 20 Dec
loop: sta $44AD, x * $28
dex
bne loop
Basically, start at the highest address, then loop down. Problem is that $14 * $28 = $320 or 800 dec, which is more than I can actually store in the 8-Bit X register.
Is there an elegant way to do this?
The 6502 is an 8-bit processor, so you aren't going to be able to calculate 16-bit addresses entirely in registers. You will need to indirect through page zero.
// set $00,$01 to the last target address, $44AD + 19 * $28 = $47A5
LDA #$A5
STA $00
LDA #$47
STA $01
LDX #20 // Loop 20 times
LDY #0
loop: LDA #$01 // the value to store
STA ($00),Y // store A to the address held in $00,$01
// subtract $28 from $00,$01 (16-bit subtraction)
SEC
LDA $00
SBC #$28
STA $00
LDA $01
SBC #0
STA $01
// do it 19 more times
DEX
BNE loop
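To see why the SBC #0 on the high byte is needed, here is a minimal C sketch of the same byte-by-byte subtraction. It is only a model, not 6502 code: the zp_pointer struct and all function names are made up for illustration, and the borrow is computed explicitly instead of coming from the carry flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the pointer kept in zero page: $00 = low, $01 = high. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} zp_pointer;

/* Mirrors SEC / LDA lo / SBC #step / STA lo / LDA hi / SBC #0 / STA hi. */
void subtract_step(zp_pointer *p, uint8_t step)
{
    /* On the 6502 the carry is clear after the low-byte SBC if it borrowed. */
    uint8_t borrow = (uint8_t)(p->lo < step);
    p->lo = (uint8_t)(p->lo - step);
    p->hi = (uint8_t)(p->hi - borrow);   /* SBC #0 only removes the borrow */
}

/* Combine both bytes the way STA ($00),Y does when Y is 0. */
uint16_t effective_address(const zp_pointer *p)
{
    return (uint16_t)((p->hi << 8) | p->lo);
}

/* Step the pointer down 'count' times; return the last address stored to. */
uint16_t last_address_written(uint16_t start, uint8_t step, int count)
{
    assert(count > 0);
    zp_pointer p = { (uint8_t)(start & 0xFF), (uint8_t)(start >> 8) };
    uint16_t last = start;
    for (int i = 0; i < count; i++) {
        last = effective_address(&p);    /* the STA would happen here */
        subtract_step(&p, step);
    }
    return last;
}
```

For example, last_address_written(0x47A5, 0x28, 20) walks down from $47A5 and returns 0x44AD, the first address listed in the question. A step such as $4705 - $28 shows the borrow at work: the low byte wraps to $DD and the high byte drops to $46.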
Alternatively, you could use self-modifying code. This is a dubious technique in general, but common on embedded processors like the 6502 because they are so limited.
// set the instruction at "patch" to "STA $47A5" (the last target address)
LDA #$A5
STA patch+1
LDA #$47
STA patch+2
LDX #20 // Loop 20 times
loop: LDA #$01 // the value to store
patch:STA $FFFF
// subtract $28 from the address in "patch"
SEC
LDA patch+1
SBC #$28
STA patch+1
LDA patch+2
SBC #0
STA patch+2
// do it 19 more times
DEX
BNE loop
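The reason the code writes to patch+1 and patch+2 is that an absolute-mode STA is three bytes long: the opcode ($8D), then the low byte of the address, then the high byte. The following C sketch models that layout with a plain byte array. The function names are hypothetical and the array merely stands in for the instruction bytes in RAM.

```c
#include <assert.h>
#include <stdint.h>

enum { OPCODE_STA_ABSOLUTE = 0x8D };

/* Overwrite the 16-bit operand of a 3-byte absolute-mode instruction. */
void patch_operand(uint8_t instruction[3], uint16_t target)
{
    /* Sanity check for this sketch: only patch the STA we expect. */
    assert(instruction[0] == OPCODE_STA_ABSOLUTE);
    instruction[1] = (uint8_t)(target & 0xFF);  /* patch+1: low byte  */
    instruction[2] = (uint8_t)(target >> 8);    /* patch+2: high byte */
}

/* Read the operand back in little-endian order. */
uint16_t read_operand(const uint8_t instruction[3])
{
    return (uint16_t)(instruction[1] | (instruction[2] << 8));
}
```

Patching the array {0x8D, 0xFF, 0xFF} with 0x47A5 leaves it as {0x8D, 0xA5, 0x47}, which is what the two STA patch+n instructions do to the real code. This only works when the code itself lives in RAM.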
A more efficient way to copy 1 KB of data (source and dest are placeholder labels for the two regions):
ldy #0
nextvalue:
lda source, y
sta dest, y
lda source+$100, y
sta dest+$100, y
lda source+$200, y
sta dest+$200, y
lda source+$300, y
sta dest+$300, y
iny
bne nextvalue
A few notes:
Faster, because the loop overhead is shared by four copies per iteration. It takes more space because there are more instructions.
If your assembler supports macros, you can easily make the number of blocks the code handles configurable.
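As a rough model of the unrolled copy above, the C sketch below uses a single 8-bit index that wraps from 255 back to 0, just like iny followed by bne. The names copy_four_pages, PAGE_SIZE and PAGE_COUNT are assumptions made for this illustration; the inner for loop stands in for the four hand-written lda/sta pairs.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_SIZE = 0x100, PAGE_COUNT = 4 };

/* Copy PAGE_COUNT pages (1 KB) using one 8-bit index for all of them. */
void copy_four_pages(uint8_t *dest, const uint8_t *source)
{
    assert(dest != 0 && source != 0);
    uint8_t y = 0;                                   /* ldy #0 */
    do {
        /* One lda/sta pair per page, all sharing the same index. */
        for (int page = 0; page < PAGE_COUNT; page++) {
            dest[page * PAGE_SIZE + y] = source[page * PAGE_SIZE + y];
        }
        y++;                                         /* iny: 255 wraps to 0 */
    } while (y != 0);                                /* bne nextvalue */
}
```

The loop body runs 256 times and moves four bytes each time, so the increment and branch are paid once per four bytes instead of once per byte.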
Might not be 100% relevant to this, but here's another way to have longer-than-255 loops:
nextblock:
ldy #0
nextvalue:
lda address, y
iny
bne nextvalue
;Insert code to be executed between each block here:
dec numblocks
bpl nextblock
numblocks:
.byte 3
A few notes:
As written, the code doesn't do anything meaningful; it just runs the inner loop numblocks + 1 times, because bpl still branches when the counter reaches zero. "Add your own code" :-) (I often use this together with self-modifying code that increments the address of a sta address, y instruction, for example.)
bpl can be dangerous if you don't know how it works: it branches while bit 7 of the result is clear. That works well enough here, but the loop would exit after a single pass if numblocks held a value above $80.
If you need to execute the same code again, numblocks has to be reset first.
The code can be made slightly faster by putting numblocks in zero page.
If the X register is not needed for something else (it often is), you can use it instead of a memory location.
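The behaviour of dec followed by bpl is easy to get wrong, so here is a small C sketch that counts how many times the outer block runs for a given starting value of numblocks. The function name outer_passes is hypothetical, and the 256-iteration inner loop is reduced to a single counter increment.

```c
#include <assert.h>
#include <stdint.h>

/* Count the outer passes made by "dec numblocks / bpl nextblock". */
int outer_passes(uint8_t numblocks)
{
    int passes = 0;
    do {
        passes++;                              /* one full inner loop */
        numblocks = (uint8_t)(numblocks - 1);  /* dec numblocks */
    } while ((numblocks & 0x80) == 0);         /* bpl: taken while bit 7 is clear */
    assert(passes <= 129);                     /* $80 is the longest-running start value */
    return passes;
}
```

With the .byte 3 from the listing, outer_passes(3) is 4, which matches the four pages of a 1 KB block. A starting value of $81 or higher gives a single pass, because the first decrement already leaves bit 7 set.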