Question

I'm starting out with 6502 Assembly right now and have a problem wrapping my head around loops that need to deal with numbers bigger than 8 bit.

Specifically, I want to loop through some memory locations. In pseudo-c-code, I want to do this:

    // Address is a pointer to memory
    int* address = 0x44AD;
    for(x = 0; x < 21; x++){
        // Move pointer forward 40 bytes
        address += 0x28;
        // Set memory location to 0x01
        &address = 0x01;
    }

So starting at address $44AD I want to write $01 into ram, then jump forward $28, write $01 into that, then jump forward $28 again until I've done that 20 times (last address to write is $47A5).

My current approach is loop unrolling which is tedious to write (even though I guess an Assembler can make that simpler):

ldy #$01
// Start from $44AD for the first row, 
    // then increase by $28 (40 dec) for the next 20
sty $44AD
sty $44D5
sty $44FD
    [...snipped..]
sty $477D
sty $47A5

I know about absolute addressing (using the Accumulator instead of the Y register - sta $44AD, x), but that only gives me a number between 0 and 255. What I really think I want is something like this:

       lda #$01
       ldx #$14 // 20 Dec
loop:  sta $44AD, x * $28
       dex
       bne loop

Basically, start at the highest address, then loop down. Problem is that $14 * $28 = $320 or 800 dec, which is more than I can actually store in the 8-Bit X register.

Is there an elegant way to do this?

Was it helpful?

Solution

The 6502 is an 8-bit processor, so you aren't going to be able to calculate 16-bit addresses entirely in registers. You will need to indirect through page zero.

      // set $00,$01 to $44AD + 20 * $28 = $47CD
      LDA #$CD
      STA $00
      LDA #$47
      STA $01

      LDX #20  // Loop 20 times
      LDY #0
loop: LDA #$01 // the value to store
      STA ($00),Y // store A to the address held in $00,$01
      // subtract $28 from $00,$01 (16-bit subtraction)
      SEC
      LDA $00
      SBC #$28
      STA $00
      LDA $01
      SBC #0
      STA $01
      // do it 19 more times
      DEX
      BNE loop

Alternatively, you could use self-modifying code. This is a dubious technique in general, but common on embedded processors like the 6502 because they are so limited.

      // set the instruction at "patch" to "STA $47CD"
      LDA #$CD
      STA patch+1
      LDA #$47
      STA patch+2

      LDX #20  // Loop 20 times
loop: LDA #$01 // the value to store
patch:STA $FFFF
      // subtract $28 from the address in "patch"
      SEC
      LDA patch+1
      SBC #$28
      STA patch+1
      LDA patch+2
      SBC #0
      STA patch+2
      // do it 19 more times
      DEX
      BNE loop

OTHER TIPS

More efficient way to copy 1k of data:

    ldy #0
nextvalue:
    lda address, y
    sta address, y

    lda address+$100, y
    sta address+$100, y

    lda address+$200, y
    sta address+$200, y

    lda address+$300, y
    sta address+$300, y
    iny
    bne nextvalue 

Few notes:

  • Faster, as loop overhead is reduced. Takes more space due to more commands.

  • If the assembler you use supports macros, you can easily make it configurable, how many blocks the code handles.

Might not be 100% relevant to this, but here's another way to have longer-than-255 loops:

nextblock:
    ldy #0
nextvalue:
    lda address, y
    iny
    bne nextvalue

;Insert code to be executed between each block here:

    dec numblocks
    bpl nextblock

numblocks:
    .byte 3

Few notes:

  • For now, the code doesn't really do anything meaningful, but runs the loop "numblocks" times. "Add your own code" :-) (Often I use this together with some self-modifying code that increments sta, y address for example)

  • bpl can be dangerous (if you don't know how it works), but works well enough in this case (but wouldn't, if numblocks address contained big enough value)

  • If you need to execute the same code again, numblocks needs to be re-set.

  • Code can be made a little bit faster by putting numblocks to zero page.

  • If not needed for something else (like it often is), you can use X register instead of memory location.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top