; Delays A+20 clocks (excluding JSR)
; Preserved: X, Y
delay_a_clocks:
        lsr a
        bcs @b0c        ; 2/3
@b0c:   lsr a
        bcs @b1s        ; 2/3
        lsr a
        bcs @b2s        ; 2/3
@b2c:   bne @ge8        ; 3
                        ; -1
@ret:   rts
@ge8:   sec             ; 2
        sbc #1          ; 2
        beq @ret        ; 3
                        ; -1
        nop             ; 2
@eights:
        bne :+          ; *3
:       sbc #1          ; *2
        bne @eights     ; *3
                        ; -1
        rts
@b1s:   lsr a           ; 2
        bcc @b2c        ; 3/2
        nop             ; 2
@b2s:   bcs @b2c        ; 3

For constant delays, I use a macro (ca65 assembler) that selects between several strategies depending on the delay, which can be any expression evaluating to 2 to 16777216 (or zero). For delays of less than 28, it uses either an inline delay made up of short instructions, or a JSR to a bunch of NOPs followed by a return. For 28 and larger, it uses a call to delay_a_clocks, and optional calls to variants of the "delay A*256 clocks" and "delay A*65536 clocks" routines when the mid and high bytes are non-zero.

The X and Y registers are preserved by all routines/macros, but A is not, since it's easy enough to save and restore it. At one point I had a version with the full 24-bit delay stored inline after the JSR, which shortened the delay calls a bit, but this made the delay code much more complex, so I went back to the simpler scheme I use now.
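To give a rough idea of how the strategy selection works, here is a cut-down sketch. It is not the macro from 6502_delay.asm. The name delay_sketch is made up for illustration, and it only handles 2 to 283 clocks. Short delays are done purely inline, with jmp *+3 as my assumed choice for the odd 3-clock filler. The JSR-to-NOPs variant and the A*256 and A*65536 routines are left out. The 28 threshold follows from the overhead: LDA #imm takes 2 clocks, JSR takes 6, and delay_a_clocks itself takes A+20, so A = n-28 gives exactly n clocks.

```
; Hypothetical sketch only: constant delay of n clocks, 2 <= n <= 283.
; Preserved: X, Y (A is clobbered for n >= 28)
.macro delay_sketch n
.if (n) < 2 || (n) > 283
        .error "delay_sketch: n must be in 2..283"
.elseif (n) < 28
    .if (n) & 1
        jmp *+3                 ; 3 clocks, changes no registers
        .repeat ((n) - 3) / 2
        nop                     ; 2 clocks each
        .endrepeat
    .else
        .repeat (n) / 2
        nop                     ; 2 clocks each
        .endrepeat
    .endif
.else
        lda #((n) - 28)         ; 2 clocks
        jsr delay_a_clocks      ; 6 clocks for JSR, then A+20 in the routine
.endif
.endmacro
```

For example, delay_sketch 100 would assemble to lda #72 followed by jsr delay_a_clocks, for 2 + 6 + 72 + 20 = 100 clocks, and delay_sketch 7 would assemble to jmp *+3 followed by two NOPs. The upper limit of 283 is just 255 + 28, the largest delay reachable with a single byte in A.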
Here's the full commented source code I use: 6502_delay.asm