Delay N clocks

Posted: Sat Jan 12, 2008 10:36 pm
by blargg
Most of my assembly coding is part of reverse-engineering video game consoles, and a common need is to delay N clocks, where N is a constant or run-time value. I figured I'd share the 6502 routines I use, since I found them fun to write. This is the key routine:

Code:

; Delays A+20 clocks (excluding JSR)
; Preserved: X, Y
delay_a_clocks:
        lsr a
        bcs @b0c        ; 2/3
@b0c:   lsr a
        bcs @b1s        ; 2/3
        lsr a
        bcs @b2s        ; 2/3
@b2c:   bne @ge8        ; 3
                        ; -1
@ret:   rts
@ge8:   sec             ; 2
        sbc #1          ; 2
        beq @ret        ; 3
                        ; -1
        nop             ; 2
@eights:
        bne :+          ; *3
:       sbc #1          ; *2
        bne @eights     ; *3
                        ; -1
        rts
        
@b1s:   lsr a           ; 2
        bcc @b2c        ; 3/2
        nop             ; 2
@b2s:   bcs @b2c        ; 3
It's somewhat obfuscated because I wanted to reduce the overhead (the +20) as much as possible. In addition to the above, there are routines that delay A*256+10 clocks and A*65536+10 clocks, allowing 16-bit and 24-bit run-time delays by simply calling the three routines with the low, mid, and high bytes of the delay. There will be an overhead of some constant number of clocks, but that's usually not a problem.
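
As a sanity check on the A+20 figure, the routine's control flow can be modelled in a few lines of Python. This is only a cycle-counting sketch, not an emulator: it assumes no branch crosses a page boundary (taken = 3 cycles, not taken = 2), and three_call_cycles simply plugs in the A*256+10 and A*65536+10 figures quoted above rather than modelling those routines.

```python
def delay_a_clocks_cycles(a):
    """Cycles spent inside delay_a_clocks for a given A, JSR excluded."""
    assert 0 <= a <= 0xFF
    cycles = 0

    def lsr(value):
        nonlocal cycles
        cycles += 2
        return value >> 1, value & 1      # new A, carry

    def branch(taken):
        nonlocal cycles
        cycles += 3 if taken else 2
        return bool(taken)

    a, carry = lsr(a)
    branch(carry)                         # bcs @b0c: bit 0 costs 1 extra cycle
    a, carry = lsr(a)
    if branch(carry):                     # bcs @b1s: bit 1 set
        a, carry = lsr(a)                 # @b1s: lsr a
        if not branch(not carry):         # bcc @b2c
            cycles += 2                   # nop
            branch(True)                  # @b2s: bcs @b2c, always taken
    else:
        a, carry = lsr(a)
        if branch(carry):                 # bcs @b2s
            branch(True)                  # @b2s: bcs @b2c, always taken

    # @b2c: A now holds the original A >> 3 and Z reflects it.
    if not branch(a != 0):                # bne @ge8
        return cycles + 6                 # @ret: rts
    cycles += 2 + 2                       # sec, sbc #1
    a -= 1
    if branch(a == 0):                    # beq @ret
        return cycles + 6
    cycles += 2                           # nop
    while True:                           # @eights: 8 cycles per iteration
        branch(True)                      # bne :+ (A is non-zero here)
        cycles += 2                       # sbc #1
        a -= 1
        if not branch(a != 0):            # bne @eights
            return cycles + 6             # rts


def three_call_cycles(n):
    """Routine time for a 24-bit delay n split into low, mid and high bytes."""
    low, mid, high = n & 0xFF, (n >> 8) & 0xFF, (n >> 16) & 0xFF
    return (delay_a_clocks_cycles(low)    # low + 20
            + mid * 256 + 10              # figure quoted in the post
            + high * 65536 + 10)          # figure quoted in the post
```

Under those assumptions the model returns A+20 for every A from 0 to 255, and the three calls together take n+40 clocks, not counting the JSRs and the instructions that load A.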

For constant delays, I use a macro (ca65 assembler) that selects between several strategies depending on the delay, which can be any expression evaluating to 2 to 16777216 (or zero). For delays of less than 28, it uses either an inline delay made up of short instructions, or a JSR to a bunch of NOPs followed by a return. For 28 and larger, it uses a call to delay_a_clocks, and optional calls to variants of the "delay A*256 clocks" and "delay A*65536 clocks" routines when the mid and high bytes are non-zero. The X and Y registers are preserved by all routines/macros, but A is not, since it's easy enough to save and restore it. At one point I had a version with the full 24-bit delay stored inline after the JSR, which shortened the delay calls a bit, but this made the delay code much more complex, so I went back to the simpler scheme I use now.
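
The selection logic described above might look roughly like the following Python sketch. It is not the ca65 macro itself. The names delay_256a_exact and delay_65536a_exact are hypothetical, and the sketch assumes each of those calls (including its LDA and JSR) contributes exactly mid*256 or high*65536 clocks. It uses jmp *+3 as an arbitrary 3-cycle filler and only models the inline strategy for short delays, not the JSR-to-NOPs one. The 28-clock threshold is at least consistent with LDA #imm (2) + JSR (6) + the 20-clock overhead of delay_a_clocks.

```python
CALL_OVERHEAD = 2 + 6 + 20    # lda #imm + jsr + delay_a_clocks' fixed cost
MAX_DELAY = 16777216


def inline_delay(n):
    """Instruction list wasting exactly n cycles (n == 0 or n >= 2)."""
    if n < 0 or n == 1:
        raise ValueError("no 6502 instruction takes a single cycle")
    steps = []
    if n % 2:                        # odd: spend 3 cycles first
        steps.append("jmp *+3")      # assumed filler: 3 cycles, no side effects
        n -= 3
    steps += ["nop"] * (n // 2)      # 2 cycles each
    return steps


def plan_delay(n):
    """Pick a strategy for a constant delay of n clocks."""
    if not (n == 0 or 2 <= n <= MAX_DELAY):
        raise ValueError("delay must be 0 or 2..16777216")
    if n < CALL_OVERHEAD:
        return inline_delay(n)
    rest = n - CALL_OVERHEAD
    low, mid, high = rest & 0xFF, (rest >> 8) & 0xFF, rest >> 16
    steps = []
    if high:                         # hypothetical exact high-byte variant
        steps += [f"lda #{high}", "jsr delay_65536a_exact"]
    if mid:                          # hypothetical exact mid-byte variant
        steps += [f"lda #{mid}", "jsr delay_256a_exact"]
    steps += [f"lda #{low}", "jsr delay_a_clocks"]
    return steps
```

For example, plan_delay(7) gives one jmp *+3 and two NOPs, while plan_delay(28) gives lda #0 followed by jsr delay_a_clocks.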

Here's the full commented source code I use: 6502_delay.asm

Posted: Mon Jan 14, 2008 2:44 am
by dclxvi
I've added two routines I had sitting around to the wiki at:

http://6502org.wikidot.com/software-delay

One routine delays 25+A cycles (includes the JSR and RTS); the other can be inlined (i.e. you don't need a JSR or RTS) and delays 15+A cycles (it would take 27+A cycles if a JSR and RTS were used). They're a little smaller too.

Posted: Mon Feb 20, 2012 2:48 pm
by repose
This delays 14+A cycles for A in the range 1..8, or 13 cycles with A=0.

First, a reference for the opcodes used:

Code:

$C5 cmp zp
$C9 cmp #
$EA nop
The Code

Code:

; Entry: A = 1..8 (A = 0 also works and gives 13 cycles)
*=$1000
        clc                 ; 2
        adc #$ff-8          ; 2  A = A + $F7
        eor #$ff            ; 2  A = $FF - (A + $F7) = 8 - A, so 7..0
        sta corr+1          ; 4  self-modifying code: 8-A becomes the BPL offset
corr    bpl *+2             ; 3  always taken (13 cycles so far), skips 8-A bytes
        cmp #$c9            ; 2  entered here (offset 0) when A = 8
        cmp #$c9            ; 2
        cmp #$c9            ; 2
        cmp $ea             ; 3  slide total = 9 (13+9 = 22 max delay)
A table of the code fragments by branch offset

Code:

Branch offset (entry point into the bytes after the BPL)
+0       +1       +2       +3       +4       +5       +6       +7       +8
-------- -------- -------- -------- -------- -------- -------- -------- --------
cmp #$c9 cmp #$c9 cmp #$c9 cmp #$c9 cmp #$c9 cmp #$c5 cmp $ea  nop
cmp #$c9 cmp #$c9 cmp #$c9 cmp #$c5 cmp $ea  nop      
cmp #$c9 cmp #$c5 cmp $ea  nop
cmp $ea  nop
-------- -------- -------- -------- -------- -------- -------- -------- --------
9        8        7        6        5        4        3        2        0
Cycles
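
The table above can be reproduced by decoding the eight bytes after the BPL from each entry offset. The Python below is only an illustrative sketch: it knows just the three opcodes from the reference list, and the names OPCODES, SLIDE, cycles_from and total_delay are made up for this example.

```python
# opcode -> (mnemonic, size in bytes, cycles)
OPCODES = {
    0xC9: ("cmp #", 2, 2),
    0xC5: ("cmp zp", 2, 3),
    0xEA: ("nop", 1, 2),
}

# Assembled form of: cmp #$c9 / cmp #$c9 / cmp #$c9 / cmp $ea
SLIDE = bytes([0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA])

SETUP_CYCLES = 2 + 2 + 2 + 4 + 3   # clc, adc #, eor #, sta abs, bpl (taken)


def cycles_from(offset, code=SLIDE):
    """Cycles executed when the CPU enters `code` at `offset` and runs off the end."""
    total = 0
    pc = offset
    while pc < len(code):
        _, size, cycles = OPCODES[code[pc]]
        total += cycles
        pc += size
    return total


def total_delay(a):
    """Whole routine for A = 0..8: setup plus the slide entered at offset 8 - A."""
    return SETUP_CYCLES + cycles_from(8 - a)


for offset in range(len(SLIDE) + 1):
    print(offset, cycles_from(offset))
```

Entering at an odd offset makes the CPU treat an operand byte as an opcode, which is why the operands are themselves $C9, $C5 and $EA.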

Expanding the concept

Code:

Range Size (bytes)
1..2  12
1..4  14
1..6  16
1..8  18
Simply grow the slide two bytes at a time, from cmp $ea to cmp #$c9 cmp $ea etc., and change the 8 in the adc constant to the new maximum.

There's a variation for 1..7 that's 15 bytes because you can use eor #7 directly.
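
The sizes in the table follow from counting bytes, as this small Python sketch shows. It assumes the setup stays clc / adc # / eor # / sta abs / bpl (10 bytes) and that the slide for a range 1..N with N even is N-2 bytes of $C9 followed by $C5 $EA. The function names are invented for the example.

```python
SETUP_BYTES = 1 + 2 + 2 + 3 + 2    # clc, adc #imm, eor #imm, sta abs, bpl


def slide_bytes(n):
    """Slide for A = 1..n (n even): cmp #$c9 repeated, ending in cmp $ea."""
    if n < 2 or n % 2:
        raise ValueError("this sketch only handles even ranges")
    return bytes([0xC9] * (n - 2) + [0xC5, 0xEA])


def routine_size(n):
    """Total size in bytes of the routine for the range 1..n."""
    return SETUP_BYTES + len(slide_bytes(n))


for n in (2, 4, 6, 8):
    print(f"1..{n}  {routine_size(n)}")
```

The 15-byte figure for 1..7 is consistent with a 7-byte setup (eor #7, sta, bpl) in front of an unchanged 8-byte slide, though that breakdown is an inference and not something stated in the post.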

Re: Delay N clocks

Posted: Thu Mar 23, 2017 11:54 pm
by repose
Apparently I did some more work on this. See
http://csdb.dk/forums/?roomid=11&topicid=65658

I made a long post in there that's a bit hard to understand, but basically I've worked out every approach to this problem I think.