After many false starts and a lot of learning and trawling here (starting from viewtopic.php?t=1674), I finally managed to get the VIA shift register (shifting out under the phi2 clock) talking to my SD card in SPI mode 0 without bit-banging!
I thought I'd share here in case it's useful for others, or if anyone has ideas for improvements. My current code runs at 22 cycles per byte, but I think it could go slightly faster (question below). I guess the theoretical max is 16 cycles/byte, since the VIA clocks out one bit every two phi2 cycles, but I'm not sure you could actually code the read loop that tightly?
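For anyone who wants to check the arithmetic, here is a quick Python tally of those figures. It is only a sketch: the per-instruction counts are copied from the comments in my listing, the taken-branch cost of 3 cycles assumes no page crossing, and the per-sector totals ignore the few extra cycles spent at the page boundary. The names (INNER_LOOP, loop_cycles and so on) are made up for illustration.

```python
# Cycle counts for the inner read loop, copied from the listing's own comments.
INNER_LOOP = [
    ("lda DVC_DATA", 4),
    ("stx VIA_SR", 4),
    ("sta (sd_bufp),y", 6),
    ("DELAY3", 3),
    ("iny", 2),
    ("bne @next (taken)", 3),  # assumes the branch does not cross a page
]

PHI2_CYCLES_PER_BIT = 2  # the VIA shifts one bit every two phi2 cycles here
BITS_PER_BYTE = 8


def loop_cycles(loop):
    """Total cycles for one pass through the loop."""
    return sum(cycles for _, cycles in loop)


def shift_floor():
    """Fewest cycles the shift register itself needs to move one byte."""
    return PHI2_CYCLES_PER_BIT * BITS_PER_BYTE


def sector_cycles(cycles_per_byte, sector_bytes=512):
    """Rough cycles per sector, ignoring page-boundary overhead."""
    return cycles_per_byte * sector_bytes


if __name__ == "__main__":
    print("inner loop:", loop_cycles(INNER_LOOP), "cycles/byte")
    print("shift floor:", shift_floor(), "cycles/byte")
    print("512-byte sector now:", sector_cycles(loop_cycles(INNER_LOOP)))
    print("512-byte sector at floor:", sector_cycles(shift_floor()))
```

So the most a perfectly tight loop could save is 6 cycles per byte, a bit over a quarter of the current transfer time.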
The schematic sketches my hardware setup, with a timing diagram showing my current understanding. I originally thought I only needed half a phi2 cycle of delay on CB1, but as I learned from my scope, that doesn't work because of how CB2 lags CB1. Note that RCLK in my h/w is just wired directly to phi2, not what's shown in the timing diagram (again, see below). I learned that the pullup resistors are important (I'm surprised they're not built into the breakout board?) and that an SD card draws a lot of current: I had to raise the current cutoff on my bench supply.
Basically the VIA SR writes a byte out to the SD card while the SD card sends a byte in to a shift register mapped to VIA port A, both clocked by the delayed and inverted CB1 signal. It took a while for me to understand that SPI doesn't really have a separate notion of read vs write - it always just exchanges a byte each way with a pre-agreed notion of which side of the swap is currently meaningful (I guess in principle useful data could actually be exchanged in both directions concurrently?).
When the VIA is "reading", by convention it writes #$ff to trigger the exchange and looks at the byte arriving at port A; when it's "writing", it writes the actual data byte and ignores the incoming data. My block read is shown below (full code on github). The inner page read section runs in 22 cycles per byte.
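To make the "it's always a swap" idea concrete, here is a minimal Python model of one SPI byte exchange. It is a sketch of the protocol only, not of my hardware or of the VIA: the function names (spi_exchange, spi_read, spi_write) are hypothetical, and all clock timing is ignored. Bits go MSB first, as SD cards expect.

```python
def spi_exchange(master_out, slave_out):
    """Swap one byte each way, MSB first.

    Returns (byte the master received, byte the slave received).
    """
    master_in = 0
    slave_in = 0
    for bit in range(7, -1, -1):
        mosi = (master_out >> bit) & 1  # bit the master drives this clock
        miso = (slave_out >> bit) & 1   # bit the slave drives this clock
        slave_in = (slave_in << 1) | mosi
        master_in = (master_in << 1) | miso
    return master_in, slave_in


def spi_read(card_byte):
    """A "read": send $FF as filler and keep whatever comes back."""
    master_in, _ = spi_exchange(0xFF, card_byte)
    return master_in


def spi_write(data, card_byte=0xFF):
    """A "write": send real data and discard the incoming byte.

    Returns what the card received, just so the model can be checked.
    """
    _, card_in = spi_exchange(data, card_byte)
    return card_in


if __name__ == "__main__":
    print(hex(spi_read(0xA5)))   # master sees the card's byte
    print(hex(spi_write(0x40)))  # card sees the master's byte
```

Both helpers are the same exchange underneath; the only difference is which returned half gets kept.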
I've currently got the external shift register RCLK wired to phi2, making it update continuously at port A. This means I need to be careful to read the incoming data after it arrives but before I trigger the next incoming byte (I leave 18 cycles between, plus the LDA time gives 22 cycles).
But if RCLK instead triggered only once every eight bits, then the byte would stay latched at port A even while the next byte was arriving. Then I could safely trigger the next byte exchange *before* reading the value of the previous one, in a tighter loop of 16 to 18 cycles total (I'm still fuzzy about the timing between writing the VIA SR and the CB1 clock starting).
Is there some clever way to do that? It seems like you'd often want an 8-bit shift register to behave that way (wait for all eight bits to arrive, then latch to the outputs). I guess a 3 (or 4) bit counter could count falling SCK edges and generate a rising edge just after the last bit arrives, as shown in the last line of the timing diagram?
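Here is a small Python model of that counter idea, to show what port A would see in each case. It is a behavioral sketch under some loud assumptions: the external register is treated as a shift stage plus a separate output latch copied on RCLK, input bits are sampled on the rising SCK edge, and "RCLK wired to phi2" is simplified to "latch after every bit". The class and function names are hypothetical, and no real propagation delays are modeled.

```python
class ShiftIn:
    """8-bit serial-in register with a separate output latch (RCLK)."""

    def __init__(self, latch_every_bit):
        self.shift = 0            # internal shift stage
        self.output = 0           # latched value visible at port A
        self.falling_edges = 0    # what the 3-bit counter would count
        self.latch_every_bit = latch_every_bit

    def clock_bit(self, miso):
        # Rising SCK edge: sample the incoming bit (SPI mode 0).
        self.shift = ((self.shift << 1) | (miso & 1)) & 0xFF
        # Falling SCK edge: advance the counter, maybe pulse RCLK.
        self.falling_edges += 1
        if self.latch_every_bit or self.falling_edges % 8 == 0:
            self.output = self.shift


def outputs_during(reg, byte):
    """Shift one byte in MSB first; return the output seen after each bit."""
    seen = []
    for bit in range(7, -1, -1):
        reg.clock_bit((byte >> bit) & 1)
        seen.append(reg.output)
    return seen


if __name__ == "__main__":
    for label, every_bit in (("continuous RCLK", True), ("counter RCLK", False)):
        reg = ShiftIn(latch_every_bit=every_bit)
        outputs_during(reg, 0xA5)        # first byte arrives
        seen = outputs_during(reg, 0x3C)  # second byte arrives
        print(label, [hex(v) for v in seen])
```

With the counter, the output holds the previous byte for the first seven bits of the next transfer and only changes on the eighth falling edge, which is the window a tighter loop would need for its read.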
Not sure I actually care enough about saving the extra cycles to fight with it, but curious if it's feasible?
Next I'm planning to hook up the ProDOS kernel and try booting from the SD card. Fun, fun...
Code:
        ; now read 512 bytes of data
        ; unroll first loop step to interpose indexing stuff between write/write
        ldx #$ff
        bit sd_cmd0         ; set overflow as page 0 indicator (all cmd bytes have bit 6 set)
        stx VIA_SR          ; 4 cycles      trigger first byte in
        DELAY12             ; 12 cycles
        ldy #0              ; 2 cycles      byte counter
@next:  lda DVC_DATA        ; 4 cycles      read the byte that just arrived
        stx VIA_SR          ; 4 cycles      trigger next byte
        sta (sd_bufp),y     ; 6 cycles
        DELAY3              ; 3 cycles
        iny                 ; 2 cycles
        bne @next           ; 2(+1) cycles  22 per byte when taken
        inc sd_bufp+1       ; advance buffer pointer to the next page
        bvc @crc            ; second page? (overflow already clear means we're done)
        clv                 ; clear overflow for second page
        bra @next