GARTHWILSON wrote:
as long as you know the state of the clock pin, and it's on bit 0, you can cycle it without affecting the other bits using just INC<port> and DEC<port>.
GARTHWILSON wrote:
For MISO (the master-in, slave-out data line), it's nice to put it on bit 6 or 7 of a port so you can use the BIT instruction on it
I'll go slightly OT here and expand on Garth's point. It's definitely worth paying attention to which bits are assigned for the various functions!
Yes, by all means attach the SPI clock to bit0 of the port, because INC<port> (or DEC<port>) is usually faster than the instructions it would take to set (or clear) the clock if it were attached to some other bit. As for MISO (which needs testing whenever the Master is inputting), I recommend attaching it to bit7. Bit6 is special only if you use the BIT instruction -- and it turns out the BIT instruction can be optimized away! I didn't quite believe it at first, and had to give my head a shake. But I've successfully tested code equivalent to the following.
Here's how the port bits are used in the snippets below. Bit1 (MOSI) was chosen arbitrarily; the other two were not.
- bit7 of VIAPORT is in input mode -- attaches to MISO
- (bits available for other uses, especially as inputs)
- bit1 of VIAPORT is in output mode -- attaches to MOSI
- bit0 of VIAPORT is in output mode -- attaches to Ck
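As a quick sanity check of the INC/DEC point, here is a small Python model (not 6502 code; `inc8` and `dec8` are names made up for this illustration). It assumes an 8-bit port with the clock on bit0, and checks that INC touches only bit0 when the clock is known to be 0, and DEC touches only bit0 when the clock is known to be 1.

```python
def inc8(value):
    """Model of INC on an 8-bit location: add 1, wrap at 256."""
    return (value + 1) & 0xFF

def dec8(value):
    """Model of DEC on an 8-bit location: subtract 1, wrap at 256."""
    return (value - 1) & 0xFF

# Clock known to be 0: INC sets bit0 only, whatever the other bits hold.
inc_safe = all(inc8(v) == (v | 0x01) for v in range(256) if v & 0x01 == 0)

# Clock known to be 1: DEC clears bit0 only.
dec_safe = all(dec8(v) == (v & 0xFE) for v in range(256) if v & 0x01)

# If the clock is NOT in the state you assumed, the carry ripples upward.
ripple_example = inc8(0b00000001)  # bit0 was already 1, so bit1 gets disturbed

print(inc_safe, dec_safe, bin(ripple_example))
```

The last line is the reason for the "as long as you know the state of the clock pin" caveat: the trick only works when bit0 is in the state the code expects.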
The snippets are written as if part of a subroutine, but I omitted details managed by the caller or preceding inline code -- the critical portion is the loop. Firstly we have code for inputting a byte from SPI. As noted in the comments, the INC instruction does two things -- it performs input as well as output.
This makes a BIT instruction unnecessary -- and, in a loop this tight, omitting one instruction means a substantial speedup, percentage-wise. I've marked the cycle counts for the instructions which execute repeatedly. (If your VIA is mapped in zero page, subtract 2 cycles per bit from the times shown, since STZ and INC each take one cycle less with zero-page addressing.)
Code:
SPIBYTEIN:          LDA #1          ;LDA #1 is for counting
INPUTLOOP:   4      STZ VIAPORT_IO  ;set Ck=0, mosi=0
             6      INC VIAPORT_IO  ;set Ck=1   INC DOES 2 THINGS (sets Ck; also updates N flag per MISO)
             2/3    BMI MISO_IS_1
             2      CLC             ;MISO is =0
             2      ROL A
             3      BCC INPUTLOOP   ;more bits?
                    RTS
MISO_IS_1:   2      SEC             ;MISO is =1
             2      ROL A
             3      BCC INPUTLOOP   ;more bits?
                    RTS
           19/20  <----- cycles per bit
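For anyone who wants to see the two tricks in the routine above without a 6502 at hand, here is a rough Python model of the input loop. It is only a sketch: `spi_byte_in` and its argument are invented for this illustration, and the model assumes the slave presents each MISO bit before INC reads the port. It shows how the 1 loaded by LDA #1 acts as a sentinel that drops into carry after exactly eight shifts, and how the result of INC carries MISO in bit7 (which is what the N flag reflects).

```python
def spi_byte_in(miso_bits):
    """Model of SPIBYTEIN. miso_bits: iterable of 0/1 values, MSB first."""
    miso = iter(miso_bits)
    a = 1                # LDA #1: sentinel bit that counts the 8 shifts
    clocks = 0
    while True:
        port = 0                                # STZ: Ck=0, MOSI=0
        port_read = port | (next(miso) << 7)    # read half of INC sees MISO on bit7
        result = (port_read + 1) & 0xFF         # write half of INC sets Ck=1
        n_flag = result >> 7                    # N flag = bit7 of result = MISO
        clocks += 1
        # CLC or SEC followed by ROL A: shift the received bit into A
        a = (a << 1) | n_flag
        carry = a >> 8                          # old bit7 of A lands in carry
        a &= 0xFF
        if carry:                               # BCC falls through: sentinel is out
            return a, clocks

print(spi_byte_in([1, 0, 1, 0, 0, 1, 0, 1]))    # (165, 8), i.e. $A5 after 8 clocks
```

Because the sentinel doubles as the loop counter, no index register or separate DEX/DEY is needed inside the loop.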
Edit: here is an improved version, described in my subsequent post. BTW both of these input routines treat the output data as don't-care (MOSI is held at 0).
Code:
SPIBYTEIN:          LDA #1          ;LDA #1 is for counting
INPUTLOOP:   4      STZ VIAPORT_IO  ;set Ck=0, mosi=0
             6      INC VIAPORT_IO  ;set Ck=1   INC DOES 2 THINGS (sets Ck; also updates N flag per MISO)
             2/3    BPL MISO_IS_0
             2      SEC             ;MISO is =1
             2      ROL A
             3      BCC INPUTLOOP   ;more bits?
                    RTS
MISO_IS_0:   2      ASL A           ;MISO is =0
             3      BCC INPUTLOOP   ;more bits?
                    RTS
           19/18  <----- cycles per bit
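The 19/18 figure can be re-derived by adding up the per-instruction counts from the listing. The Python snippet below does that arithmetic and converts it to a raw bit rate. It is a back-of-envelope sketch: the 1 MHz clock is just an assumed example value, and call overhead (LDA #1, RTS, the final untaken BCC) is ignored.

```python
# Cycle counts copied from the listing (absolute addressing).
PATH_MISO_1 = {"STZ": 4, "INC": 6, "BPL (not taken)": 2,
               "SEC": 2, "ROL A": 2, "BCC (taken)": 3}
PATH_MISO_0 = {"STZ": 4, "INC": 6, "BPL (taken)": 3,
               "ASL A": 2, "BCC (taken)": 3}

def cycles(path):
    """Total cycles for one pass through the loop along the given path."""
    return sum(path.values())

def bits_per_second(cycles_per_bit, cpu_hz=1_000_000):
    """Raw SPI bit rate, assuming the CPU does nothing but run the loop."""
    return cpu_hz / cycles_per_bit

for name, path in (("MISO=1", PATH_MISO_1), ("MISO=0", PATH_MISO_0)):
    n = cycles(path)
    print(f"{name}: {n} cycles/bit, about {bits_per_second(n):,.0f} bits/s at 1 MHz")
```

The saving over the first version comes entirely from the MISO=0 path, where a single ASL A replaces the CLC / ROL A pair.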
And here (below) is the routine for outputting a byte to SPI. These routines are the fastest I've managed to achieve so far, but suggestions are welcome of course. (I guess I could unroll the loops... )
Code:
SPIBYTEOUT:         LDY #2          ;Y is used to hold a constant.
                    SEC             ;SEC / ROL A is for counting
                    ROL A
OUTPUTLOOP:  2/3    BCS MOSI_1
             4      STZ VIAPORT_IO  ;ck=0, mosi=0   STZ updates both Ck & mosi
             6      INC VIAPORT_IO  ;ck=1
             2      ASL A
             3      BNE OUTPUTLOOP  ;more bits?
                    RTS
MOSI_1:      4      STY VIAPORT_IO  ;ck=0, mosi=1   STY updates both Ck & mosi
             6      INC VIAPORT_IO  ;ck=1
             2      ASL A
             3      BNE OUTPUTLOOP  ;more bits?
                    RTS
           17/18  <----- cycles per bit
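The counting trick in the output routine is a little less obvious than in the input routine, so here is a rough Python model of it (again only a sketch; `spi_byte_out` is a name invented for this illustration). SEC / ROL A moves the first data bit into carry and plants a sentinel 1 in bit0 of A. Each ASL A then moves the next data bit into carry, and A becomes zero only when the sentinel itself has been shifted out, which happens after the eighth bit.

```python
def spi_byte_out(value):
    """Model of SPIBYTEOUT. Returns the MOSI bits in the order they are clocked out."""
    # SEC / ROL A: data bit7 moves to carry, sentinel 1 enters bit0
    carry = (value >> 7) & 1
    a = ((value << 1) | 1) & 0xFF
    mosi_bits = []
    while True:
        port = 0b10 if carry else 0b00      # STY (Y=2) or STZ: Ck=0, MOSI=carry
        port += 1                           # INC: Ck=1, MOSI unchanged
        mosi_bits.append((port >> 1) & 1)   # what the slave sees on MOSI
        carry = a >> 7                      # ASL A: next data bit moves to carry
        a = (a << 1) & 0xFF
        if a == 0:                          # BNE falls through: sentinel has left A
            return mosi_bits

print(spi_byte_out(0xA5))                   # [1, 0, 1, 0, 0, 1, 0, 1]
```

Note that the loop works even for a data byte of $00, because the sentinel keeps A nonzero until all eight bits have gone out.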
In another post I talk about driving a SC16IS750 UART with this code. Even a 1 MHz 6502 can bit-bang fast enough to achieve 19.2 or even 38.4 kbaud on the async connection -- and of course faster CPUs exceed this figure.