Right, I think it was implicit in the original spec that the signals would be slow enough to make a software routine viable; otherwise a software routine would not have been requested. For faster signals you could build hardware that performs a similar function, and the corresponding software would just reset it and then watch its status. Having written a software routine, you can then analyse it to see how fast it runs and whether that's fast enough in practice.
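As a rough illustration of that hardware-assisted split, the software side might shrink to something like the sketch below. The registers hw_clear and hw_done and the routine name are made up for the example: it assumes a device whose counters are reset by any write to hw_clear, and whose hw_done register shows a 1 for each input that has seen enough transitions.

```asm
hw_clear = $xxxx        ; hypothetical: any write resets the hardware counters
hw_done  = $xxxx        ; hypothetical: bit n = 1 once input n has finished

WaitPortHardware:       ; hypothetical name
        STZ hw_clear    ; reset the hardware
:       LDA hw_done     ; poll its status
        CMP #$FF        ; all eight inputs finished?
        BNE :-
        RTS
```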
As for detecting ten transitions…
*cracks knuckles*
Code:
port  = $xxxx
zp    = $zz
prev  = zp
trans = zp              ; 1-based array of ten
first = trans+1
tenth = trans+10

WaitPortTenTrans:
        ; initialise
        LDA port
        STA prev
        LDX #10
:       STZ trans,X
        DEX
        BNE :-
        ; main loop
:       LDA port
        TAY
        EOR prev
        STY prev
        TAY             ; mask of changed bits
        LDX #9
:       AND trans,X     ; bits that have already changed this many times
        ORA first,X     ; set them in the next condition
        STA first,X
        TYA             ; return to bits just changed
        DEX
        BNE :-
        TSB first       ; record first transitions
        LDX tenth       ; now test to see if all final flags set
        INX
        BNE :--
        RTS
Obviously this is bigger and significantly slower than a routine which detects only two transitions per bit. It suffers slightly from the fact that TSB/TRB do not have indexed addressing modes, so I had to replace one TSB with an ORA/STA pair. I count about 200 cycles per sample; at 8 MHz that is 25 µs, and since each high or low phase has to last at least one sample period to be caught, fmax of the input signals would be 20 kHz.
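For anyone who wants to check the arithmetic, here is the main loop again with 65C02 cycle counts in the comments. It assumes port sits at an absolute address, everything else is in zero page, and no branch crosses a page boundary. On those assumptions my tally lands within a cycle of the figure quoted above.

```asm
:       LDA port        ; 4
        TAY             ; 2
        EOR prev        ; 3
        STY prev        ; 3
        TAY             ; 2
        LDX #9          ; 2   setup subtotal: 16
:       AND trans,X     ; 4
        ORA first,X     ; 4
        STA first,X     ; 4
        TYA             ; 2
        DEX             ; 2
        BNE :-          ; 3 taken, 2 on exit; 9 passes: 9*19 - 1 = 170
        TSB first       ; 5
        LDX tenth       ; 3
        INX             ; 2
        BNE :--         ; 3   tail subtotal: 13
                        ; total: 16 + 170 + 13 = 199
```

Either way it rounds to 25 µs per sample at 8 MHz.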
More speed could be obtained by unrolling the inner loop, which would remove 46 cycles of loop overhead and permit using TSB again, at the expense of code size. I think that brings the total down to 120 cycles per loop, so that's significantly faster.
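A sketch of what the unrolled main loop might look like, using ca65's .repeat to generate the nine stages rather than typing them out (this assumes ca65, as the unnamed labels suggest; with another assembler you would write the stages by hand). Since the addresses are no longer indexed, each ORA/STA pair can go back to being a TSB. Counting the same way as before, I make this roughly 117 cycles per pass, in line with the estimate above, but treat that as unverified.

```asm
        ; main loop, inner loop unrolled
:       LDA port
        TAY
        EOR prev
        STY prev
        TAY                 ; Y = mask of bits that just changed
.repeat 9, I                ; I = 0..8, so stages run from 9 down to 1
        AND trans+9-I       ; changed bits that already have (9-I) transitions
        TSB trans+10-I      ; promote them to the next level
        TYA                 ; restore the change mask
.endrepeat
        TSB first           ; record first transitions
        LDX tenth
        INX                 ; $FF -> 0 once every bit has ten transitions
        BNE :-
        RTS
```

The stages still have to run from the highest level down, otherwise a single transition would ripple through every level in one pass.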
Another approach, which saves ZP space now that there are more transitions than port bits, is to keep an individual counter per bit:
Code:
port    = $xxxx
zp      = $zz
prev    = zp
waiting = zp+1          ; bitmask
trans   = zp+1          ; 1-based array of eight; unused element 0 is the waiting byte
last    = trans+8

WaitPortTenTrans:
        ; initialise
        LDA port
        STA prev
        LDA #$FF
        STA waiting
        LDA #10
        LDX #8
:       STA trans,X
        DEX
        BNE :-
        ; main loop
:       LDA port
        TAY
        EOR prev
        STY prev
        LDX #8
:       ASL A           ; test each bit in turn for a transition
        BCC :++
        DEC trans,X     ; decrement corresponding counter, and check if it reached zero
        BNE :++
        ; somewhat nasty routine to clear corresponding bit in waiting mask
        PHX
        PHA
        LDA #0
        SEC
:       ROL A           ; after X rotates, A = 1 << (X-1), the bit counter X belongs to
        DEX
        BNE :-
        TRB waiting
        PLA
        PLX
:       DEX             ; move on to the next bit
        BNE :---
        LDX waiting     ; still waiting?
        BNE :----
        RTS
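If the mask-building loop bothers you, one possible alternative is an eight-byte lookup table, sketched below. The label bitmask is my own invention, and the table is ordered so that counter X pairs with bit X-1, which is the bit the ASL loop is testing when X has that value. X is no longer disturbed, so the PHX/PLX pair goes away too.

```asm
        ; in place of the PHX ... PLX sequence:
        PHA                 ; keep the partly shifted change mask
        LDA bitmask-1,X     ; X = 1..8 picks the mask for bit 0..7
        TRB waiting         ; this bit is done
        PLA

        ; somewhere outside the routine (hypothetical table):
bitmask:
        .byte $01,$02,$04,$08,$10,$20,$40,$80
```

By my count that is 16 cycles regardless of X (assuming the table does not straddle a page boundary), against somewhere between 29 and 78 for the rotate loop, at the cost of eight bytes of table.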
Timing analysis is more difficult here because the path through the loop body now depends on the data. But this example can be used to test for an arbitrary number of transitions on all bits, up to the capacity of a one-byte counter. A simpler version could just count the total number of bit transitions.
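A minimal sketch of that simpler version, reusing port and prev from the listings above. The routine and label names are hypothetical, and the count of ten is just carried over from the other examples. It returns once ten transitions have been seen in total, on any mix of bits, and keeps the count in X so it needs no extra ZP.

```asm
WaitPortTotalTrans:         ; hypothetical name
        LDA port
        STA prev
        LDX #10             ; transitions still to see
Sample: LDA port
        TAY
        EOR prev            ; A = mask of bits that changed since last sample
        STY prev            ; STY leaves the flags from EOR intact
        BEQ Sample          ; nothing changed, sample again
NextBit:
        ASL A               ; C = next change flag, Z = no flags left in A
        BCC Skip
        DEX                 ; count this transition
        BEQ Done
        CMP #0              ; DEX overwrote Z, so re-test A
Skip:   BNE NextBit         ; more change flags to look at
        BRA Sample
Done:   RTS
```

Any further transitions that arrive in the same sample as the final one are simply ignored.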