Ultra-fast output port using 65C02 illegal instructions
Posted: Fri Oct 07, 2011 9:04 am
Here's a cool way to provide a 65C02 system with an ultra-fast, 5-bit output port.
There are some limitations -- beginning with the fact it only works for 65C02 -- but the speed utterly buries any conventional approach! The total program overhead to write to a port is a single cycle -- twice as fast as the purported 2-cycle minimum for a 65xx instruction. With a 14 MHz system doing 14 mega-operations per second you could be very brisk indeed when talking to SPI, to name just one example!
The key lies in the use of "illegal" (actually just NOP) opcodes in the _3 and _B columns of the opcode map.
As drawn below, the circuit responds to the one-cycle NOPs in both the _3 and _B columns of the opcode map. The lowest three bits of these opcodes always equal 011. And the high 5 bits are "Don't Care"... to the CPU, but not to us!
All 32 of the _3 and _B column opcodes are single-cycle NOPs (only 30 of them on WDC parts; see note below). These 1-byte, 1-cycle NOPs produce no dead cycles and are free of bizarre effects like those seen for undefined opcodes on NMOS. When a Col_3 or Col_B NOP appears in the instruction stream, the 'C02 will instantly swallow it and proceed to fetch another opcode on the immediately subsequent cycle. They truly are one-cycle NOPs, and all we have to do is...
- detect them, and
- capture the Don't Care bits for our own use.
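To make the bit layout concrete, here's a rough software model of those two steps. It's only a sketch: the function names are mine, not part of the circuit, and it says nothing about how the hardware decides that a byte on the bus is an opcode rather than an operand. It just shows the arithmetic: the low three bits (011) select the port, and the high five bits are the data.

```python
# Software model of the decode only. All names here are illustrative
# assumptions; the real work is done by the circuit, not by code.

def is_port_opcode(opcode: int) -> bool:
    """True when the low three bits are 011, i.e. column _3 or _B."""
    return (opcode & 0b111) == 0b011

def port_value(opcode: int) -> int:
    """The five Don't Care bits (D7..D3), read as a 5-bit number."""
    return (opcode >> 3) & 0b11111

def opcode_for(value: int) -> int:
    """Opcode byte that would place `value` (0..31) on the port."""
    if not 0 <= value <= 31:
        raise ValueError("port value must fit in 5 bits")
    return (value << 3) | 0b011

# Every opcode the model treats as a port write.
ALL_PORT_OPCODES = [op for op in range(256) if is_port_opcode(op)]
```

So, for instance, opcode_for(0) gives $03 and opcode_for(31) gives $FB, and the list ALL_PORT_OPCODES holds the 32 opcodes of the two columns.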
Notice that this is not memory-mapped IO; no address space or address decoding is used.
One note regarding the 5-bit port:
- unlike Rockwell's 'C02, the WDC W65C02S has 2 legitimate opcodes in the _B column: $CB (WAI) and $DB (STP). When using a WDC chip you'll probably wanna avoid executing those. And you may choose to avoid all of Column B, aiming for Column 3 only. This yields a 4-bit (not 5-bit) output port.
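As a quick check on the WDC restriction, the sketch below works out which 5-bit port values land on $CB and $DB, and how a value maps to an opcode if you stick to Column 3. The names are hypothetical, and I'm assuming the 4-bit value simply sits in the opcode's high nibble.

```python
# The two _B column opcodes that are real instructions on the WDC W65C02S.
WDC_RESERVED = {0xCB: "WAI", 0xDB: "STP"}

def unsafe_5bit_values() -> list:
    """5-bit port values whose opcode is WAI or STP on a WDC chip."""
    return sorted(op >> 3 for op in WDC_RESERVED)

def column3_opcode(value: int) -> int:
    """Column 3 opcode for a 4-bit value (0..15), held in the high nibble."""
    if not 0 <= value <= 15:
        raise ValueError("Column 3 alone carries only 4 bits")
    return (value << 4) | 0x3

# Sanity check: no Column 3 opcode collides with WAI or STP.
assert all(column3_opcode(v) not in WDC_RESERVED for v in range(16))
```

In other words, on a WDC part the 5-bit values 25 and 27 are the ones to stay away from, and the Column 3 opcodes run from $03 to $F3.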
Notes pertaining to the 5-bit port:
- the value that's output doesn't come from a CPU register. That's OK if you know in advance (i.e., at assembly time) what value to output. But if the values are computed at run-time then you'll need to use self-modifying code or some sort of Case construct.
- WDC details all the 65C02 NOPs, including multi-cycle NOPs and the one-cycle NOPs used here, in Table 7-1 of their 'C02 data sheet. From experiments I performed, I know all NOPs on the Rockwell 'C02 have the same behavior as WDC's, and it seems a virtual certainty that 'C02s from Synertek and other manufacturers are the same.
- Interrupts will not be recognized while a one-cycle NOP (or a string of such NOPs) is executing. Interrupts are recognized on the first non-one-cycle instruction that follows. More info here.
- For a fast port scheme that allows input as well as output, visit Garth's circuit potpourri page, here.
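For the run-time case mentioned in the notes above, one possible shape for a Case construct is a table of tiny stubs, one per port value, each holding the matching one-cycle NOP followed by RTS ($60). The sketch below only generates the table bytes; the two-byte stub layout and the names are my own assumptions, and the 65C02 code that would index into the table is left out. Naturally the dispatch costs far more than the single cycle of the NOP itself.

```python
RTS = 0x60  # 6502 RTS opcode

def build_stub_table() -> bytes:
    """Bytes for 32 two-byte stubs: [one-cycle port NOP, RTS].

    Hypothetical layout: stub n outputs port value n. On a WDC chip
    the stubs holding $CB and $DB would be WAI and STP, so those two
    entries must not be called there.
    """
    table = bytearray()
    for value in range(32):
        opcode = (value << 3) | 0b011  # high 5 bits = data, low 3 = 011
        table += bytes([opcode, RTS])
    return bytes(table)

def stub_offset(value: int) -> int:
    """Byte offset of the stub for `value` from the start of the table."""
    if not 0 <= value <= 31:
        raise ValueError("port value must fit in 5 bits")
    return 2 * value
```

The whole table comes to 64 bytes, and the stub for value n starts at offset 2*n.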
-- Jeff