A few loose ends and some additional remarks on this topic...
Have you looked at the timings of your added components to get an idea of how fast the system could run with a WDC 65c02?
Well, it's enjoyable to contemplate, Garth... Assuming the KimKlone's memory chips were also updated, things could be sped up a LOT! Timing margins are maximal, as there's a pipeline register at the output of the Control Store, so access to the Control Store overlaps with execution of the previous micro-word. One minor issue is that the bipolar PROM that aliases 011 op-codes arriving from memory isn't pipelined, so in pursuit of higher overall clock speeds it'd make sense to allow a Wait State (for PROM access) in the minority case of an op-code fetch which returns a 011 code.
Odd but I think I like the 65C02 with your extended addressing better than a 65C816.
Thanks, Rick. I have mixed feelings on this point, myself. Since my last post I've spent some time mulling over the '816 data sheet, and have become sufficiently familiar with the chip to comment.
It's safe to say that KK memory addressing is immensely better than run-of-the-mill MMU arrangements which force code and data spaces to coexist within 64K. The KK is also superior to the MOS 6509, even though both feature what an MMU lacks: the all-important ability to switch between full, undivided 64K banks on a bus-cycle by bus-cycle basis. Naturally the '816 also has this ability, but for various reasons the '816 seems a more capable device, overall, than the KimKlone.
65xx coding revolves around indirect pointers in Zero/Direct Page, and of course for 16 MByte addressing the pointers use an extra byte. The '816 features a couple of address modes (Long Indirect and Long Indirect Post-Indexed-Y) which employ three-byte pointers very efficiently (i.e., directly from Zero-page), whereas KK suffers a handicap: somewhere upstream of a Long memory access there needs to be a separate instruction, albeit a speedy one, to fetch the most-significant byte of the pointer and place it in a register.
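To make the three-byte pointer idea concrete, here's a minimal Python sketch of how a Long Indirect Post-Indexed-Y access forms its address. It's a toy model, not a description of either chip's internals: the dictionary `mem` and the functions `read_long_pointer` and `long_indirect_y` are hypothetical names, and the sketch assumes the pointer is stored low byte first, lies entirely within Zero/Direct Page, and that Y is added across the full 24 bits.

```python
def read_long_pointer(mem, zp):
    """Read a 3-byte pointer stored low byte first at zero-page address zp.

    Assumption: zp + 2 still lies inside the page, so no wrap handling is modeled.
    """
    return mem[zp] | (mem[zp + 1] << 8) | (mem[zp + 2] << 16)


def long_indirect_y(mem, zp, y):
    """Effective address = 24-bit pointer + Y, truncated to 24 bits."""
    return (read_long_pointer(mem, zp) + y) & 0xFFFFFF


# Pointer $051234 stored at zero-page locations $10..$12.
mem = {0x10: 0x34, 0x11: 0x12, 0x12: 0x05}
print(f"{long_indirect_y(mem, 0x10, 0x0020):06X}")  # prints 051254
```

In this model the bank byte comes along for free with the pointer fetch, which is the efficiency the '816 enjoys; on KK the equivalent of `mem[zp + 2]` has to be picked up by a separate instruction beforehand.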
On the other hand, it's the '816 which is at a disadvantage if you happen to need a Long address mode other than those provided. Consider, for example, a Forth implementation which uses the X register to maintain the Parameter Stack. Presumably you want to be able to accept Long addresses on the stack, but when it comes time to code the Long @ and ! words it becomes uncomfortably apparent that the '816 lacks the Long Pre-Indexed-X Indirect mode you require. The workaround is a two-step operation that begins by updating the Data Bank Register -- more or less as KK does! The difference is that KK has a good selection of instructions for doing this, whereas the only way to update the '816's DBR is with a Pull instruction -- a very clumsy maneuver in this case.
(Because of this issue, if I were writing an '816 Forth I would lean heavily toward using SP, not X, as the P-Stack pointer. Another advantage of using SP would be the applicability of Stack Relative Indirect Indexed mode, a boon directly analogous to KimKlone's "X-Indirect-Y" mode. This is a mode that features pre-indexing, indirection and post-indexing.)
On another point, I admire the '816's ability to manage indexed address calculations that result in Bank crossings -- in other words, actual 24-bit addition, built right into the address mode. (KK must explicitly compute such addresses up front, either in Forth or m/c code.) However, I find the '816 data sheet somewhat ambiguous as to which address modes accommodate Bank crossings and which merely wrap around in the low-order 16 bits. (Is there a better reference document available online? What about a simulator? I don't have an '816 to run tests on!)
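To spell out the distinction I'm drawing, here's a small Python sketch of the two possible behaviors. It deliberately says nothing about which '816 address modes behave which way (that's exactly what I'm unsure of); `add_crossing` and `add_wrapping` are just hypothetical names for the two kinds of arithmetic.

```python
def add_crossing(bank, base, index):
    """Full 24-bit addition: a carry out of the low 16 bits bumps the bank."""
    return ((bank << 16) + base + index) & 0xFFFFFF


def add_wrapping(bank, base, index):
    """16-bit addition: the carry is discarded and the bank is unchanged."""
    return (bank << 16) | ((base + index) & 0xFFFF)


bank, base, index = 0x02, 0xFFF0, 0x0020
print(f"{add_crossing(bank, base, index):06X}")  # prints 030010
print(f"{add_wrapping(bank, base, index):06X}")  # prints 020010
```

The two results only differ when base + index overflows 16 bits, which is why the ambiguity is easy to overlook until a data structure happens to straddle a Bank boundary.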
That sums up KK versus the '816. Briefly I'll mention a third point of comparison, one that I learned about right here on 6502.org. In the Hardware -> 6502 with 3-byte addressing topic, BigEd mentions a design which "... uses page 03 as extra indirection bytes, so that indirect and indirect,Y opcodes take an extra cycle to fetch a byte from &0301 + zp_offset to yield a 24-bit address..." I like this idea, and would probably take a similar tack if I ever had to design KK over again. As BigEd points out,
The nice thing about extended pointers is that you can have lots of them: one or two bank registers isn't so convenient.
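As I read the quoted description, the address formation could be sketched in Python as below. Only the "&0301 + zp_offset" location of the extra byte comes from BigEd's post; the rest is my assumption for illustration, including the names `BANK_BYTE_BASE` and `extended_indirect_y`, the low-byte-first 16-bit pointer in Zero Page, and the choice to let the Y index carry into the bank.

```python
BANK_BYTE_BASE = 0x0301  # per the quoted post: extra byte lives at &0301 + zp_offset


def extended_indirect_y(mem, zp_offset, y=0):
    """Form a 24-bit address from a 2-byte zero-page pointer plus a page-03 byte."""
    lo = mem[zp_offset]
    hi = mem[zp_offset + 1]
    bank = mem[BANK_BYTE_BASE + zp_offset]  # the extra-cycle fetch
    # Assumption: the index is added across all 24 bits.
    return (((bank << 16) | (hi << 8) | lo) + y) & 0xFFFFFF


# Two independent extended pointers, each with its own bank byte.
mem = {
    0x20: 0x00, 0x21: 0x80, 0x0321: 0x07,  # pointer at $20 -> $078000
    0x22: 0x10, 0x23: 0x40, 0x0323: 0x01,  # pointer at $22 -> $014010
}
print(f"{extended_indirect_y(mem, 0x20, 4):06X}")  # prints 078004
print(f"{extended_indirect_y(mem, 0x22):06X}")     # prints 014010
```

Every Zero Page pointer gets its own bank byte here, so nothing has to be reloaded when switching between pointers.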