32-bit successor to 6502
There have been many attempts to design both direct and “spiritual” successors to the legendary 6502 processor. In most cases these designs fall into two broad categories:
A. Designs that simply widen the existing registers to 32 bits, with some even retaining an 8-bit wide bus (WDC's proposal for a 65832 is an example of this design). These designs rarely add more than one or two registers and adhere fairly closely to the original accumulator-memory design.
B. Expanded designs with more general purpose registers, but having a broadly similar instruction set with familiar mnemonics and additional instructions. These designs tend to end up looking like a “CISC-lite” design with clever instruction encoding and only a few instruction lengths.
Within these two categories you will find that the proposals differ greatly on the following points:
• Instruction width and encoding
• Number of additional registers, if any
• Number of new addressing modes, if any
• New instructions, if any
• Fixed purpose vs. general purpose registers
• External bus width
• Addressing modes
• Width of the registers
• Level of backwards compatibility, if any
• Byte addressable or Word addressable
• Pipelining, if any
• Caches, if any
Floating point and memory management also vary between designs, but many designs leave them out entirely or state that these features can be added later.
Variations in design are also driven by a number of factors:
• Designer preference
• Target language for primary users (Forth, C, Assembly, etc.)
• General purpose computing vs. embedded computing
• Linux Support
• Multitasking support
• Target cost point
• Optimize for low latency/context switching/interrupts
With so much to consider, it is no wonder there are so many divergent ideas on how to move forward with 65XX.
My Opinion
In my opinion, approaches that adhere more closely to the original accumulator-memory design of the 6502 are more interesting and potentially more applicable to the embedded market.
Adding a bunch of general purpose registers adds size and complexity (dual port, triple port, etc.) to the register file. Turning the 6502 into a clever but efficient CISC architecture feels less interesting to me and would lump the 6502 in with other efficient CISC architectures. Why build another Motorola ColdFire?
I know folks love their registers, and their C-compilers, but it is not for me.
But what if we could design a 65XX architecture that was easy to target with C, but still stuck to a design with just five 32-bit registers?
My Design (from 1000 feet) – YA326502
A ........................ 32-bit Accumulator
X ........................ 32-bit Index/Data Register
Y ........................ 32-bit Index/Data Register
SP0, SP1, SP2, SP3 ....... 32-bit (4 x 8-bit stack pointers)
PC ....................... 32-bit Program Counter
SR ....................... 8-bit Status Register
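For anyone who wants to poke at this in a simulator, here is a rough sketch of the register set as a C struct. The type and function names are made up for illustration, and the packing order of SP0..SP3 inside the 32-bit stack pointer register (SP0 in the low byte) is an assumption, not something I have settled.

```c
#include <stdint.h>

/* Hypothetical model of the register list above; all names are illustrative. */
typedef struct {
    uint32_t a;      /* 32-bit accumulator */
    uint32_t x;      /* 32-bit index/data register */
    uint32_t y;      /* 32-bit index/data register */
    uint8_t  sp[4];  /* SP0..SP3: four 8-bit stack pointers, 32 bits in total */
    uint32_t pc;     /* 32-bit program counter */
    uint8_t  sr;     /* 8-bit status register */
} ya326502_regs;

/* View the four 8-bit stack pointers as one 32-bit value.
   ASSUMPTION: SP0 sits in the low byte and SP3 in the high byte. */
static uint32_t packed_sp(const ya326502_regs *r) {
    return (uint32_t)r->sp[0]
         | (uint32_t)r->sp[1] << 8
         | (uint32_t)r->sp[2] << 16
         | (uint32_t)r->sp[3] << 24;
}
```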
There are also 4 byte-addressable “fast page” areas that take up the first kilobyte of memory (4 x 256 bytes).
The stacks are byte addressable, so the 4 stacks take up the second kilobyte of memory (4 x 256 bytes).
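The resulting memory map can be sketched as a pair of address calculations. This assumes the fast pages and stacks are laid out in index order (page 0 at 0x000, stack 0 at 0x400); the function names are hypothetical.

```c
#include <stdint.h>

#define FAST_PAGE_BASE 0x0000u  /* first kilobyte: 4 x 256-byte fast pages */
#define STACK_BASE     0x0400u  /* second kilobyte: 4 x 256-byte stacks */
#define BANK_SIZE      256u

/* Physical address of byte `offset` within fast page `page` (0..3). */
static uint32_t fast_page_addr(unsigned page, uint8_t offset) {
    return FAST_PAGE_BASE + (page & 3u) * BANK_SIZE + offset;
}

/* Physical address that 8-bit stack pointer `sp` refers to in stack `stack` (0..3). */
static uint32_t stack_addr(unsigned stack, uint8_t sp) {
    return STACK_BASE + (stack & 3u) * BANK_SIZE + sp;
}
```

With this layout the fast pages span 0x000 to 0x3FF and the stacks span 0x400 to 0x7FF, so the whole on-die area is 2K.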
Byte addressing is very important to the embedded market and makes string handling easier.
Instructions are all 16 bits long plus an optional operand, giving 16, 24, 32, or 48 bits in total. Single operands or offsets only. If you run a 16- or 32-bit operation on the stack or a "fast page", it operates on the word or longword starting at the byte referenced.
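To make the lengths concrete: a 16-bit instruction word followed by a 0, 8, 16, or 32-bit operand gives the four total sizes. Below is a small sketch of that arithmetic plus a word/longword load from a byte address. The little-endian byte order is an assumption carried over from the 6502 and is not stated above; what happens when a longword runs off the end of a 256-byte page is also left open here.

```c
#include <stdint.h>

/* Operand sizes in bytes; the enum names are illustrative. */
typedef enum {
    OPERAND_NONE = 0,
    OPERAND_8    = 1,
    OPERAND_16   = 2,
    OPERAND_32   = 4
} operand_bytes;

/* Total instruction length: 2 bytes of instruction word plus the operand. */
static unsigned instruction_bytes(operand_bytes operand) {
    return 2u + (unsigned)operand;
}

/* Load a 1-, 2-, or 4-byte value starting at byte address `addr`.
   ASSUMPTION: little-endian, as on the original 6502.
   The caller must keep addr + width_bytes within the memory array. */
static uint32_t load_le(const uint8_t *mem, uint32_t addr, unsigned width_bytes) {
    uint32_t value = 0;
    for (unsigned i = 0; i < width_bytes; i++) {
        value |= (uint32_t)mem[addr + i] << (8u * i);
    }
    return value;
}
```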
****
Why not have a single, flat 32-bit stack? First of all, this is boring. Secondly, the stack operations will be slow without a lot of cache, because you have to go out to main memory. I would like to run this at a few hundred MHz. My design has 1K of stack, which can be trivially included on-die. Same with the 4 “fast pages.”
Having 4 small stacks also allows you to easily implement threaded languages. Chuck Moore would think 256-deep stacks are luxurious.
In fact if I were to add instructions, I might include some that work on stack pairs explicitly as data and return stacks.
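Here is a minimal sketch of how a Forth-style data/return stack pair might sit on two of the four stacks. Everything in it is an assumption for illustration: stack 0 as the data stack, stack 1 as the return stack, stacks growing downward with post-decrement as on the 6502, and 32-bit cells stored little-endian.

```c
#include <stdint.h>

#define DATA_STACK   0u  /* ASSUMPTION: stack 0 holds data */
#define RETURN_STACK 1u  /* ASSUMPTION: stack 1 holds return addresses */

typedef struct {
    uint8_t mem[4][256];  /* the four 256-byte on-die stacks */
    uint8_t sp[4];        /* SP0..SP3 */
} stacks;

/* Push one byte: store, then decrement (the 8-bit pointer wraps at 0). */
static void push_byte(stacks *s, unsigned n, uint8_t v) {
    s->mem[n][s->sp[n]] = v;
    s->sp[n]--;
}

/* Pull one byte: increment, then load. */
static uint8_t pull_byte(stacks *s, unsigned n) {
    s->sp[n]++;
    return s->mem[n][s->sp[n]];
}

/* Push a 32-bit cell, high byte first, so it ends up little-endian in memory. */
static void push_cell(stacks *s, unsigned n, uint32_t v) {
    for (int shift = 24; shift >= 0; shift -= 8) {
        push_byte(s, n, (uint8_t)(v >> shift));
    }
}

/* Pull a 32-bit cell, low byte first. */
static uint32_t pull_cell(stacks *s, unsigned n) {
    uint32_t v = 0;
    for (unsigned shift = 0; shift < 32; shift += 8) {
        v |= (uint32_t)pull_byte(s, n) << shift;
    }
    return v;
}

/* Forth >R : move the top data cell to the return stack. */
static void to_r(stacks *s) {
    push_cell(s, RETURN_STACK, pull_cell(s, DATA_STACK));
}

/* Forth R> : move the top return cell back to the data stack. */
static void r_from(stacks *s) {
    push_cell(s, DATA_STACK, pull_cell(s, RETURN_STACK));
}
```

A dedicated stack-pair instruction would collapse to_r and r_from into single opcodes.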
Having 4 “fast pages” makes implementing a C compiler much easier.
Having 4 small stacks and 4 “fast pages” also makes “small multitasking” easy to implement, allowing you to run a couple of tasks concurrently without having to swap in and out of memory.
Finally, I can envision the processor also having what I would call a “Fast interrupt” mode. In this mode only 1 stack and 1 “fast page” are exposed to the programmer. In the event of an interrupt that requires a context switch, the other stacks and “fast pages” would allow you to go three levels deep on context switching without going out to main memory.
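The nesting behaviour can be sketched as a simple bank counter. This is only a model of the idea, assuming the banks are used in order 0, 1, 2, 3; the names and the "spill" return value are hypothetical.

```c
#define NUM_BANKS 4u  /* four stack / fast page pairs */

typedef struct {
    unsigned active_bank;  /* the one stack / fast page pair currently exposed */
} fast_irq_state;

/* Enter an interrupt that needs a context switch.
   Returns 1 if a spare on-die bank was available, 0 if all four are in use
   and the context would have to go out to main memory instead. */
static int enter_interrupt(fast_irq_state *s) {
    if (s->active_bank + 1u >= NUM_BANKS) {
        return 0;
    }
    s->active_bank++;
    return 1;
}

/* Return from interrupt: fall back to the previous bank, if any. */
static void return_from_interrupt(fast_irq_state *s) {
    if (s->active_bank > 0u) {
        s->active_bank--;
    }
}
```

Starting from bank 0, three nested interrupts succeed on-die and the fourth reports a spill.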
If the 2K of on-die stack/fast page memory had a wide memory bus (64 or 128 bits), and you reserved some fast page space to save the registers, you could switch context very, very quickly.
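Some back-of-the-envelope arithmetic on that, assuming every register in the list gets saved (A, X, Y, the packed stack pointers, PC, and SR) and that each bus-wide write moves a full bus width of data. Neither assumption is fixed in the design yet.

```c
/* Bytes of register state: A, X, Y, SP0..SP3, PC at 4 bytes each, plus SR. */
enum { SAVED_BYTES = 4 + 4 + 4 + 4 + 4 + 1 };

/* Number of bus-wide transfers needed to save the full register set. */
static unsigned save_transfers(unsigned bus_bits) {
    unsigned bus_bytes = bus_bits / 8u;
    return ((unsigned)SAVED_BYTES + bus_bytes - 1u) / bus_bytes;  /* round up */
}
```

That comes to 21 bytes of state, so 3 transfers on a 64-bit bus or 2 on a 128-bit bus.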
****
Of course an MMU could be added later which could isolate stacks between kernel and user space programs. It could also remap the 4 stacks and "fast pages" anywhere in memory, but then they aren’t as fast anymore. I really dislike it when the 65XX starts to look too much like just another “large system” processor.
The MMU could also prevent one program from overwriting the stack and fast page of another program, or decide when programs can share a fast page, etc. I think in this regard a simple MMU might be very useful. Sharing access to a direct page can be a great way to pass data between programs.
****
I haven’t decided on how to tell the processor to switch stacks and fast pages. I could have addressing modes for each stack and fast page. With 16-bit instructions there are more than enough opcodes for this approach, and a half-decent assembler with simple mnemonics would make it bone simple. This is what I am leaning toward.
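To show there is room for this, here is one purely hypothetical way to carve up the 16-bit instruction word: 8 bits of operation, one bit choosing stack vs. fast page, and two bits choosing which of the four. None of these field positions are decided; they are placeholders.

```c
#include <stdint.h>

typedef enum { MODE_FAST_PAGE = 0, MODE_STACK = 1 } bank_mode;

/* HYPOTHETICAL layout: bits 15..8 operation, bit 2 mode, bits 1..0 bank number. */
static uint16_t encode(uint8_t operation, bank_mode mode, unsigned bank) {
    return (uint16_t)(((unsigned)operation << 8) | ((unsigned)mode << 2) | (bank & 3u));
}

static uint8_t decode_operation(uint16_t word) {
    return (uint8_t)(word >> 8);
}

static bank_mode decode_mode(uint16_t word) {
    return (bank_mode)((word >> 2) & 1u);
}

static unsigned decode_bank(uint16_t word) {
    return word & 3u;
}
```

An assembler could then expose the bank as part of the mnemonic or operand syntax, so the programmer never touches the bits directly.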
Another approach would be to flip some bits in the status register, but I don’t like adding more state to the processor.
****
What do folks think? Would this be an interesting design to pursue?
I think I’ve cooked up something that would be fun to program in assembly, easy to target with C and Forth, and would still fit very well into the embedded market where 65XX currently lives.