Lately I've begun playing with a CPU simulator found here:
http://www.cs.colby.edu/djskrien/CPUSim/
You can create microcode fragments that can be used to build higher-level instructions. There's a simple built-in assembler to let you test those instructions, plus a debugger that lets you see why your microcode is not behaving as you expect. The hardest thing for me to get used to is the simulator's insistence on numbering the bits in a register left-to-right, starting from zero. I'm used to bit zero being on the right, not the left.
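For anyone else tripped up by the bit numbering, here is a minimal Python sketch of the conversion between the two conventions. It assumes a 16-bit register and that the simulator's bit 0 is the most significant bit; the function names are made up for illustration and are not part of CPUSim.

```python
WIDTH = 16  # register width, assumed for illustration


def left_index_to_right(i, width=WIDTH):
    """Convert a left-to-right bit index (0 = leftmost bit) into the
    conventional right-to-left index (0 = rightmost bit)."""
    if not 0 <= i < width:
        raise ValueError("bit index out of range")
    return width - 1 - i


def get_bit_left(value, i, width=WIDTH):
    """Read bit i of value, counting from the left."""
    return (value >> left_index_to_right(i, width)) & 1
```

With these assumptions, left-numbered bit 0 of a 16-bit register is what I would normally call bit 15 (the sign bit), and left-numbered bit 15 is the usual bit 0.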
I've been toying with a model of a 16-bit address space CPU with all 16-bit registers. Sort of like the 65Org32, except shrunk down considerably in memory space.
Some observations so far:
- TSX, LDA absolute,X can get any value off the stack wherever it is in memory
- TSX, LDA (indirect,X) can get any value pointed to by a stack entry wherever it is in memory, i.e., (indirect,X) actually becomes useful
- once you have the microcode for (indirect,X) and (indirect),Y, it's not that hard to cobble together (indirect,X),Y, and with TSX, it's possible to use any stack entry as a pointer for (indirect),Y. This is like the 65816's (indirect,S),Y with an extra TSX but also re-use of existing microcode
- there's no need for zero page or zero page,X addressing, so I replaced those with (indirect) (i.e., non-indexed indirect) and (indirect,X),Y. Still only eight addressing modes
- I haven't implemented it yet, but I'm wondering if LEA (load effective program counter relative address in accumulator) might be more useful than the 65816's PER (push effective program counter relative address on stack). The other two 65816 instructions - PEA and PEI - seem at this point like shortcuts to avoid using the accumulator to load an immediate address and the contents of a memory address, respectively (although perhaps the fact that the pushes are always 16 bits and the registers are not always 16 bits might have something to do with those decisions - something that would not matter in the case of the 65Sim16)
- I haven't figured out a way to do BCD arithmetic. I'm not certain it's possible using this simulator, but I'm not certain it's impossible, either
- there doesn't seem to be a way to get the simulator to simulate external interrupts, but I think I see a way to implement BRK, at least
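To make the addressing-mode observations concrete, here is a rough Python sketch of the effective-address calculations. It is not CPUSim microcode. It assumes every memory cell holds one 16-bit word, so a pointer fits in a single cell, and that X already holds a copy of the stack pointer. The function names and the sample addresses are invented for the example.

```python
MASK = 0xFFFF  # 16-bit addresses and 16-bit memory cells (assumed)


def ea_absolute_x(base, x):
    """absolute,X: base address plus X."""
    return (base + x) & MASK


def ea_indirect_x(mem, base, x):
    """(indirect,X): index first, then fetch the pointer."""
    return mem[ea_absolute_x(base, x)]


def ea_indirect_y(mem, base, y):
    """(indirect),Y: fetch the pointer, then add Y."""
    return (mem[base & MASK] + y) & MASK


def ea_indirect_x_y(mem, base, x, y):
    """(indirect,X),Y: the (indirect,X) fetch followed by the Y add."""
    return (ea_indirect_x(mem, base, x) + y) & MASK


# Hypothetical memory image: one stack entry holding a pointer to a table.
mem = [0] * 0x10000
sp = 0x01F0            # stack pointer; the top entry is assumed to sit at sp + 1
mem[0x01F1] = 0x4000   # stack entry: a pointer
mem[0x4000] = 0x1234   # data the pointer refers to
mem[0x4003] = 0xBEEF   # data three cells further on

x = sp                 # X holds a copy of the stack pointer
stack_value = mem[ea_absolute_x(1, x)]          # LDA 1,X
pointed_value = mem[ea_indirect_x(mem, 1, x)]   # LDA (1,X)
indexed_value = mem[ea_indirect_x_y(mem, 1, x, 3)]  # LDA (1,X),Y with Y = 3
```

The last function is just the first two steps chained together, which is the microcode re-use described in the list above.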
I think I've learned enough so far that I can scrap it and start over with a clearer idea where I'm headed. But I do have a question regarding design. How many internal (i.e., non-exposed) registers should there be?
The simulator is fairly lenient about letting you do whatever you want with any register at any time. Heck, every register can act as an ALU, an index register and a memory access register at the same time, if you want and are willing to write all the necessary microcode. That doesn't seem realistic to me.
So far I'm using one register as basically an ALU and memory access register, mainly because so many instructions set the N and Z condition flags. I wrote the microcode to do that for one register and then decided it was easier to route any instruction that set flags through that register than to duplicate the microcode for every possible path.
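Here is a small Python sketch of that idea, again not actual CPUSim microcode. The class, the `_latch` helper and the choice of instructions are my own illustration, and it assumes 16-bit registers and 16-bit memory cells. Every flag-setting instruction funnels its result through one internal register, so the N and Z logic exists in exactly one place.

```python
MASK = 0xFFFF
SIGN = 0x8000


class Cpu:
    def __init__(self):
        self.a = self.x = self.y = 0   # exposed registers
        self.mdr = 0                   # internal register used as the single flag path
        self.n = self.z = False        # condition flags
        self.mem = [0] * 0x10000

    def _latch(self, value):
        """Load the internal register and set N and Z from it."""
        self.mdr = value & MASK
        self.n = bool(self.mdr & SIGN)
        self.z = self.mdr == 0
        return self.mdr

    def lda_abs(self, addr):
        """Memory read routed through the flag path."""
        self.a = self._latch(self.mem[addr & MASK])

    def tax(self):
        """Register transfer routed through the flag path."""
        self.x = self._latch(self.a)

    def inx(self):
        """Increment routed through the flag path."""
        self.x = self._latch(self.x + 1)
```

A load, a transfer and an increment all pick up correct flags without any of them having its own flag logic, at the cost of every one of them taking a trip through the same register.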
This time I'm imagining perhaps five internal registers: memory address, memory data, ALU, shift and flagsetter. A separate flagsetter seems reasonable because memory reads and register transfers are not ALU or shift operations.
This is easy enough to get the simulator to do, but is it any more realistic than using one register (so far the memory data register) as memory read/write, ALU, shift and flagsetter?