BigEd wrote:
I suspect you might be underestimating the difficulty of boosting PC to 24 bits (and adding in a chunk of handling of 3 byte addresses). It might be informative, although I think not at all easy, to try to upgrade Arlet's simple 6502 HDL to this level. Not only would that give a measure at the level of HDL source complexity, but also after synthesis a measure at gate-count implementation.
https://github.com/lunarmobiscuit/verilog-65C2402-fsm proves otherwise. That is a working implementation of the 65C2402 as envisioned in the first post on this thread, with all the ABS[,X|Y] opcodes supporting 3-byte addresses, plus JMP (ABS[,X]), RTS, and RTI.
Total clock time was approximately 10 hours, with half of that spent fighting with Verilog, as I've not touched the language since 1999, plus one bug I found in Arlet's code, which he fixed in 2 minutes but which cost me a few hours of my time before I realized it didn't work in
https://github.com/Arlet/verilog-65C02-fsm.
I don't know how to synthesize Verilog into CMOS to compare gate counts, but comparing Arlet's 65C02 and my 65C2402 in terms of lines of code and size of the netlist:
Code:
wc -l ab.v alu.v regfile.v cpu.v;
  65C02  65C2402
     94      120  ab.v
     91       93  alu.v
     59       74  regfile.v
    684      757  cpu.v
    928     1044  TOTAL

iverilog -S -Nnetlist; wc -l netlist
   1593     1775  netlist
All these changes thus came to just 116 more lines of code (where at least 16 of those lines are new comments or commented-out debugging code that I left in). That is 12.5% more raw lines, or approx 11% more once those comment lines are discounted. In terms of netlist (which includes no comments) my design has 182 more items, which again is approx 11% larger. That feels about right given the amount of effort and extent of the changes.
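For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the growth figures from the numbers posted above. The variable and function names are mine, purely for illustration; the only inputs are the `wc -l` and netlist counts.

```python
# Line counts (wc -l) and netlist sizes (iverilog -S -Nnetlist) as posted above.
loc_65c02 = {"ab.v": 94, "alu.v": 91, "regfile.v": 59, "cpu.v": 684}
loc_65c2402 = {"ab.v": 120, "alu.v": 93, "regfile.v": 74, "cpu.v": 757}
netlist_65c02, netlist_65c2402 = 1593, 1775


def growth(old, new):
    """Return (absolute increase, percentage increase relative to old)."""
    delta = new - old
    return delta, 100.0 * delta / old


total_old = sum(loc_65c02.values())
total_new = sum(loc_65c2402.values())

loc_delta, loc_pct = growth(total_old, total_new)
net_delta, net_pct = growth(netlist_65c02, netlist_65c2402)

# Discount the (at least) 16 comment / commented-out debug lines.
_, loc_pct_net = growth(total_old, total_new - 16)

print(f"source lines: +{loc_delta} ({loc_pct:.1f}% raw, {loc_pct_net:.1f}% net of comments)")
print(f"netlist items: +{net_delta} ({net_pct:.1f}%)")
```

The netlist figure is the more comparable of the two, since it is unaffected by comments and formatting.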
From the README:
## Design goals
The main design goal is to show the possibility of a backwards-compatible 65C02 with a 24-bit
address bus, with no modes, no new flags, just two new op-codes: CPU and A24
$0F: CPU isn't necessary, but fills the A register with #$10, matching the prefix code
$1F: A24 does nothing by itself. Like in the Z80, it's a prefix code that modifies the
subsequent opcode.
When prefixed all ABS / ABS,X / ABS,Y / IND, and IND,X opcodes take a three byte address in the
subsequent three bytes. E.g. $1F $AD $EF $78 $56 = LDA $5678EF.
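
As an illustration of the prefix rule (this sketch is not from the repo, and the real decoding is done by the Verilog state machine), the operand fetch could be modelled in Python roughly as follows. The function name `decode_abs_operand` is hypothetical, and the sketch assumes the opcode it is given really does use an ABS-style addressing mode.

```python
A24_PREFIX = 0x1F  # the A24 prefix opcode described above


def decode_abs_operand(code, pc=0):
    """Decode one absolute-addressed instruction starting at code[pc].

    Returns (opcode, address, total_length_in_bytes).
    Assumption: the opcode uses an ABS-style mode; a real decoder would
    look that up per opcode instead of taking it on trust.
    """
    prefixed = code[pc] == A24_PREFIX
    if prefixed:
        pc += 1                      # skip the prefix byte
    opcode = code[pc]
    width = 3 if prefixed else 2     # address bytes that follow the opcode
    operand = bytes(code[pc + 1 : pc + 1 + width])
    if len(operand) != width:
        raise ValueError("instruction is truncated")
    address = int.from_bytes(operand, "little")  # low byte first
    length = (1 if prefixed else 0) + 1 + width
    return opcode, address, length


# $1F $AD $EF $78 $56 -> LDA $5678EF (five bytes in total)
print([hex(v) for v in decode_abs_operand(bytes([0x1F, 0xAD, 0xEF, 0x78, 0x56]))])
# $AD $EF $78 -> plain 65C02 LDA $78EF (three bytes)
print([hex(v) for v in decode_abs_operand(bytes([0xAD, 0xEF, 0x78]))])
```
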
Opcode A24 before a JMP or JSR changes those opcodes to use three bytes to specify the address,
with this 24-bit version of JSR pushing three bytes onto the stack: low, high, 3rd. The matching
24-bit RTS ($1F $60) pops three bytes off the stack low, high, and 3rd.
RTI always pops four bytes: low, high, and 3rd for the IR, then 1 byte for the flags
(But Arlet's code doesn't support IRQ or NMI, so this CPU never pushes those bytes)
The IRQ, RST, and NMI vectors are $FFFFF7/8/9, $FFFFFA/B/C, and $FFFFFD/E/F.
Without the prefix code, all opcodes are identical to the 65C02. Zero page is unchanged.
ABS and IND addressing are all two bytes. Historic code using JSR/RTS will use 2-byte/16-bit
addresses.
The only non-backward-compatible behaviors are the new interrupt vectors. A new RST handler
could simply JMP ($FFFC), presuming a copy of the historic ROM was addressable in page $FF.
A new IRQ handler could similarly JMP ($FFFE). The only issue would be legacy interrupt handlers
that assumed the return address was the top two bytes on the stack, rather than three.
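
To make the vector layout concrete, here is a small Python sketch (not from the repo) of fetching a three-byte vector. It assumes the vector bytes are stored low byte first, in the same order as the three-byte operands above; the README text lists the three addresses but does not spell out the byte order. The names `VECTORS` and `read_vector`, and the example target address, are hypothetical.

```python
# Base addresses of the three-byte vectors listed above.
VECTORS = {"IRQ": 0xFFFFF7, "RST": 0xFFFFFA, "NMI": 0xFFFFFD}


def read_vector(mem, name):
    """Fetch a 24-bit vector from mem (a mapping of address -> byte).

    Assumption: low byte at the base address, then high, then 3rd byte.
    """
    base = VECTORS[name]
    return mem[base] | (mem[base + 1] << 8) | (mem[base + 2] << 16)


# Sparse memory holding only a reset vector that points at $010200
# (an arbitrary example address).
mem = {0xFFFFFA: 0x00, 0xFFFFFB: 0x02, 0xFFFFFC: 0x01}
print(hex(read_vector(mem, "RST")))
```
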
## Changes from the original
PC (the program counter) is extended from 16-bits to 24-bits
AB (the address bus) is extended from 16-bits to 24-bits
D3 (a new data register) is added to allow loading three-byte addresses
One new decode line is added for pushing the third byte for the long JSR
A handful of new states were added to the finite state machine that processes the opcodes, in general
just one new state for handling ABS addresses, three-byte JMP/JSR, and three-byte RTS/RTI
------------
There isn't any assembler, so all the test code in ram.hex was hand-assembled. I read that Woz coded up the 6502 two characters at a time, as he had memorized all the opcodes and addressing modes. My 10-hour estimate includes doing the same, as well as running the code and checking it not just opcode by opcode but state by state to make sure it works.
Try it out yourself and let me know what you think. The README provides instructions for running the testbed.
Thank you Arlet for the code to build upon as well as the 12-hour tech support turnaround.