Meanwhile, I’ve been mulling what the next step should be in my computing activities. I need to get some software projects done, especially an operating system kernel, but also need a POC unit with more RAM to test that kernel, since one of its design features will be support for preemptive multitasking. In that regard, the initial plan is to “sandbox” processes in individual banks, which means I can simultaneously run as many processes as the system has extended banks. Later on as I continue kernel development, I will figure out how to run multiple processes in one bank.
Getting back to hardware, the next POC unit, V1.5, will be built around discrete logic, since none of the CPLD-based designs I’ve built to date has been stable, and I don’t want to burn up more time on what is, for now, a dead end. A 512KB unit is feasible with discrete logic, although it will be somewhat tricky to avoid timing troubles due to the more complex glue logic needed to prevent ROM and I/O hardware from being mirrored in extended banks (banks $01 through $07 in this case).
In POC V1.3, which is all discrete, the state of a signal called BNK0 tells the glue logic if the effective address is $00xxxx. If the expression BNK0 & A15 & A14 is true (where & is logical AND), I/O ($00C000-$00C7FF, mirrored at $00C800-$00CFFF) or ROM ($00D000-$00FFFF) is being addressed. That being the case, a wait-state at any Ø2 frequency above approximately 12.5 MHz would be required. Wait-states are generated by stretching the high phase of Ø2 for an additional clock cycle, which requires that the decision to wait-state be made well before the clock goes high. Of course, such a decision can't be made until the 65C816 has emitted A16-A23 on D0-D7 and those bits have been propagated through the transparent bank latch—a 74AC573, in POC V1.3.
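To make the decode term concrete, here is a minimal Python sketch of the V1.3 memory map as described above. The function name `classify_v13` is made up for illustration, and splitting I/O from ROM with a range comparison is an assumption; the real glue logic does this in gates, not software.

```python
def classify_v13(addr: int) -> str:
    """Classify a 24-bit effective address using the BNK0 & A15 & A14 term."""
    bank = (addr >> 16) & 0xFF      # A16-A23, as captured by the bank latch
    a15 = (addr >> 15) & 1
    a14 = (addr >> 14) & 1
    bnk0 = bank == 0                # true when the effective address is $00xxxx

    if not (bnk0 and a15 and a14):
        return "ram"                # everything outside $00C000-$00FFFF

    # Assumption: I/O and ROM are told apart by comparing the 16-bit offset.
    offset = addr & 0xFFFF
    return "io" if offset < 0xD000 else "rom"


for a in (0x00BFFF, 0x00C000, 0x00D000, 0x01C000):
    print(f"${a:06X} -> {classify_v13(a)}")
```

Only addresses in bank $00 with both A15 and A14 high land in I/O or ROM; the same offset in bank $01 is treated as RAM.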
The bank latch’s output is transformed into the BNK0 term in the BNK0 & A15 & A14 equation. In V1.3 and V1.4, BNK0 = !A16, since there is only 128KB of RAM. In V1.5 with its 512KB, the bank latch has to emit A16, A17 and A18, which means BNK0 = !(A16 | A17 | A18) (where | is logical OR). Hence the design question became one of figuring out how to generate BNK0 without using a lot of extra logic.
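The two BNK0 equations can be compared side by side in a short Python sketch. The function names are hypothetical; `bank` is the low three bits of the bank byte, with A16 as bit 0, A17 as bit 1 and A18 as bit 2.

```python
def bnk0_v13(bank: int) -> bool:
    """V1.3/V1.4 equation: BNK0 = !A16 (A16 is bit 0 of the bank byte)."""
    return not (bank & 0b001)


def bnk0_v15(bank: int) -> bool:
    """V1.5 equation: BNK0 = !(A16 | A17 | A18), i.e. a three-input NOR."""
    a16 = bank & 1
    a17 = (bank >> 1) & 1
    a18 = (bank >> 2) & 1
    return not (a16 | a17 | a18)


for bank in range(8):
    print(f"bank ${bank:02X}: !A16 = {bnk0_v13(bank)}, NOR = {bnk0_v15(bank)}")
```

If the old !A16 equation were kept in a 512KB machine, BNK0 would also be asserted in banks $02, $04 and $06, so ROM and I/O would be mirrored there. The three-input form asserts BNK0 only in bank $00.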
In considering this problem vis-à-vis V1.5, it seems the logical (!) way to generate BNK0 is with a three-input NOR gate whose inputs are connected to the outputs of the bank latch. This gate would have to be very fast, since its prop time will be added to that of the latch, whose prop time would in turn be added to the 65C816's tBAS (bank address setup time). The 74LVC1G27, which is a three-input NOR with a very small prop time (TPD is 3.5ns maximum at 5 volts), would be ideal for this application...if it weren’t for its minuscule SOT-23 package. There are those who can manually solder SOT-23 (and smaller). I am not a member of that august group.
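The series delay described above can be put into a rough budget calculator. This is only a sketch under stated assumptions: tBAS is taken as measured from the fall of Ø2, Ø2 is assumed to have a 50% duty cycle, and the function names are made up. The only figure taken from the text is the 3.5ns NOR delay; the tBAS and latch values are placeholders, not datasheet numbers, and should be replaced with real ones.

```python
def bnk0_delay_ns(t_bas: float, t_latch: float, t_nor: float) -> float:
    """Time from the fall of phi2 until BNK0 is valid: three delays in series."""
    return t_bas + t_latch + t_nor


def margin_ns(phi2_mhz: float, t_bas: float, t_latch: float, t_nor: float) -> float:
    """Slack left in the phi2 low phase (assumed 50% duty) once BNK0 is valid."""
    low_phase = 1000.0 / phi2_mhz / 2.0   # half the clock period, in ns
    return low_phase - bnk0_delay_ns(t_bas, t_latch, t_nor)


T_NOR = 3.5      # 74LVC1G27 maximum at 5 volts, per the text
T_BAS = 30.0     # placeholder only, not a datasheet value
T_LATCH = 5.0    # placeholder only, not a datasheet value

print(margin_ns(12.5, T_BAS, T_LATCH, T_NOR))
```

Whatever margin comes out still has to cover the rest of the decode logic and the setup time of the wait-state circuit, so a small positive number is not necessarily enough.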
Along with the generation of BNK0, there is wait-stating to be considered. Having taken a fresh look at clock generation, I’ve concluded that the stretching methods concocted by Jeff (74xx163 synchronous counter) and me (a 74AC109 JK flop controlling a 74AC74 D flop) are essentially equal in timing constraints, mainly in the available setup time before the rise of Ø2. Jeff’s method is a single-chip solution and is theoretically faster than mine in 74AC logic, which is the fastest logic in which a 5 volt version of the xx163 is available. However, if I can produce equivalent operation with a 74xx74, I’d have a single-chip solution like the synchronous counter, but with the possibility of using 74AHC logic, which, aside from being marginally faster than 74AC in some respects, has less-aggressive output edges that help suppress ringing.
So I've got some thinking to do.