Now that the clock-rate seems on target, it’s time to focus some attention on a little "PCB Feng Shui" - IOW, cleaning things up for good karma. Tolerances have become much tighter on the CPU, and it’s only prudent to corral any timing and signal integrity gremlins that might be lurking about.
Let’s take bus timing, for example - I was all over the map on this one, and I suppose, early in the process, I could afford to be. Prior to the pipeline implementation, “enable” control signals came rather leisurely from ROM, so drivers could hold their values stable on the bus long after the end of the cycle - effectively providing ample hold-time for free. Meanwhile, signals to latch data into registers came sharply after the clock-edge, so latching was very reliable.
Now, however, “enable” control signals are dispatched directly from the Micro-Instruction Register just as sharply as latching signals, and they quickly disturb the very bus values we are trying to capture. In at least one case, the latching signal would predictably lose this race, with fatal consequences. One obvious solution is to delay the enable signals and thereby provide a more comfortable margin for latching. It’s tempting, but there is no time to spare here - after all, the new data is needed on the bus as quickly as possible for the next operation. Any introduced delay simply hits the critical path of the following cycle. We’ve spent far too much effort hunting down nanosecond delays to start artificially introducing them now!
But other delays were also lurking on the bus unnoticed. A new cycle invariably implies a new driver on the bus, and the signals that effect the change-over arrive at largely the same time. The result is nasty “transient collisions” during the transition, as drivers push against each other until one finally abates and the other is allowed to proceed unfettered. Now, I know this kind of “bus contention” is commonly tolerated without consequence (after all, the whole affair is usually over in a flash, and if brief enough, its effects are quite benign). But the fact remains that new data necessarily takes longer to stabilize on the bus under these difficult conditions than otherwise - I suspect materially so. Margins are thin, so I felt “waiting out the storm”, so to speak, was not the best approach if it could be avoided.
Thankfully, Dr Jefyll suggested an effective solution as follows: all drivers which share a common bus should be output-enabled for half a cycle only, such that every bus will switch from having a single driver active to having no driver at all for a half-cycle. This is what a 6502 does externally on its data bus (which it drives only in phase 2). In this case, the approach works well for the W bus at the outputs of the ALU (since it too needs to be active only in phase 2, when the ALU has finished its work). But other buses, such as R and B at the inputs of the ALU, need to supply valid data early in phase 1 and hold it throughout the cycle. Initially, I simply could not see how this would work, until Jeff explained that bus capacitance would hold the data on the bus during the “dead period”. Once that penny dropped, the full effect of the mechanism was clear: drivers are always enabled onto a quiet bus, the pesky transient collisions are gone, transitions are smooth and fast, and latching is once again very reliable. Wonderful! As I said to Jeff, the learning never stops.
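Just to make the mechanism concrete, here’s a little toy model of the half-cycle scheme in software - purely an illustration on my part, with made-up names (the real thing is charge on the bus, of course, not code):

# Toy model only: a bus whose parasitic capacitance "remembers" the last value
# driven onto it, so a driver enabled for half a cycle still leaves valid data
# behind during the dead period.
class Bus:
    def __init__(self):
        self.value = None      # charge held on the bus (last driven value)
        self.driven = False    # is any driver currently output-enabled?

    def drive(self, value):
        # Two drivers active at once would be exactly the transient
        # collision we are trying to avoid.
        assert not self.driven, "bus contention!"
        self.driven = True
        self.value = value

    def release(self):
        self.driven = False    # driver lets go; capacitance holds self.value

r_bus = Bus()
r_bus.drive(0x42)              # phase 1: a register drives a quiet bus
r_bus.release()                # phase 2: no driver at all (the "dead period")
assert r_bus.value == 0x42     # ... yet the ALU still sees valid data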
Now this scheme suffers from the unfortunate consequence that the CPU cannot be paused: because certain drivers are disabled at any one time, those buses would drain their charge and data would be lost - as happens on the NMOS 6502, from what I understand. To protect the buses, either the CPU must not be allowed to stop, or bus-hold ICs need to be installed (e.g., 74ACT1071 or 74ACT1073). I chose the latter option, simply because it seemed the more complete solution and it was dead-easy to implement.
All that remained then was to figure out which buses would be made active at which times - and that proved simple: drivers taking data from registers to other logic early in the cycle are enabled during phase 1 only; meanwhile, drivers taking data to be latched into registers at the end of the cycle are enabled in phase 2 only. In the paths shown below, drivers to the R bus, B bus and ADL/ADH buses are enabled in phase 1, while drivers to the W bus are enabled in phase 2 (just to be clear, the external address bus, A.BUS below, is left active throughout the cycle). A quick sketch of the gating follows the paths.
REGISTER -> R.BUS -> ALU -> W.BUS -> REGISTER
REGISTER -> B.BUS -> ALU -> W.BUS -> REGISTER
REGISTER -> ADL/ADH -> A.BUS
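Here’s that sketch of the gating, with hypothetical names of my own - in the actual hardware these are just the select lines from the Micro-Instruction Register ANDed with the appropriate phase:

# Illustrative only: each driver's output-enable is its select line gated by
# the phase in which that bus is allowed to be driven.
def driver_enables(sel_r, sel_b, sel_adl_adh, sel_w, phi1, phi2):
    return {
        "R.BUS":   sel_r       and phi1,   # source data for the ALU: phase 1
        "B.BUS":   sel_b       and phi1,   # source data for the ALU: phase 1
        "ADL/ADH": sel_adl_adh and phi1,   # address out to A.BUS: phase 1
        "W.BUS":   sel_w       and phi2,   # ALU result to be latched: phase 2
    }

# Phase 1 of an ALU cycle: R and B are driven, W stays quiet until phase 2.
print(driver_enables(True, True, False, True, phi1=True, phi2=False))
# {'R.BUS': True, 'B.BUS': True, 'ADL/ADH': False, 'W.BUS': False}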
Incidentally, I discovered that I was incorrectly driving the external Data Bus in phase 1 during write cycles - essentially causing collisions when data reverses direction between CPU and memory (Dr. Jefyll explains this phenomenon in a different context here). I doubt this would have caused significant trouble, but it was easy to sort out. The fix was, as with all other drivers, to simply use either PHI1 or PHI2 to gate the enable signals as appropriate. It’s a simple and elegant solution to a potentially nasty problem, and I’m glad to have implemented it.
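In the same spirit as the sketch above, the data bus fix boils down to this (hypothetical names again):

# The CPU drives the external data bus only during phase 2 of a write cycle;
# phase 1 is left free so the bus can turn around without a fight.
def data_bus_output_enable(write_cycle, phi2):
    return write_cycle and phi2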
Drass