Over in DrJeffyl's thread "RDY vs CLOCK STRETCHING. Includes 2 very simple circuits", I posted a potential circuit for supporting clock stretching to match a slower, derived clock, and he asked that I make a separate thread for it as it's a different goal. So I'll elaborate more on that topic here and share my circuit, and I'm interested if anyone has other such circuits to share or examples of where it is useful.
One of Jeff's circuits used a '163 counter to perform the stretching, and also provided a regular unstretched clock output for driving VIAs. However, in some situations you may want this consistent clock to actually be slower than the normal CPU clock, and then you'd want the stretching behaviour to specifically bring the CPU clock in sync with the slower clock on certain cycles. For example, we may be using parts like 6522 VIAs that require a consistent clock signal like PHI2, but which don't support the speeds that we are running the CPU at. Or perhaps as in Paganini's case recently, we are struggling to get all of the address decoding done far enough before the normal rising edge of PHI2; and especially if some of the devices are 6522 VIAs, options are limited. This approach should support both those cases.
So the goal is to run PHI2 at a high frequency, e.g. 16MHz, and also generate a slower clock from the same source - let's call it SLOWCLK - at for example half of that frequency, 8MHz. When the CPU performs an operation on a slow device, we need to bring the two clocks in sync. In particular we want PHI2's falling edge to coincide with SLOWCLK's, but it's also beneficial if there's a healthy amount of SLOWCLK being low first, so that slow devices have plenty of time to react to being selected and the state of the address bus, and so that fine-grained address decoding has plenty of time to take place.
In normal operation then, PHI2 and SLOWCLK will be ticking along, but from time to time PHI2 will be held high to synchronise them:
Code:
WSE:     0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0
PHI2:    1 0 1 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
Here WSE is an input signal that we change in between clock ticks - it means wait-state enable (terminology taken from BDD's POC v1.4) and is an output of address decoding. It's generally low, indicating no need to stretch the clock, and PHI2 runs at its usual rate; but if WSE goes high before the usual falling edge of PHI2 then PHI2 is held high and the current cycle is extended to include a full low phase of SLOWCLK followed by a high phase.
Now we can connect SLOWCLK to e.g. the PHI2 input of a VIA. But we also need to make sure we activate the VIA's chip-select inputs at the right times. We can't do that purely based on the addresses coming from the CPU because that could overlap with the wrong SLOWCLK cycle. So we also need to generate a SLOWCS signal to gate these chip-selects with, like so:
Code:
WSE:     0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0
PHI2:    1 0 1 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS:  0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0
We can connect SLOWCS to the VIA's active-high select line, or just include it in our regular address decoding process, and the VIA will only be activated during the correct slow clock cycle.
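The requirements on SLOWCS can be written down as a small trace checker. This is only a sketch in Python, not part of the circuit or of the Arduino test program; the function names `parse` and `check_slow_cycle` are made up for illustration, and it assumes one column per CLK tick with SLOWCLK at a quarter of the CLK rate, as in the diagram above.

```python
def parse(trace: str) -> list[int]:
    """Turn a row such as '0 1 1 0' into a list of ints."""
    return [int(v) for v in trace.split()]


def check_slow_cycle(phi2: list[int], slowclk: list[int], slowcs: list[int]) -> list[str]:
    """Return a list of problems found in one set of traces (empty if none)."""
    problems = []
    n = len(slowcs)
    i = 0
    while i < n:
        if not slowcs[i]:
            i += 1
            continue
        # Find the end of this run of SLOWCS high.
        j = i
        while j < n and slowcs[j]:
            j += 1
        # PHI2 must be held high for the whole SLOWCS window.
        if not all(phi2[i:j]):
            problems.append(f"ticks {i}-{j - 1}: PHI2 low while SLOWCS high")
        # The window must span a full low phase, then a full high phase, of SLOWCLK.
        if slowclk[i:j] != [0, 0, 1, 1]:
            problems.append(f"ticks {i}-{j - 1}: SLOWCLK is not low-then-high")
        # PHI2 and SLOWCLK must fall together when the window ends.
        if j < n and (phi2[j] or slowclk[j]):
            problems.append(f"tick {j}: PHI2 and SLOWCLK do not fall together")
        i = j
    return problems


# The three rows from the diagram above.
phi2 = parse("1 0 1 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 1 0")
slowclk = parse("0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0")
slowcs = parse("0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0")
print(check_slow_cycle(phi2, slowclk, slowcs))
```

For the rows in the diagram the list of problems comes back empty. Shifting SLOWCS one tick earlier makes the window overlap the tail of the previous SLOWCLK cycle, which is the situation the gating is meant to prevent, and the checker reports it.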
So that's the goal. This would be pretty easy to do in a PLD, and lots of people happily and successfully use PLDs for this sort of thing, but other people have their own different reasons for avoiding PLDs in clock generation, and the aim here is to cater for that, whatever the reason.
So here's the circuit I've designed to do it, extended from DrJeffyl's circuits:
Attachment:
File comment: Simple clock stretching to match a derived, half-rate clock signal
clockstretch_2x163_slowclk2.png
Note that in this circuit WSE has been inverted - so the signal feeding into the NAND gate is high if the cycle should be a fast cycle, and low if it should be slow.
As with DrJeffyl's circuit, the input clock oscillator is double the frequency that PHI2 generally ends up being (e.g. CLK=32MHz, PHI2=16MHz); but here we also have SLOWCLK which is half of the regular PHI2 frequency (e.g. SLOWCLK=8MHz). I have some other circuits for greater divisors, in case you need the slow clock to be even slower, but I'll post those separately.
The main change in this circuit - from DrJeffyl's 163-based circuit - is that counter U2 that generates PHI2 has its low bits loaded from counter U1, which generates SLOWCLK. These bits can be interpreted as a negative number which says how many clock ticks are needed before the falling edge of SLOWCLK - e.g. if they were both set, that would represent -1 in two's complement, imagining higher bits are all set, meaning that one more tick is needed before the falling edge of SLOWCLK - which is indeed the case because on the next tick U1's low bits will roll over to 00. However, as both U1 and U2 are driven by the same input clock, U2 is going to lag a clock cycle behind U1. To remedy that, the relevant output of U1 is passed through a separate D flip-flop (U4A) to form SLOWCLK before being used in the rest of the circuit, bringing it back into sync.
Whenever PHI2 is low, U2 will be reloaded with its high bit set, its next lower bit clear, and the lowest two bits taken from U1. This means that U2 will tick for 1-4 more cycles until a falling edge of SLOWCLK, then 4 more cycles, all with its top output bit (PHI2) high, and then wrap around to zero causing PHI2 to fall in sync with another SLOWCLK falling edge in at least 4 CLK ticks' time. These 4 guaranteed ticks ensure that there will definitely be a low and then high phase of SLOWCLK in the meantime.
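The arithmetic behind the load value can be tabulated with a few lines of Python. This is just an illustration of the two's complement reading described above; the helper names `to_signed4`, `u2_load_value` and `stretch_table` are made up for the example, and the tick counts are measured from the moment U2 holds the loaded value.

```python
def to_signed4(value: int) -> int:
    """Interpret a 4-bit value as a two's complement number."""
    return value - 16 if value & 0b1000 else value


def u2_load_value(u1_low_bits: int) -> int:
    """Value loaded into U2: bit 3 set, bit 2 clear, low two bits from U1."""
    return 0b1000 | (u1_low_bits & 0b11)


def stretch_table() -> list[tuple[int, int, int, int]]:
    """One row per value of U1's low bits:
    (low bits, signed load value, ticks to SLOWCLK falling edge, ticks with PHI2 high)."""
    rows = []
    for low in range(4):
        load = u2_load_value(low)
        ticks_to_edge = 4 - low      # ticks until U2's low two bits roll over to 00
        ticks_high = 16 - load       # ticks until U2 wraps to zero and PHI2 falls
        rows.append((low, to_signed4(load), ticks_to_edge, ticks_high))
    return rows


for low, signed, to_edge, high in stretch_table():
    print(f"U1 low bits {low:02b}: load {signed}, "
          f"{to_edge} tick(s) to SLOWCLK falling edge, PHI2 high for {high} ticks")
```

The four rows give load values of -8, -7, -6 and -5, so PHI2 stays high for 8, 7, 6 or 5 CLK ticks, always the 1-4 ticks up to the SLOWCLK falling edge plus the 4 guaranteed ticks.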
U2's second-highest bit is initialised clear, but will get set after the next falling edge of SLOWCLK, and then stay set until PHI2 falls. This is exactly the signal we need for SLOWCS - it is high during the specific low-then-high phases of SLOWCLK, and low at all other times.
The final extension to DrJeffyl's circuit is the input to U2's /MR pin. If this is high, then U2 will tick through a stretched cycle as described above; but if it is low then U2 will get cleared to zero on the next CLK tick, causing PHI2 to go low. We use this to curtail the stretched cycles if PHI2 is high but the incoming WSE signal does not request stretching. This means that unstretched cycles only have one CLK tick in the high state before going low again.
Low phases of PHI2 are always just a single tick because U2's /PE pin is driven by PHI2, so when PHI2 is low we always load U2 with its top bit set on the next cycle.
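For anyone who wants to experiment without building the circuit, here is a rough behavioural model in Python. It is reconstructed from the description above rather than from the schematic, so treat the details as assumptions: U2's clear is taken to be active when PHI2 is high and WSE is low (the NAND of PHI2 and inverted WSE), U4A is taken to register bit 1 of U1, and the names `State`, `tick` and `simulate` are invented for the sketch. WSE here is active-high, and each value is the level present before the corresponding CLK edge.

```python
from dataclasses import dataclass


@dataclass
class State:
    u1: int = 0b0001   # free-running counter U1 (only the low two bits matter)
    u2: int = 0b0000   # counter U2: bit 3 is PHI2, bit 2 is SLOWCS
    slowclk: int = 0   # output of flip-flop U4A


def tick(s: State, wse: int) -> State:
    """Advance the model by one rising edge of CLK."""
    phi2 = (s.u2 >> 3) & 1
    if phi2 and not wse:
        # Assumed /MR gating: PHI2 high without a stretch request clears U2.
        u2 = 0
    elif not phi2:
        # /PE is driven by PHI2, so a low PHI2 loads 10xx with xx taken from U1.
        u2 = 0b1000 | (s.u1 & 0b11)
    else:
        # PHI2 high with WSE asserted: keep counting towards the wrap to zero.
        u2 = (s.u2 + 1) & 0xF
    # U4A delays U1's bit 1 by one tick, lining SLOWCLK up with U2.
    return State(u1=(s.u1 + 1) & 0xF, u2=u2, slowclk=(s.u1 >> 1) & 1)


def simulate(wse_values, start=None):
    """Run one tick per WSE value and return the traces as space-separated strings."""
    s = State() if start is None else start
    rows = {"PHI2": [], "SLOWCLK": [], "SLOWCS": []}
    for wse in wse_values:
        s = tick(s, wse)
        rows["PHI2"].append((s.u2 >> 3) & 1)
        rows["SLOWCLK"].append(s.slowclk)
        rows["SLOWCS"].append((s.u2 >> 2) & 1)
    return {name: " ".join(str(v) for v in values) for name, values in rows.items()}


# A stretched cycle followed by two fast cycles, starting with PHI2 and SLOWCLK in phase.
for name, row in simulate([1] * 8 + [0] * 4).items():
    print(f"{name + ':':<9}{row}")
```

The default start state is chosen so that PHI2 and SLOWCLK have just fallen together. With that start, the model reproduces the "slow cycle with delay 0" rows of the test output later in this post; starting from `State(u1=1, u2=0b1000, slowclk=0)` gives the out-of-phase cases.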
I hope this explanation makes sense - there's quite a lot going on in a rather small circuit. I have built this on a breadboard for testing, and wired it up to an Arduino Mega to run a full test suite of all the various phases that PHI2 and SLOWCLK could be in. I'll post the output of that test program below, but here are some waveform diagrams to illustrate:
Attachment:
File comment: Waveforms for critical signals in the circuit, when WSE is asserted near a rising edge of PHI2 at various points during SLOWCLK
clockstretch_2x163_slowclk_waveforms.png
This shows the critical signals in the circuit. It includes four separate PHI2 lines, covering the various phase relationships. In each case assume WSE was raised around the rising edge of PHI2 in one of these lines - it could occur before or after it without affecting the result. In the first PHI2 trace, this occurs at the same time as a falling edge of SLOWCLK, and we load U2 with the value -8, and wait in fact for two full cycles of SLOWCLK before continuing. This ensures we get a full low phase of SLOWCLK with SLOWCS high, even if WSE was late arriving. The next trace down shows the response when PHI2's rising edge was one CLK tick further in, and in this case U2 gets loaded with -7, and so on. The lower traces in this diagram show the states of the outputs of U2 for the PHI2(-8) case in particular.
Some of these phases are actually going to be quite rare - they'd only occur on initial startup, as once PHI2 has been synchronised with SLOWCLK once, it will only change phase by a multiple of two CLK cycles at a time. But it's still important to handle those cases, otherwise it would never get in sync, so the Arduino test program has a way to force them to happen and test the result. The Arduino code also tests WSE rising or falling on both sides of the leading edge of PHI2, to make sure both behave the same way.
I have designed this circuit into a simple 6502-based computer to test it out further, but not built it yet:
Attachment:
File comment: Full 6502-based computer schematic based on this clock stretching technique
clockstretch_2x163_6502basedcomputer.png
I'm expecting this to be able to run into the 30+MHz range, as the CPU/RAM core is the same as in my fast PDIP design and there aren't too many I/O devices to bog things down. The address decoding is also quite coarse but it should be fine to tighten that up and add more RAM in the upper half of the address space. It will make WSE more complex to determine, but it doesn't really matter much if that is slow, so long as it's ready by the end of the usual PHI2-high phase.
Here's the output of the test program, showing the critical signal states in each test case, which might illustrate how these things work together. Note that in this output WSE is active-high again, not inverted.
Code:
Testing normal fast cycles
WSE:0 0 0 0 0 0 0 0
PHI2: 1 0 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 0 0 0 0 0
Testing slow cycle with delay 0
WSE:1 1 1 1 1 1 1 1 0 0 0 0
PHI2: 1 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle with delay 1
WSE:0 1 1 1 1 1 1 1 0 0 0 0
PHI2: 1 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle with delay 2
WSE:0 0 1 1 1 1 1 1 0 0 0 0
PHI2: 1 0 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle with delay 3
WSE:0 0 0 1 1 1 1 1 0 0 0 0
PHI2: 1 0 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing two slow cycles in a row
WSE:1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
PHI2: 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0
Testing delayed fall of SLOW after slow cycle
WSE:1 1 1 1 1 1 1 1 1 0 0 0
PHI2: 1 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing putting PHI2 and SLOWCLK out of phase
WSE:0 0 0 0
PHI2: 0 1 0 1
SLOWCLK: 0 1 1 0
SLOWCS: 0 0 0 0
Testing slow cycle from out-of-phase with delay 0
WSE:1 1 1 1 1 1 1 1 0 0 0 0
PHI2: 1 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle from out-of-phase with delay 1
WSE:0 1 1 1 1 1 1 1 0 0 0 0
PHI2: 0 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle from out-of-phase with delay 2
WSE:0 0 1 1 1 1 1 1 0 0 0 0
PHI2: 0 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 0
Testing slow cycle from out-of-phase with delay 3
WSE:0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0
PHI2: 0 1 0 1 1 1 1 1 1 1 1 0 1 0 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0
Testing two slow cycles in a row from out-of-phase
WSE:0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
PHI2: 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0
SLOWCLK: 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
SLOWCS: 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1 0