Re: Generating Wait-States with Clock Trickery
Posted: Tue Aug 07, 2018 9:28 pm
by BigDumbDinosaur
Well, the W65C265S does this, but also, potentially, a bunch of other stuff.
A real timing quagmire. If I were going the microcontroller route it would not be with a 65C265.
Generating Wait-States with Clock Trickery: From Theory to P
Posted: Sat Apr 25, 2020 6:27 am
by BigDumbDinosaur
Well, here it is nearly two years since anything was posted to this topic. With COVID-19 keeping most of us at home and out of work (or working less), there are opportunities to monkey with our computing gadgets. So it is with moi.
To date, I hadn't really had an opportunity to test the theory I presented earlier in this topic. However, with all this downtime on my hands, plus the next iteration of the POC series being on the drawing board, I decided it was time to build a gadget to prove or disprove my ideas. Here's the schematic for it:

- Clock Stretch Wait-State Test Circuit
This is a simple circuit whose sole purpose in life is to demonstrate a practical way to stretch the clock for slow hardware. To assist in connecting it to the logic analyzer, J2 is an 8-pin header for picking off signals that are of interest.
This contraption generates three clock signal outputs, one of which is the "global" clock (GCLK) that runs at half the speed of the oscillator. Two out-of-phase clocks, Ø1 and Ø2, are derived from one section of the 74AC74 flip-flop (U3). Ø1 and Ø2 would be needed in a 65C816 system to drive the MPU, as well as gate a bank latch and data bus transceiver. Ø2 is in phase with GCLK and Ø1 is 180° out of phase with Ø2.
That little gadget at the left end of the schematic is a push button. When pressed it will momentarily pull the AC109's K input low, which will trigger the wait-state by pulling the PRE input of U3b low for one entire GCLK cycle. While PRE is low Ø2 will be continuously high and Ø1 will be continuously low, causing the high phase of that clock period to double in length. At the end of the GCLK cycle, PRE will be driven high and U3b will return to normal operation.
Here is an annotated logic analyzer trace of this circuit in action:

- Wait-State Test Results
Testing was done with a 2 MHz oscillator, which results in the three output clocks running at 1 MHz. The annotations should be self-explanatory.
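As a rough illustration of the divide-by-two relationship mentioned above, the sketch below models a D flip-flop with its inverted output fed back to D, toggling on every rising oscillator edge. That wiring is an assumption (the schematic is not reproduced in this text), and the helper names `divide_by_two` and `rising_edges` are made up for the example.

```python
def divide_by_two(osc_samples):
    """Model a D flip-flop with /Q fed back to D (assumed wiring)."""
    q = 0
    prev = 0
    out = []
    for level in osc_samples:
        if prev == 0 and level == 1:  # rising edge of the oscillator
            q ^= 1                    # output toggles once per oscillator cycle
        out.append(q)
        prev = level
    return out


def rising_edges(samples):
    """Count 0 -> 1 transitions in a list of sampled logic levels."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)


OSC_HZ = 2_000_000                 # oscillator used in the test above
osc = [0, 1] * 8                   # eight oscillator cycles, two samples each
gclk = divide_by_two(osc)

gclk_hz = OSC_HZ // 2              # 1 MHz
period_ns = 1_000_000_000 // gclk_hz

print(rising_edges(osc), rising_edges(gclk), period_ns)
```

Eight oscillator cycles in, four output cycles out: each output clock period lasts two oscillator periods, which is 1000 ns with the 2 MHz oscillator.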
In a practical application, the K input to U2a (/WSE on the annotated trace) would be connected into the glue logic in such a way that it would be pulled low when a wait-state is needed. Due to the manner in which this circuit operates, K must be pulled low before the rise of Ø2. In simple terms, chip selects of devices that are to be wait-stated must not be qualified by Ø2, else this circuit will not work.
This circuit can be used with the NMOS 6502, as well as the 65C02 and 65C816, although it is targeted to the requirements of an '816 system with extended RAM (Jeff's clock stretcher using a synchronous counter is a simpler choice for a 65(c)02 system or a 65C816 without extended RAM—it's a one-chip solution that I used in POC V1.2). Note that only the high phase of Ø2 is stretched. This should suffice for most applications.
In a system with one or more 65C22s, GCLK should be used to drive the 65C22s' Ø2 inputs. Driving them from Ø2 will result in the timers gradually slowing down with each wait-state. Also, the synchronous shift registers, if used, will "malfunction" if Ø2 is stretched. If you are using WDC's 65C21, 65C22 or 65C51 there is no need to wait-state any of them, as they are rated for 14 MHz operation.
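To put a number on the timer slowdown, here is a back-of-the-envelope sketch. It assumes each wait-state adds exactly one GCLK period to the affected Ø2 cycle (consistent with PRE being held low for one entire GCLK cycle) and that a timer counts one tick per cycle of whatever clock drives it. The function name and the example figures are hypothetical.

```python
def wall_time_us(timer_counts, wait_states, gclk_hz, timer_on_stretched_phi2):
    """Real elapsed time for a timer to count `timer_counts` ticks.

    Assumption: every wait-state makes one Ø2 cycle last two GCLK cycles,
    so a timer clocked from Ø2 sees one tick where GCLK delivers two.
    """
    gclk_period_us = 1_000_000 / gclk_hz
    if timer_on_stretched_phi2:
        gclk_cycles = timer_counts + wait_states
    else:
        gclk_cycles = timer_counts
    return gclk_cycles * gclk_period_us


# Hypothetical workload: 10,000 timer ticks with 500 wait-stated accesses
# occurring during that interval, at a 1 MHz GCLK.
on_gclk = wall_time_us(10_000, 500, 1_000_000, timer_on_stretched_phi2=False)
on_phi2 = wall_time_us(10_000, 500, 1_000_000, timer_on_stretched_phi2=True)
print(on_gclk, on_phi2)
```

Under these assumptions the Ø2-driven timer takes 10,500 µs to count what the GCLK-driven one counts in 10,000 µs, and the error grows with the number of wait-stated accesses.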
Re: Generating Wait-States with Clock Trickery
Posted: Sun May 23, 2021 7:13 am
by BigDumbDinosaur
This circuit has been put into practical application in POC V1.3.
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 12:02 pm
by fredericsegard
Hi BDD,
This circuit has been put into practical application in POC V1.3.
I implemented your circuit to try it out. Except for a minor change, as you can see in the attached diagram. I replaced the 74AC109 with a 74AC112 (because that's what I had on hand), which is functionally the same, except the K input is inverted. There's also a 74HC193 4-bit counter that serves as a clock speed selector, which I have bypassed during debugging (wiring up pin 5 of the 20MHz clock directly to pins 3 and 11 of the 74AC74).
The problem I have is this. When I hardwire WSEN low, I get the expected result of 10MHz, with a 50ns LOW and 50ns HIGH period. When I set WSEN high, I get 5MHz, but with a duty cycle of 175ns HIGH and 25ns LOW. According to your timing diagram, this shouldn't be; it should be 50%/50%. I double-checked my wiring, and everything seems to match the schematic.
Any thoughts?
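For reference, the reported measurements are at least arithmetically self-consistent. The small sketch below (the helper name `describe` is made up) converts the high and low times quoted above into a frequency and a duty cycle.

```python
def describe(high_ns, low_ns):
    """Return (frequency in MHz, duty cycle as a fraction) for one clock period."""
    period_ns = high_ns + low_ns
    freq_mhz = 1000 / period_ns
    duty = high_ns / period_ns
    return freq_mhz, duty


print(describe(50, 50))    # WSEN hardwired low
print(describe(175, 25))   # WSEN high
```

The first case works out to 10 MHz at 50% duty; the second to 5 MHz at 87.5% duty, so the frequency halves as expected and only the high/low split is off.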
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 12:14 pm
by Dr Jefyll
I replaced the 74AC109 with a 74AC112 (because that's what I had on hand), which is functionally the same, except the K input is inverted.
That's not the only difference. The clock on a '109 triggers on the rising edge, whereas that on a '112 is falling-edge triggered.
-- Jeff
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 3:13 pm
by BigDumbDinosaur
I implemented your circuit to try it out. Except for a minor change, as you can see in the attached diagram. I replaced the 74AC109 with a 74AC112 (because that's what I had on hand), which is functionally the same, except the K input is inverted.
That's not the only difference. The clock on a '109 triggers on the rising edge, whereas that on a '112 is falling-edge triggered.
As Jeff noted, the AC112 is clocked on the falling edge. That characteristic would explain what you are observing.
Both Digi-Key and Mouser stock the 74AC109 in single-piece quantities, in DIP and SOIC.
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 4:15 pm
by Proxy
So what I gathered from this thread is:
Pulling RDY Low is functionally identical to holding the CPU's PHI2 pin high, so you can say that it stretches the High Phase of the CPU's internal clock until the next falling edge when RDY is high again.
The problem with that is the fact that it's all inside the CPU, so any decoding logic that also uses PHI2 (RD/WR signals for Memory, IO, and the Bank Address Latch) will continue to run like normal since the clock is unaffected by RDY.
I ran into this same issue when doing my wait state generation for my 65C02 SBC since I also wanted to run at 20MHz.
My solution is very similar to your own, BDD, but instead of using 2 D flip-flops I only use 1 for the wait state generation and then OR the negated RDY output with PHI2 to create the creatively named PHI2_D, which automatically gets its high phase stretched whenever the RDY output is low.
The RDY output goes to the CPU's RDY pin and PHI2_D is used exclusively for the decoding logic, though as was mentioned before, RDY is just stretching the internal CPU clock, so you could throw that output away, pull the RDY pin high with a resistor, and use PHI2_D for both the decoding logic and the CPU.
The timing diagram should look something like this:
But I'm using this circuit in a CPLD, so I don't know how different the timings would be when you use discrete logic ICs. For example, the propagation delay of the OR gate could be an issue at 30MHz and above.
Also, please excuse my hand-drawn diagrams. I don't know what fancy programs you guys have to make those circuits and timing diagrams, so I just use GIMP.
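The OR-gating described in this post can be sketched as a purely combinational model: PHI2_D = PHI2 OR (NOT RDY). The sample pattern below, with RDY low for one full PHI2 cycle starting at a falling edge, is only a hypothetical example (the actual RDY timing depends on the flip-flop that generates it), and propagation delays are ignored.

```python
def phi2_d(phi2, rdy):
    """PHI2 ORed with the negated RDY, sample by sample."""
    return [p | (1 - r) for p, r in zip(phi2, rdy)]


def longest_high_run(samples):
    """Length of the longest run of consecutive 1s."""
    best = run = 0
    for s in samples:
        run = run + 1 if s else 0
        best = max(best, run)
    return best


# One sample per half cycle of PHI2.
phi2 = [0, 1, 0, 1, 0, 1, 0, 1]
rdy  = [1, 1, 0, 0, 1, 1, 1, 1]   # hypothetical: low for one full PHI2 cycle

stretched = phi2_d(phi2, rdy)
print(stretched)
print(longest_high_run(phi2), longest_high_run(stretched))
```

In this example the low half cycle that falls inside the RDY-low window is filled in, so the high phase seen by the decoding logic spans three half cycles instead of one.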
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 7:58 pm
by akohlbecker
Proxy, I go into the RDY pin in great detail in my video series, starting with Episode 5 and Episode 6, where I talk about what it is in the context of the 65C816 bus demultiplexing, and continuing in Episode 18 and Episode 19, where I use it to generate Read and Write pulses. I specifically pay close attention to the timing delays it generates. You might find this interesting!
I have yet to use it for wait states, but at this point, given that my glue logic has been designed with this pin in mind, it will be pretty easy to do. Mostly connect the slow I/O chip select to RDY_IN with a flip flop or a counter. I just have to make sure that the address decoding + the wait state counting don't delay the signal past the rising edge of the clock.
Here is how I handle the RDY pin, turning it from a bi-directional pin to two unidirectional pins, RDY_IN and /WAI. I'm also making the input go through a flip flop, so the logic can stabilize before the CPU reads RDY and stay consistent throughout the cycle.
Here is how I generate my clock phases, taking into account the registered RDY input. I've had to generate a few phases to satisfy the hold times on reads and on writes. All phases are OR-ed with RDY similarly to what you're doing.
And finally, here is how those phases are used for read-write pulses
This is in discrete logic and so relatively slow. I could scale it up to probably 6MHz in spec, more with overclocking. At the moment my address decoding is the limiting factor. But if you're on a CPLD I suspect it can scale quite a bit more.
All in all, I think this is a good base to have in the glue logic to be ready (!) for wait states!

Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 9:29 pm
by Dr Jefyll
I'm also making the input go through a flip flop, so the logic can stabilize before the CPU reads RDY and stay consistent throughout the cycle
I'm not sure why you feel the flip flop is necessary. Its input comes from a memory-address decoder, isn't that right?
My own approach would be for the decoder output to directly (ie, w/o any flipflop) cause the CPU's RDY pin to be pulled low. And if you pull the CPU's RDY pin low then you're assured that the address won't change when the cycle ends, and hence the decoder's output will remain stable.
-- Jeff
Re: Generating Wait-States with Clock Trickery
Posted: Thu May 12, 2022 10:00 pm
by akohlbecker
I'm also making the input go through a flip flop, so the logic can stabilize before the CPU reads RDY and stay consistent throughout the cycle
I'm not sure why you feel the flip flop is necessary. Its input comes from a memory-address decoder, isn't that right?
My own approach would be for the decoder output to directly (ie, w/o any flipflop) cause the CPU's RDY pin to be pulled low. And if you pull the CPU's RDY pin low then you're assured that the address won't change when the cycle ends, and hence the decoder's output will remain stable.
-- Jeff
There is a window around the falling edge of the clock where the CPU is reading RDY. It can happen that RDY goes low during this window and whether the CPU actually registers it is then undefined. However, the rest of the glue logic will still change state. Specifically, with my 65C816, I want the bank address latch to open if the CPU is proceeding with another cycle, but stay closed otherwise. The Latch Enable signal is combinatorial and not synchronized with the CPU's view of RDY. Similarly, the read-write pulses also need to toggle if the CPU is proceeding. So I added a flip-flop in front to ensure either all of them see a low RDY or none of them, in a given cycle.
You're right that wait states are synchronized with the clock and probably don't need this safety, provided your signal propagates outside of the CPU window. However, other use cases can trigger RDY at any moment, like DMA. Ben Eater's VGA card, for example, pauses the CPU while the display is active and it is reading from memory, independently from the main CPU clock. Thinking about RDY and BE from the perspective of using this card and making it safe has been an interesting challenge for me!
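A minimal behavioral sketch of the registered-RDY idea follows. It assumes the flip-flop captures on the rising clock edge (the post does not say which edge is used) and it ignores metastability in the flip-flop itself. The function name and the sample waveforms are hypothetical.

```python
def register_on_rising_edge(clock, async_in, initial=1):
    """Pass `async_in` through a D flip-flop clocked on rising edges of `clock`."""
    q = initial
    prev = clock[0]
    out = []
    for clk, d in zip(clock, async_in):
        if prev == 0 and clk == 1:  # rising edge: capture the input
            q = d
        out.append(q)               # between edges the output holds its value
        prev = clk
    return out


# Two samples per half cycle, so the raw input can change mid-phase.
clock   = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
rdy_raw = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # asynchronous source
rdy_reg = register_on_rising_edge(clock, rdy_raw)
print(rdy_reg)
```

The raw signal changes at arbitrary points, while the registered copy only changes on a clock edge, so the CPU, the bank latch enable and the read-write pulse logic all see the same RDY value for a whole cycle.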
Re: Generating Wait-States with Clock Trickery
Posted: Sat May 14, 2022 1:40 pm
by Dr Jefyll
There is a window around the falling edge of the clock where the CPU is reading RDY. It can happen that RDY goes low during this window and whether the CPU actually registers it is then undefined.
That's true. But (going slightly OT) let me remind everyone that in this case "undefined" doesn't necessarily resolve to a simple, "yes or no" outcome. (I used to believe that, but I've learned there's more to it.)
In cases where RDY goes high or goes low during the danger zone, one might suppose that the CPU will either immediately recognize the new level or instead simply recognize it on the subsequent cycle... IOW, a tidy result in either case. But there's a third possible outcome. When tPCS or tPCH is violated the result can be messy, resulting in a program crash.

There's a very clear (to me) instance of this described in this thread, from which I quote...
I'm wondering if anyone is successfully using Rdy to halt a 65C02?
I tried it some time back on my Ruby boards and had mixed results - in that it worked very well - I was pulling Rdy low, then BE low, then fiddling with the RAM from another processor (an ATmega), then releasing BE then taking Rdy high ... At that point, sometimes the 65C02 would crash - it's as if it was reading a random instruction at that point.
RDY was being pulled low and high by an asynchronously-clocked source (the ATmega), which means tPCS or tPCH would have a certain probability of being violated and also a certain probability of being satisfied (this being the most likely outcome, as the off-limits tPCS to tPCH window is quite brief). And, this matches what Gordon reported -- ie; most of the time it worked, but occasionally it failed, actually disrupting the program.
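The "certain probability" can be estimated with a simple model: if an asynchronous RDY edge is equally likely to land anywhere in the clock period, the chance of hitting the forbidden window is the window width divided by the period. The numbers below are placeholders chosen for illustration, not datasheet values for tPCS or tPCH, and the function names are made up.

```python
def violation_probability(window_ns, clock_period_ns):
    """Chance that one uniformly distributed edge lands in the forbidden window."""
    return window_ns / clock_period_ns


def at_least_one(p, transitions):
    """Chance of one or more violations over a number of independent edges."""
    return 1 - (1 - p) ** transitions


WINDOW_NS = 20.0     # hypothetical tPCS + tPCH; check the real datasheet
PERIOD_NS = 1000.0   # hypothetical 1 MHz CPU clock

p = violation_probability(WINDOW_NS, PERIOD_NS)
print(p)
print(at_least_one(p, 100))
```

With these placeholder figures a single edge is safe 98% of the time, yet after a hundred asynchronous edges at least one violation is more likely than not, which fits the "mostly works, occasionally crashes" pattern.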
Perhaps tPCS/tPCH violations cause metastability issues within the CPU; I don't know. But another (more likely?) explanation is that RDY is a signal that gets internally routed to multiple sections of the CPU. For example, PC needs to be told not to increment, and updates to X, Y and A need to be postponed. It's easy to imagine a jumbled outcome if (for example) PC did get the message in time but some other part of the CPU didn't!
-- Jeff
Re: Generating Wait-States with Clock Trickery
Posted: Sat May 14, 2022 2:30 pm
by gfoot
Perhaps tPCS/tPCH violations cause metastability issues within the CPU; I don't know. But another (more likely?) explanation is that RDY is a signal that gets internally routed to multiple sections of the CPU. For example, PC needs to be told not to increment, and updates to X, Y and A need to be postponed. It's easy to imagine a jumbled outcome if (for example) PC did get the message in time but some other part of the CPU didn't!
I think this is very likely the case, and as I understand it even if the CPU does internally latch RDY to try to ensure everything on the chip has a consistent view of it, metastability in the latch can still cause its output to transition very slowly, which would still cause the issues you describe elsewhere in the chip.
I guess adding yet another latch to help guard against metastability was a straw that would break the camel's back, and it's more cost effective to require external latching for the minority of cases where this matters (given that many 6502 uses never touch RDY).
Re: Generating Wait-States with Clock Trickery
Posted: Sat May 14, 2022 3:16 pm
by akohlbecker
Good points all around!

Re: Generating Wait-States with Clock Trickery
Posted: Sat May 14, 2022 4:17 pm
by Dr Jefyll
it's more cost effective to require external latching for the minority of cases where this matters (given that many 6502 uses never touch RDY).
Yes -- and even the users who do use RDY typically drive it with a signal from a memory-address decoder, and the transitions of such a signal have a fairly predictable relation to the 6502 clock (because it's the 6502 that generates the addresses). In such cases the designer can reliably accommodate tPCS and tPCH without the need for external latching.
-- Jeff
Re: Generating Wait-States with Clock Trickery
Posted: Mon May 16, 2022 12:35 pm
by fredericsegard
I admit I don't fully understand what has been posted since my last post, but I do get the general gist of it.
Understandably, timing is everything. I think I was looking for a simple solution to a rather complex problem. I definitely will need to do more homework on this.
I want to thank all that chimed in on this thread.