
Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 4:51 pm
by BigEd
It's fairly common that 6502-family designs have fast CPUs and fast memory, but slow peripherals.

For some purposes, the RDY input is enough to slow down accesses as needed, but the original NMOS 6502 doesn't respond to RDY for write cycles, so that might need special handling. (See http://visual6502.org/JSSim/expert.html ... 12&rdy1=16 )
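As a rough illustration of the RDY behaviour described above, here is a minimal Python sketch. The function name and the CPU labels `"nmos6502"` and `"65c02"` are made up for this example; it only models the single rule under discussion (the NMOS part ignores RDY on write cycles, the CMOS part does not) and nothing else about either chip.

```python
def rdy_stalls_cycle(cpu, is_write, rdy_low):
    """Return True if pulling RDY low actually holds the CPU on this cycle.

    cpu      -- "nmos6502" or "65c02" (hypothetical labels for this sketch)
    is_write -- True if the current bus cycle is a write
    rdy_low  -- True if external logic is pulling RDY low
    """
    if not rdy_low:
        return False
    if cpu == "nmos6502":
        # The NMOS part only honours RDY on read cycles; a write runs to completion.
        return not is_write
    if cpu == "65c02":
        # The CMOS part can be held on read and write cycles alike.
        return True
    raise ValueError("unknown cpu label: %r" % (cpu,))


if __name__ == "__main__":
    for cpu in ("nmos6502", "65c02"):
        for is_write in (False, True):
            kind = "write" if is_write else "read"
            print(cpu, kind, "stalled:", rdy_stalls_cycle(cpu, is_write, True))
```

The one case that needs special handling is the NMOS write: the wait-state logic asks for a stall, but the CPU carries on regardless.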

There are probably many ways to fiddle the clock signal, either syncing to a slower free-running clock or suppressing pulses. Here's Acorn's circuit for the BBC Micro:
[Attachment: Clocking 6502 in BBC Micro.png]
Unfortunately this method may result in duplicate accesses - see App Note 3 or Figure 28.2 on page 443 of the Advanced User Guide:
[Attachment: 1MHz bus decoding in BBC Micro.png]
Anyone got a reliable, simple and glitch-free solution to the problem of accessing slow peripherals?

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 5:14 pm
by BigDumbDinosaur
Commodore had addressed (!) this issue in the C-128, in that all I/O block ($D000-$DFFF) accesses were effectively clocked at 1 MHz, regardless of whether the 8502 was running at 1 or 2 MHz. They employed a clock stretching scheme that actually lengthened both the Ø1 and Ø2 clock states to result in an effective one microsecond cycle time. I've never looked at the circuitry, so I don't know precisely what was done to achieve this.

As you note, the NMOS 6502 will not respond to RDY being asserted during a write cycle. However, couldn't the notion of cascading flip-flops, as would otherwise be used to control RDY, be put to work to reduce the effective clock rate during the period of I/O access? As long as the clock phase is not changed when the clock is slowed down or restored, I'd have to think that the MPU would be none the wiser and would simply take longer to complete the access.

On another thought, would it be practical to substitute a 65C02 for the NMOS part and thus gain an MPU that does respond to RDY on write cycles?

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 6:24 pm
by BigEd
Fair point, using a CMOS part would surely make life a lot easier. I have a gut feeling that Acorn's approach (which did have to work with the NMOS part) is over-complex, and yet it's quite beyond me to come up with anything better.

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 7:20 pm
by cbscpe
Some time ago, when I ran into the problem that memory was fast but I/O and ROM were slow, I came up with the following solution, though I never implemented it. Excuse the format; this is a copy of my old "notes" made in a plain text editor.

Code:

11.1. New Clock 
---------------------------------------------------------------------------

The new clock implements a state machine that can switch
dynamically between a full- and a half-frequency PHI2 cycle.


The state machine has 2 bits, PHI2 and PHI3, giving 4 states.
It uses the already existing 74AC74.

The state machine requires an input to select between the
fast and the slow cycle. There are 2 versions of the state
machine, depending on whether the signal is low or high for
fast cycles.

Version A: /SLOW is asserted (LOW) for slow cycles


            74F153         74F74


            +-----+         +-----+
   /SLOW----|I0   |  +------|     |--+--- PHI2
       VCC--|I1  Y|--+      | ST2 |  |
       VCC--|I2   |       +-|>    |  |
       GND--|I3  E|o-+    | |     |  |
            |A   B|  |    | |     |o----- /PHI2
            +-----+  =    | +-----+  |
             |   |        |          |
             +---|-------------------+
             |   |        |
             |   +-------------------+
             |   |        |          |
            +-----+       | +-----+  |
       VCC--|I0   |  +------|     |--+
       VCC--|I1  Y|--+    | | ST3 |
       GND--|I2   |       +-|>    |
       GND--|I3  E|o-+    | |     |
            |     |  |    | |     |o--
            +-----+  =    | +-----+
                          |
          ----------------+



ST2	0	1	0	0	1	1	0
ST3	0	1	0	1	0	1	0
State	0 --->	3 --->	0 --->	2 --->	1 --->	3 --->	0

The change from 0 depends on the input signal


		ST3	ST2
0 --->	3	1	1	Fast
0 --->	2	1	0	Slow

The following state changes are fixed

		ST3	ST2
1 --->	3	1	1
2 --->	1	0	1
3 --->	0	0	0

As we can see, ST3 always toggles, so we can simply use /ST3
as the input to D3. D2 needs more complex logic:

D2	= /ST3	* /ST2	* /SLOW
	+ /ST3	*  ST2
	+  ST3	* /ST2


Discussion of the delay times

There is a critical path regarding the decision of fast and slow
cycles. The following times are relevant

Name	Time	Description
tas	30ns	Address setup time of the CPU (W65C816)
tpd	 7ns	Propagation Delay of Bank address latch (74F573)
tpd	 7ns	Propagation Delay of the decoding of /SLOW respectively /FAST
tpd	10ns	Propagation Delay of MUX (74F157)
tsu	 3ns	Setup Time of State Flip-Flops (74F74)
	----
Total	57ns

This logic therefore limits the Fast Clock to 8.77MHz, i.e.
1 / (2 x 57ns): the path has to fit into half a clock period.
The easiest way to speed up the logic is to put the decoding,
state selection and the flip-flops into a fast CPLD

Name	Time	Description
tas	30ns	Address setup time of the CPU (W65C816)
tpd	 7ns	Propagation Delay of Bank address latch (74F573)
tsu	 7ns	Setup Time of CPLD
	----
Total	44ns

this limits the Fast Clock to 11.36MHz.

The CPLD has the advantage of reducing the number of required chips and
can also generate other selection signals.

Or use a dedicated CPLD and integrate Flipflops

Name	Time	Description
tas	30ns	Address setup time of the CPU (W65C816)
tsu	 5ns	Setup Time of CPLD
	----
Total	35ns

which would support the full speed 14MHz Clock of the W65C816.
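
The notes above can be checked with a short simulation. This is only a sketch in Python (not part of the original notes); the function names are made up for the example. It assumes the state number is 2*ST3 + ST2, that `slow_n` is the level on the /SLOW pin (0 = slow cycle requested), and that each timing budget has to fit into half a fast clock period, which is what the quoted MHz figures imply.

```python
def next_state(st3, st2, slow_n):
    """Next (ST3, ST2) from the equations in the notes.

    slow_n is the /SLOW pin level: 1 = fast cycle, 0 = slow cycle.
    """
    d3 = 1 - st3                                  # D3 = /ST3 (ST3 always toggles)
    d2 = (((1 - st3) & (1 - st2) & slow_n)        # /ST3 * /ST2 * /SLOW
          | ((1 - st3) & st2)                     # /ST3 *  ST2
          | (st3 & (1 - st2)))                    #  ST3 * /ST2
    return d3, d2


def run(slow_n_inputs):
    """Clock the state machine once per input; return the visited state numbers."""
    st3 = st2 = 0
    trace = [0]
    for slow_n in slow_n_inputs:
        st3, st2 = next_state(st3, st2, slow_n)
        trace.append(2 * st3 + st2)
    return trace


def max_fast_clock_mhz(delays_ns):
    """Highest fast clock if the summed delays must fit in half a period."""
    return 1000.0 / (2 * sum(delays_ns))


if __name__ == "__main__":
    # One fast cycle followed by one slow cycle; /SLOW only matters in state 0.
    print(run([1, 1, 0, 1, 1, 1]))
    print(round(max_fast_clock_mhz([30, 7, 7, 10, 3]), 2))  # discrete logic
    print(round(max_fast_clock_mhz([30, 7, 7]), 2))         # latch + CPLD
    print(round(max_fast_clock_mhz([30, 5]), 2))            # dedicated CPLD
```

The trace reproduces the 0, 3, 0, 2, 1, 3, 0 sequence from the table, and since ST2 drives PHI2, the slow cycle holds PHI2 low for two ticks and high for two ticks instead of one each.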

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 9:14 pm
by BigDumbDinosaur
BigEd wrote:
Fair point, using a CMOS part would surely make life a lot easier. I have a gut feeling that Acorn's approach (which did have to work with the NMOS part) is over-complex, and yet it's quite beyond me to come up with anything better.
The problem for Acorn, I think, was two-fold: they had to effectively change the Ø2 frequency without inadvertent phase shift or level glitches, and they had to do it with the available logic of the day. Had they had, say, GALs available at the time it would have resulted in a lot less circuitry.

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 9:26 pm
by BigEd
Well, there are already 2 pretty complex ULAs on the board - not sure why but there are still quite a number of 74 series parts too. (See http://www.8bs.com/inbbc.htm)

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 9:35 pm
by GARTHWILSON
We've had a few discussions about this over the years but I'm not finding it easy to round them up.  I posted a circuit at viewtopic.php?f=4&t=1370&start=262 (in one of ElEctric_EyE's long topics) that uses a speed input bit to selectively skip every other low pulse, without glitches at the switching.  Don't use the 4000-series logic I show there—I used that only because I wanted slow-enough logic that if there were any glitches or runt pulses, my slow oscilloscope could catch them.  The circuit uses a flip-flop and a quad NOR, which you would want to select from a faster family.
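
For readers without access to the linked schematic, here is a behavioural sketch in Python of the general idea of skipping every other low pulse. It is an assumption-laden model, not a transcription of the posted circuit: it assumes one flip-flop clocked on the rising edge of the source clock, whose Q output is ORed with the source clock, so that Q only ever changes while the output is already high. The function name `stretch` and the sample-per-half-period representation are made up for the example.

```python
def stretch(clk_samples, slow_samples):
    """Return the output clock, one sample per half period of the source clock.

    clk_samples  -- source clock levels (0 or 1), alternating
    slow_samples -- speed input bit at the same instants (1 = slow)
    """
    q = 0          # flip-flop state: 1 means "mask the next low phase"
    prev = 0       # previous source clock level, for edge detection
    out = []
    for clk, slow in zip(clk_samples, slow_samples):
        if clk == 1 and prev == 0:
            # Rising edge: toggle the mask while slow is requested,
            # otherwise clear it. The output is high here either way.
            q = 1 if (slow and not q) else 0
        out.append(clk | q)
        prev = clk
    return out


if __name__ == "__main__":
    clk = [0, 1] * 4
    print(stretch(clk, [0] * 8))   # fast: output follows the source clock
    print(stretch(clk, [1] * 8))   # slow: every other low phase is masked
```

In this model only the high phase gets longer, while the remaining low phases keep their original width.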

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 9:40 pm
by BigEd
That's good - please could you attach it here? I did a little searching before posting, but didn't find that one.

Edit: Hmm, it only slows one phase, so it might not work with peripherals that use both edges of the clock...

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 9:54 pm
by BigDumbDinosaur
BigEd wrote:
Well, there are already 2 pretty complex ULAs on the board - not sure why but there are still quite a number of 74 series parts too. (See http://www.8bs.com/inbbc.htm)
From today's point of view, it's amazing how much hardware was required back then to create a reasonable computer. You could probably fit six of my POC units in the same area taken up by the Acorn's motherboard. Of course, the Acorn had color video and sound...but even so, it took a lot of discrete silicon to achieve it.

Re: Clock-stretching circuits for slow peripherals

Posted: Fri Apr 04, 2014 10:04 pm
by GARTHWILSON
BigEd wrote:
That's good - please could you attach it here? I did a little searching before posting, but didn't find that one.
[Image: clock-switching circuit]
Quote:
Edit: Hmm, it only slows one phase, so it might not work with peripherals that use both edges of the clock...
You might be able to get the timing accurate enough by adding a non-inverting gate of some kind that gets its input at the same place the top-right NOR does (which is only being used as an inverter anyway), or by using an OR gate getting its inputs at the same place the second-to-last NOR does, namely the clock source and the flipflop's Q output. The extra gate could be of a different 74 family if necessary to try to match timings.

Re: Clock-stretching circuits for slow peripherals

Posted: Sat Apr 05, 2014 1:59 am
by ElEctric_EyE
Check out the Maxim IC DS1085L. It was for 3.3V operation and SPI programming, but frequency changing was glitch-free. I also think they had different operating voltage options.

Re: Clock-stretching circuits for slow peripherals

Posted: Sat Apr 05, 2014 10:01 am
by BigEd
One thing to note about Acorn's solution: they run the VIAs with a free-running 1MHz clock, so the timer driven interrupts have a solid timebase. This presents an extra problem, because this free-running clock can be in one of two states at the point that a slow cycle is needed.
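
To make the "two states" problem concrete, here is a small Python sketch. It is a simplified model and not Acorn's actual circuit: it assumes a 2MHz CPU cycle of 500ns, a free-running 1MHz clock with cycle boundaries at multiples of 1000ns, and that a slow access must wait for the next 1MHz boundary and then last one full 1MHz cycle. The constant and function names are made up for the example.

```python
FAST_NS = 500    # assumed 2MHz CPU cycle time
SLOW_NS = 1000   # assumed free-running 1MHz cycle time


def stretched_access_ns(start_ns):
    """Duration of a slow access that the CPU starts at time start_ns.

    start_ns is assumed to be a multiple of FAST_NS. The access waits for the
    next 1MHz cycle boundary (if not already on one) and then takes one full
    1MHz cycle.
    """
    wait = (-start_ns) % SLOW_NS
    return wait + SLOW_NS


if __name__ == "__main__":
    for start in (0, 500, 1000, 1500):
        ns = stretched_access_ns(start)
        print("start", start, "ns ->", ns, "ns =", ns // FAST_NS, "fast cycles")
```

Under these assumptions a slow access costs either two or three fast cycles, depending on which half of the 1MHz cycle the CPU happens to be in when it starts.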

Re: Clock-stretching circuits for slow peripherals

Posted: Sat Apr 05, 2014 11:58 am
by Rob Finch
A little OT but one could build a bus bridge to interface to the slower peripherals rather than clock stretching. It seems to me the way to do the interfacing is by controlling the RDY signal to the processor.

For write cycles the data to the peripheral would need to be latched, and a small state machine used to generate a write strobe to the peripheral. A write FIFO could also be used. The processor would write to the FIFO, which would then trigger a write to the actual peripheral. One would have to be careful with the software not to perform consecutive writes too quickly to the same peripheral. Or the state machine could generate an appropriate RDY signal.

For read cycles a read data latch could be used along with a state machine used to drive the RDY line to the processor. Or two read cycles could be used. The first cycle triggers a read of the peripheral into latches. Then the second read cycle reads the latches at full speed. But this solution is visible to the software.

There are bi-directional bus latches ('646's?) that would help.
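
A posted-write bridge of the kind described above could be modelled roughly like this in Python. This is a sketch under assumptions, not a worked design: the class name `WriteBridge`, its one-entry latch, and the `peripheral_cycles` parameter are all invented for the example, and it assumes a CPU that honours RDY on write cycles (a 65C02, say), so that a second write arriving too early can be held off.

```python
class WriteBridge:
    """One-entry write latch between a fast CPU and a slow peripheral."""

    def __init__(self, peripheral_cycles):
        self.peripheral_cycles = peripheral_cycles  # CPU cycles per slow write
        self.busy = 0          # CPU cycles left on the write in progress
        self.latched = None    # (addr, data) currently being written, if any
        self.completed = []    # writes that have reached the peripheral

    def tick(self, write=None):
        """Advance one CPU cycle; write is (addr, data) or None.

        Returns the RDY level: True if the cycle was accepted,
        False if the CPU must repeat it.
        """
        if self.busy:
            self.busy -= 1
            if self.busy == 0:
                # The slow write strobe has finished; free the latch.
                self.completed.append(self.latched)
                self.latched = None
        if write is None:
            return True
        if self.latched is not None:
            return False       # latch still occupied: pull RDY low
        self.latched = write
        self.busy = self.peripheral_cycles
        return True


if __name__ == "__main__":
    bridge = WriteBridge(peripheral_cycles=3)
    print(bridge.tick((0xD000, 0x01)))   # accepted at full speed
    print(bridge.tick((0xD000, 0x02)))   # too soon, CPU is held
```

An isolated write completes at full CPU speed; only a second write that arrives before the first has drained gets held.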

Re: Clock-stretching circuits for slow peripherals

Posted: Sat Apr 05, 2014 12:19 pm
by BigEd
Yes - when the bus takes over the job of seeing that the writes reach their destination, the fact that RDY doesn't stall writes (in the NMOS part) becomes a non-issue. Unfortunately, you can see as many as three back-to-back writes, if you look at all cases. But for normal cases of access to peripherals, you'll only see a single write, and you won't be able to read back from that address until after at least one read.