
All times are UTC




Posted: Fri Apr 04, 2014 4:51 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
It's fairly common that 6502-family designs have fast CPUs and fast memory, but slow peripherals.

For some purposes, the RDY input is enough to slow down accesses as needed, but the original NMOS 6502 doesn't respond to RDY for write cycles, so that might need special handling. (See http://visual6502.org/JSSim/expert.html ... 12&rdy1=16 )

There are probably many ways to fiddle the clock signal, either syncing to a slower free-running clock or suppressing pulses. Here's Acorn's circuit for the BBC Micro:
Attachment: Clocking 6502 in BBC Micro.png

Unfortunately this method may result in duplicate accesses - see App Note 3 or Figure 28.2 on page 443 of the Advanced User Guide:
Attachment: 1MHz bus decoding in BBC Micro.png


Anyone got a reliable, simple and glitch-free solution to the problem of accessing slow peripherals?


Posted: Fri Apr 04, 2014 5:14 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8485
Location: Midwestern USA
Commodore had addressed (!) this issue in the C-128, in that all I/O block ($D000-$DFFF) accesses were effectively clocked at 1 MHz, regardless of whether the 8502 was running at 1 or 2 MHz. They employed a clock stretching scheme that actually lengthened both the Ø1 and Ø2 clock states to result in an effective one microsecond cycle time. I've never looked at the circuitry, so I don't know precisely what was done to achieve this.

As you note, the NMOS 6502 will not respond to RDY being asserted during a write cycle. However, couldn't the notion of cascading flip-flops, as would otherwise be used to control RDY, be put to work to reduce the effective clock rate during the period of I/O access? As long as the clock phase is not changed when slowed down or restored, I'd have to think that the MPU would be none the wiser and would simply take longer to complete the access.

On another thought, would it be practical to substitute a 65C02 for the NMOS part and thus gain an MPU that does respond to RDY on write cycles?

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Fri Apr 04, 2014 9:10 pm, edited 1 time in total.

Posted: Fri Apr 04, 2014 6:24 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
Fair point, using a CMOS part would surely make life a lot easier. I have a gut feeling that Acorn's approach (which did have to work with the NMOS part) is over-complex, and yet it's quite beyond me to come up with anything better.


Posted: Fri Apr 04, 2014 7:20 pm

Joined: Sun Oct 13, 2013 2:58 pm
Posts: 491
Location: Switzerland
Some time ago I ran into the problem that memory was fast but I/O and ROM were slow. I came up with the following solution, but never implemented it. Excuse the format; this is a copy of my old "notes", made in a plain text editor.

Code:
11.1. New Clock
---------------------------------------------------------------------------

The new clock implements a state machine which makes it possible to
switch dynamically between a full- and a half-frequency PHI2 cycle.


The state machine has 2 bits, PHI2 and PHI3, giving 4 states;
it will use the already existing 74AC74.

The state machine requires an input to select between the
fast and the slow cycle. There are 2 versions of the state
machine, depending on whether the signal is low or high for
fast cycles.

Version A: /SLOW is asserted (LOW) for slow cycles


            74F153         74F74


            +-----+         +-----+
   /SLOW----|I0   |  +------|     |--+--- PHI2
       VCC--|I1  Y|--+      | ST2 |  |
       VCC--|I2   |       +-|>    |  |
       GND--|I3  E|o-+    | |     |  |
            |A   B|  |    | |     |o----- /PHI2
            +-----+  =    | +-----+  |
             |   |        |          |
             +---|-------------------+
             |   |        |
             |   +-------------------+
             |   |        |          |
            +-----+       | +-----+  |
       VCC--|I0   |  +------|     |--+
       VCC--|I1  Y|--+    | | ST3 |
       GND--|I2   |       +-|>    |
       GND--|I3  E|o-+    | |     |
            |     |  |    | |     |o--
            +-----+  =    | +-----+
                          |
          ----------------+



ST2     0        1        0        0        1        1        0
ST3     0        1        0        1        0        1        0
State   0 --->   3 --->   0 --->   2 --->   1 --->   3 --->   0

The change from 0 depends on the input signal


             ST3   ST2
0 --->   3    1     1    Fast
0 --->   2    1     0    Slow

The following state changes are fixed

             ST3   ST2
1 --->   3    1     1
2 --->   1    0     1
3 --->   0    0     0

As we can see, ST3 always toggles, so we can simply use /ST3
as the input to D3. D2 needs more complex logic:

D2   = /ST3   * /ST2   * /SLOW
   + /ST3   *  ST2
   +  ST3   * /ST2
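
The transition table and the D2 equation can be checked with a short
simulation. This is only a sketch in Python and not part of the original
notes: the function names are made up for the example, slow_n stands for
the level on the /SLOW pin (0 = slow cycle), and the state number is taken
as ST3*2 + ST2, as in the tables above.

```python
def next_state(st3, st2, slow_n):
    """One clock edge of the two flip-flops; slow_n is the /SLOW pin level."""
    d3 = 1 - st3  # D3 = /ST3, so ST3 toggles on every edge
    d2 = (((1 - st3) & (1 - st2) & slow_n)  # /ST3 * /ST2 * /SLOW
          | ((1 - st3) & st2)               # /ST3 *  ST2
          | (st3 & (1 - st2)))              #  ST3 * /ST2
    return d3, d2


def run(slow_n_per_edge):
    """Return the visited state numbers (ST3*2 + ST2), starting in state 0."""
    st3 = st2 = 0
    states = [0]
    for slow_n in slow_n_per_edge:
        st3, st2 = next_state(st3, st2, slow_n)
        states.append(st3 * 2 + st2)
    return states


# One fast cycle (2 clock edges) followed by one slow cycle (4 clock edges).
states = run([1, 1, 0, 0, 0, 0])
phi2 = [s & 1 for s in states]  # PHI2 is the ST2 bit
print(states)  # [0, 3, 0, 2, 1, 3, 0]
print(phi2)    # [0, 1, 0, 0, 1, 1, 0]
```

In this model /SLOW only matters in state 0; PHI2 spends one clock low and
one high in a fast cycle, and two low and two high in a slow cycle.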


Discussion of the delay times

There is a critical path regarding the decision of fast and slow
cycles. The following times are relevant

Name   Time   Description
tas   30ns   Address setup time of the CPU (W65C816)
tpd    7ns   Propagation Delay of Bank address latch (74F573)
tpd    7ns   Propagation Delay of the decoding of /SLOW or /FAST
tpd   10ns   Propagation Delay of MUX (74F157)
tsu    3ns   Setup Time of State Flip-Flops (74F74)
   ----
Total   57ns

Thus this logic limits the fast clock to 8.77MHz, because the 57ns path
has to fit into half a clock period: 1 / (2 x 57ns) = 8.77MHz. The easiest
way to speed up the logic is to put the decoding, state selection
and the flip-flops into a fast CPLD.

Name   Time   Description
tas   30ns   Address setup time of the CPU (W65C816)
tpd    7ns   Propagation Delay of Bank address latch (74F573)
tsu    7ns   Setup Time of CPLD
   ----
Total   44ns

this limits the Fast Clock to 11.36MHz.

The CPLD has the advantage of reducing the number of chips required, and it
can also create other selection signals.

Alternatively, use a dedicated CPLD and integrate the flip-flops:

Name   Time   Description
tas   30ns   Address setup time of the CPU (W65C816)
tsu    5ns   Setup Time of CPLD
   ----
Total   35ns

which would support the full speed 14MHz Clock of the W65C816.
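
The three clock limits can be reproduced with a few lines of Python. This is a sketch rather than part of the original notes: it assumes, as the quoted figures imply, that the summed delay path has to fit into half a clock period, and the function name and dictionary labels are invented for the example.

```python
def max_fast_clock_mhz(delays_ns):
    """Highest clock (MHz) for which the summed path fits in half a period."""
    total_ns = sum(delays_ns)
    return 1000.0 / (2 * total_ns)


# Delay budgets in ns, copied from the three tables in the notes.
budgets = {
    "discrete logic": [30, 7, 7, 10, 3],
    "CPLD": [30, 7, 7],
    "dedicated CPLD": [30, 5],
}

for name, delays in budgets.items():
    print(f"{name}: {sum(delays)} ns -> {max_fast_clock_mhz(delays):.2f} MHz")
```

The printed limits agree with the 8.77MHz and 11.36MHz figures in the notes, and the last budget comes out just above 14MHz.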


Posted: Fri Apr 04, 2014 9:14 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8485
Location: Midwestern USA
BigEd wrote:
Fair point, using a CMOS part would surely make life a lot easier. I have a gut feeling that Acorn's approach (which did have to work with the NMOS part) is over-complex, and yet it's quite beyond me to come up with anything better.

The problem for Acorn, I think, was two-fold: they had to effectively change the Ø2 frequency without inadvertent phase shift or level glitches, and they had to do it with the available logic of the day. Had they had, say, GALs available at the time it would have resulted in a lot less circuitry.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Fri Apr 04, 2014 9:26 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
Well, there are already 2 pretty complex ULAs on the board - not sure why but there are still quite a number of 74 series parts too. (See http://www.8bs.com/inbbc.htm)


Posted: Fri Apr 04, 2014 9:35 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8540
Location: Southern California
We've had a few discussions about this over the years but I'm not finding it easy to round them up. I posted a circuit at viewtopic.php?f=4&t=1370&start=262 (in one of ElEctric_EyE's long topics) that uses a speed input bit to selectively skip every other low pulse, without glitches at the switching. Don't use the 4000-series logic I show there-- I used that only because I wanted slow-enough logic that if there were any glitches or runt pulses, my slow oscilloscope could catch them. The circuit uses a flip-flop and a quad NOR, which you would want to select from a faster family.
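
For readers who can't follow the link, the effect of skipping every other low pulse can be shown with a purely behavioural sketch in Python. It is not a gate-level model of the flip-flop and NOR circuit: each list element stands for one half-period of the source clock, and the function name and the choice of which low pulse is skipped first are assumptions made for the example.

```python
def skip_alternate_lows(clock, slow):
    """Force every other low half-period of `clock` high while `slow` is set.

    `clock` and `slow` are equal-length lists with one entry per half-period.
    """
    out = []
    skip_next_low = True  # assumption: the first low pulse is the one skipped
    for level, s in zip(clock, slow):
        if level == 0 and s:
            out.append(1 if skip_next_low else 0)
            skip_next_low = not skip_next_low
        else:
            out.append(level)
            if level == 0:
                skip_next_low = True  # full speed: re-arm for the next slow access
    return out


source = [1, 0] * 4
print(skip_alternate_lows(source, [1] * 8))  # [1, 1, 1, 0, 1, 1, 1, 0]
print(skip_alternate_lows(source, [0] * 8))  # [1, 0, 1, 0, 1, 0, 1, 0]
```

In the slowed output the high time grows to three half-periods while the low time stays at one, so the period doubles but only one phase is stretched.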

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Fri Apr 04, 2014 9:40 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
That's good - please could you attach it here? I did a little searching before posting, but didn't find that one.

Edit: Hmm, it only slows one phase, so it might not work with peripherals that use both edges of the clock...


Posted: Fri Apr 04, 2014 9:54 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8485
Location: Midwestern USA
BigEd wrote:
Well, there are already 2 pretty complex ULAs on the board - not sure why but there are still quite a number of 74 series parts too. (See http://www.8bs.com/inbbc.htm)

From today's point of view, it's amazing how much hardware was required back then to create a reasonable computer. You could probably fit six of my POC units in the same area taken up by the Acorn's motherboard. Of course, the Acorn had color video and sound...but even so, it took a lot of discrete silicon to achieve it.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Fri Apr 04, 2014 10:04 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8540
Location: Southern California
BigEd wrote:
That's good - please could you attach it here? I did a little searching before posting, but didn't find that one.

Image

Quote:
Edit: Hmm, it only slows one phase, so it might not work with peripherals that use both edges of the clock...

You might be able to get the timing accurate enough by adding a non-inverting gate of some kind that gets its input at the same place the top-right NOR does (which is only being used as an inverter anyway), or by using an OR gate getting its inputs at the same place the second-to-last NOR does, namely the clock source and the flipflop's Q output. The extra gate could be of a different 74 family if necessary to try to match timings.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Sat Apr 05, 2014 1:59 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Check out the Maxim IC DS1085L. Was for 3.3V operation and SPI programming, but frequency changing was glitch free. Also I think they had different operating voltage options.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Sat Apr 05, 2014 10:01 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
One thing to note about Acorn's solution: they run the VIAs with a free-running 1MHz clock, so the timer driven interrupts have a solid timebase. This presents an extra problem, because this free-running clock can be in one of two states at the point that a slow cycle is needed.
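
To illustrate the two cases, here is a simplified timing model in Python. It is not Acorn's circuit, just a sketch under stated assumptions: time is counted in 2 MHz cycle units, the free-running 1 MHz clock is low on even units and high on odd units, the stretched CPU clock stays low for at least one unit, and its high phase has to coincide with a complete high phase of the 1 MHz clock. The function names are made up for the example.

```python
def one_mhz_level(unit):
    """Assumed free-running 1 MHz clock: low on even units, high on odd units."""
    return unit % 2


def stretched_cycle_units(start):
    """Length of a slow CPU cycle that begins at 2 MHz unit `start`."""
    t = start + 1                  # PHI2 stays low for at least one unit
    while one_mhz_level(t) == 0:   # wait for the 1 MHz clock to go high
        t += 1
    return (t + 1) - start         # PHI2 is high for that one unit


print(stretched_cycle_units(0))  # 2: the 1 MHz clock had just gone low
print(stretched_cycle_units(1))  # 3: the 1 MHz clock had just gone high
```

In this model a slow access costs either two or three fast cycles, depending on which state the 1 MHz clock is in when the access begins.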


Posted: Sat Apr 05, 2014 11:58 am

Joined: Sun Dec 29, 2002 8:56 pm
Posts: 460
Location: Canada
A little OT but one could build a bus bridge to interface to the slower peripherals rather than clock stretching. It seems to me the way to do the interfacing is by controlling the RDY signal to the processor.

For write cycles the data to the peripheral would need to be latched, and a small state-machine used to generate a write strobe to the peripheral. A write FIFO could also be used. The processor would write to the FIFO which would then trigger a write to the actual peripheral. One would have to be careful with the software not to perform consecutive writes too quickly to the same peripheral. OR the state-machine could generate an appropriate RDY signal.

For read cycles a read data latch could be used along with a state-machine used to drive the RDY line to the processor. OR two read cycles could be used. The first cycle triggers a read of the peripheral into latches. Then the second read cycle reads the latches at full speed. But this solution is visible to the software.

There are bi-directional bus latches ('646's?) that would help.
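
As a rough illustration of the posted-write idea, here is a cycle-counting sketch in Python. It is an assumption-laden model, not a design: the function name, the `slow_len` parameter (fast cycles the peripheral needs per access) and the single-entry write latch are all made up for the example, and RDY is assumed to stall reads only, as on the NMOS part.

```python
def run_bridge(accesses, slow_len=4):
    """Count fast-clock cycles for a list of (kind, is_slow) bus accesses.

    kind is "R" or "W". Returns (total_cycles, overrun_writes), where an
    overrun is a write that arrives while the latch is still busy.
    """
    cycles = 0
    busy_until = 0   # cycle at which the posted write has reached the peripheral
    overruns = 0
    for kind, is_slow in accesses:
        if is_slow and kind == "R":
            cycles = max(cycles, busy_until)  # RDY low until the latch drains
            cycles += slow_len                # RDY low while the peripheral is read
        elif is_slow and kind == "W":
            if cycles < busy_until:
                overruns += 1                 # RDY cannot hold off this write
            else:
                busy_until = cycles + 1 + slow_len
            cycles += 1                       # the CPU sees a full-speed write
        else:
            cycles += 1                       # ordinary full-speed access
    return cycles, overruns


# A store to the peripheral followed by two full-speed fetches.
print(run_bridge([("W", True), ("R", False), ("R", False)]))  # (3, 0)
# A store immediately followed by a read of the peripheral.
print(run_bridge([("W", True), ("R", True)]))                 # (9, 0)
# Three back-to-back writes: the second and third find the latch busy.
print(run_bridge([("W", True)] * 3))                          # (3, 2)
```

The last case is where a deeper FIFO, or software that spaces out its writes, would be needed.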

_________________
http://www.finitron.ca


Posted: Sat Apr 05, 2014 12:19 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10977
Location: England
Yes - when the bus takes over the job of seeing that the writes reach their destination, the fact that RDY doesn't stall writes (in the NMOS part) becomes a non-issue. Unfortunately, you can see as many as three back to back writes, if you look at all cases. But for normal cases of access to peripherals, you'll only see a single write and you won't be able to read back from that address until after at least one read.

