
All times are UTC




 Post subject: Backplane Design...
PostPosted: Thu Jul 14, 2005 3:20 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Well, now that the Kestrel 1 appears to be a "done deal," I'm looking forward to the future with the Kestrel 2-series. I would like to do the Kestrel 2 "as right as possible," and that means I need some semblance of a reasonable standard expansion backplane for it.

I've been looking at the VME bus connectors, and I have found some that were in the $6 range at DigiKey once -- which actually isn't a bad price considering they have 96 pins on them! (DigiKey part A1262-ND costs $4.52 and part 478-1950-ND costs $3.28 each). Seems like the prices actually dropped since I remembered looking for them last. These prices seem quite reasonable to me either way.

So, now that I seem to have the problem of backplane *connector* figured out, the next problem is the bus cycle. As you're probably aware, I have that little circuit that I arranged that turned the 65816's bus into a 68000-like asynchronous bus. Well, it has a critical flaw -- what happens if the bus master makes two or more back-to-back references to the same chip/circuit? (For example, when the 65816 makes a single 16-bit write to a device.) The asynchronous bus interface cannot handle that, and therefore, will generate wait-states only on the first access; subsequent accesses will produce zero wait states.
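To make the failure mode concrete, here is a minimal sketch. It assumes (this is not stated in the post) that the wait-state generator is armed only when the device select goes from inactive to active, so a select that stays active across back-to-back cycles never re-arms it. The function names are hypothetical.

```python
def wait_states_edge_triggered(selects, waits_needed):
    """Wait states per bus cycle when the generator is armed only by an
    inactive-to-active transition of the device select."""
    result = []
    previous = False  # select state during the previous bus cycle
    for selected in selects:
        new_edge = selected and not previous
        result.append(waits_needed if new_edge else 0)
        previous = selected
    return result


def wait_states_per_cycle(selects, waits_needed):
    """Wait states per bus cycle when the generator is re-armed on every
    cycle in which the device is selected."""
    return [waits_needed if selected else 0 for selected in selects]


# Idle, then one 16-bit write (two back-to-back accesses), then idle.
cycles = [False, True, True, False]
print(wait_states_edge_triggered(cycles, 2))  # [0, 2, 0, 0]
print(wait_states_per_cycle(cycles, 2))       # [0, 2, 2, 0]
```

The second access gets no wait states in the edge-triggered model, which matches the symptom described above.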

Therefore, I'm doing some research on alternative, speed-insensitive methods of properly generating the RDY signal for the CPU. I know that this rarely comes up when designing circuits around the 6502/65816 because most people drive them in the single-digit MHz range, but maybe a few of you here have had experience with alternative circuit designs where this was an issue. What were your experiences?

The reason I'm asking is that I'm thinking of making one row of contacts a large subset of the same contacts used on the VIA chips (thus making it absolutely trivial to add VIAs to the circuit with the smallest possible board size). The VIA itself has a synchronous bus interface that, ideally, requires a CPU clock.

But, let's *pretend* that we have a CPU running at 6MHz, but we need a VIA that is running at 1MHz rate, so that we can achieve some precise, desired amount of timing. What approaches would you folks take to achieve this goal?

Thanks.


 Post subject:
PostPosted: Thu Jul 14, 2005 11:38 pm 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8541
Location: Southern California
If you're talking about the processor actually pausing to give more time for slow parts on buffered buses, how about doing that in the clock signal feeding the whole system including the processor? IOW, when the conditions require longer phase-2-high or -low times, you could accomplish that in the clock logic, and the processor doesn't even have to know that the clock speed isn't a consistent XX-MHz square wave. I've thought about doing something like that, and then for timing like for a software RTC, use a VIA counter clocked off a PB6 input instead of the phase 2 which would no longer be a consistent speed. PB6 would be fed by a separate signal source that does not depend on the system clock speed. Alternately, an already-divided-down separate clock could be fed to a CA or CB pin. That also allows increasing the clock speed later without having to change certain things in software. (Most early microcomputer designs were short-sighted in this respect.) There are other challenges it does not address however. Some 65xx parts, for example, need a continuing phase-2 signal that is within their operating frequency range in order to do certain things internally between bus reads and writes.
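As a rough illustration of why an external reference helps, the sketch below works out how many reference pulses make up one RTC tick. The result depends only on the reference frequency, not on phase 2. The 32.768 kHz reference and the function name are assumptions for the example, and the exact value written to the timer depends on the part's counting convention, so check the datasheet.

```python
def pulses_per_tick(reference_hz, tick_hz):
    """Number of reference-clock pulses that make up one RTC tick."""
    if reference_hz % tick_hz != 0:
        raise ValueError("reference is not an integer multiple of the tick rate")
    pulses = reference_hz // tick_hz
    if pulses > 0xFFFF:
        # A 16-bit counter cannot hold a count this large.
        raise ValueError("count does not fit in a 16-bit counter")
    return pulses


# Assumed example: a 32.768 kHz reference and a one-second tick.
print(pulses_per_tick(32_768, 1))  # 32768
```

Changing the CPU clock later leaves this number untouched, which is the point made above about not having to change software.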

This all brings me back to some of the reasons why, for my next computer (which has been on the drawing board for years now, evolving slowly, but not becoming reality so far), I decided not to bring any bus signals-- even buffered-- out to a backplane. Instead, the backplane signals would mostly be from VIA I/O pins on the CPU board. From there, one could connect data converters, latches, intelligent LCD modules, all kinds of synchronous-serial-bus parts (which are a whole lot easier to wire up), etc. etc.-- almost anything you could want, except that you can't make fast memory transfers for something like streaming video. You could still construct a whole card cage of equipment for running a factory if you wanted to. Cards that need to do something fast would probably have their own intelligence onboard, and not need high-speed babysitting by the processor.


 Post subject:
PostPosted: Fri Jul 15, 2005 4:35 am 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
GARTHWILSON wrote:
If you're talking about the processor actually pausing to give more time for slow parts on buffered buses, how about doing that in the clock signal feeding the whole system including the processor?


I've thought of this solution, but I'm not too happy with it, as it's not guaranteed to work across all microprocessors. E.g., if I also want to slap in a 68000 or even a homebrew processor integrated into an FPGA, I can't guarantee success with this.

Quote:
This all brings me back to some of the reasons why, for my next computer (which has been on the drawing board for years now, evolving slowly, but not becoming reality so far), I decided not to bring any bus signals-- even buffered-- out to a backplane.


The reason I'm not particularly happy with this solution is that it doesn't give me the ultimate freedom to make any kind of card I want. For example, I could make an IEEE-488 bus using DB25 connectors for cards to plug into. IEEE-488 allows for pretty zippy data transfers in a pure hardware-oriented interface. But, there are several problems with it:

* EVERY plug-in card requires local intelligence of some kind. Not every expansion device requires local intelligence. For example, it's highly unlikely that a programmer for a PIC or AVR microcontroller, or even an EEPROM, requires local intelligence.

* Bit-banged I/O is slow, plain and simple. This throws away any hope at all of me making a video card for the computer. Yeah, I suppose I could just build a video card with local intelligence, and have it pass graphics via the GPIB bus, BUT, doing so is grievously slow. You can kiss any hope of full 60fps animation out the window with that.

* Bit-banged I/O requires the aid of the computer's CPU. Unless I use a dedicated microcontroller just for the expansion backplane, there is no hope of me getting any kind of reasonable computation speed if it's constantly babysitting the bus. For example, let's pretend that we can drive our GPIB at at least 12.6MBps throughput somehow. That means, for a 320x480 display, I'd be using a solid 75% of the CPU's resources just to manually blast data to the screen.

* If I do use a dedicated microcontroller to control access over the I/O bus, I require some means of interfacing the host processor to it. This will be like using the USB host controllers, where you never can address a device by address and just twiddle some bits. No, you have to send a *packet* to the device asking it for some status, make your adjustment, then send a packet back to the device with the updated value. That's a lot of I/O overhead for something a 6502 can do in a single RMW cycle.
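The 75% figure in the list above can be sanity-checked with some quick arithmetic. The sketch below assumes one byte per pixel and a full redraw at 60 frames per second; the post gives only the resolution, the 60fps goal, and the 12.6MBps bus rate, so the pixel depth is a guess that happens to land close to the quoted number.

```python
def bus_share(width, height, bytes_per_pixel, frames_per_second, bus_bytes_per_second):
    """Fraction of the bus throughput used to redraw the whole screen every frame."""
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * frames_per_second / bus_bytes_per_second


# Assumed: 8 bits per pixel, 60 fps, and 12.6 MBps taken as 12.6 million bytes/s.
share = bus_share(320, 480, 1, 60, 12_600_000)
print(f"{share:.0%}")  # 73%
```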

Quote:
make fast memory transfers for something like streaming video. You could still construct a whole card cage of equipment for running a factory


My goal isn't to run a factory. My goal is to make a usable desktop computer.

Quote:
own intelligence onboard, and not need high-speed babysitting by the processor.


They still would need fast interconnects to the processor. Nothing is more aggravating to a user than an 8MHz microprocessor driving the video display at only four frames per second. Even a Macintosh Classic could do a whole lot better than that.


 Post subject:
PostPosted: Fri Jul 15, 2005 6:54 am 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8541
Location: Southern California
The 96-pin DINs (three rows of 32, on .100" centers) certainly have enough pins to implement several different interface types on the same backplane. If video is the only thing that needs to be fast, maybe that could go on the CPU board if your boards are at least 100x160mm eurocards like 3U VME. 220mm is a less-standard 3U length. 6U VME boards are usually 233x160mm with two 96-pin connectors. STD boards are 4.5 by about 6" not including the board-edge connector. STD 80 gives 56 connections, while STD32 gives 130 IIRC. I mention these bus standards because you can get breadboards and card cages already made for them. New card cages are always expensive, but you can find suitable things at the electronics swap meets if you don't want to start from scratch and make your own. I got a nice, half-rack-width 3U VME cage for $10 at an electronics swap meet.

A couple of other ways to handle the video is a mezzanine board on the CPU board, or maybe you could have the CPU board and video boards be separate but use one of the high-speed serializer/de-serializer ICs and just put a short coax or cat-5 cable between them instead of using the slower backplane. If the two boards can always be in adjacent slots, you could even just use a parallel interface and a 2" ribbon cable between them.

What you're saying about speeds however is the same thing I had to consider-- whether to slow the processor down for other parts, or use slowish interfaces so the processor can run full speed without waiting for anything, and use simpler hardware. I came to the conclusion, at least for my type of work (although video is a non-priority for me) that overall performance would be better with the latter-- having the processor run at full speed all the time.


 Post subject:
PostPosted: Fri Jul 15, 2005 11:41 am 
Offline

Joined: Tue Mar 09, 2004 3:43 pm
Posts: 44
Location: Bristol, UK
For a really simple bus, how about the S50 bus that was used in many SWTPC 6800 and 6809 computers in the early 1980s? Just wish I could find a reference to it on the web! There's an SWTPC web site here:

http://www.swtpc.com/mholley/index.html

Ah! got it:

http://retro.co.za/6809/documents/ct-apr81.pdf

Some sites call it the SS50 bus. And the I/O bus is even smaller, S30. That is, only 30 pins. You'd probably want to change all those low-baud-rate clocks, though.


 Post subject:
PostPosted: Fri Jul 15, 2005 2:29 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
coredump wrote:
For a really simple bus, how about the S50 bus that was used in many SWTPC 6800 and 6809 computers in the early 1980s? Just wish I could find a reference to it on the web! There's an SWTPC web site here:

http://www.swtpc.com/mholley/index.html

Ah! got it:

http://retro.co.za/6809/documents/ct-apr81.pdf

Some sites call it the SS50 bus. And the I/O bus is even smaller, S30. That is, only 30 pins. You'd probably want to change all those low-baud-rate clocks, though.


Well, I think that is heading in distinctly the wrong direction from what I'm looking for.

See, here's the deal: if I can find a cheap and simple circuit that accurately generates RDY pulses that are synchronized against ph2, but yet independent of ph2 itself otherwise, then I think I stand a good chance of making the backplane work.

Another approach is to take the Ethernet-like and PCI-like philosophy: have a bus clock that is synchronous. Slow cards get 4MHz clocks. Fast cards get 16MHz. Faster still cards might get something like 33MHz, or 64MHz. A set of select pins on the bus card chooses the bus speed the card is engineered for.

That way, slow devices can generate RDY without having to worry what the current CPU speed is.

Of course, now, the CPU has to be synchronized somehow against the main bus speed. That means, if I have a 6MHz CPU (for example, 1/4 the VGA dotclock is 6.3MHz IIRC), my life will be a living hell.
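A quick way to see the synchronization problem is to check whether the bus clock is a whole multiple of the CPU clock. The sketch below assumes the standard 25.175MHz VGA dot clock (the post only says "6.3MHz IIRC" for the quarter rate); the function name is hypothetical.

```python
from fractions import Fraction


def clock_ratio(bus_hz, cpu_hz):
    """Bus clock cycles per CPU clock cycle, as an exact fraction."""
    return Fraction(bus_hz, cpu_hz)


cpu_hz = 25_175_000 // 4  # assumed VGA dot clock divided by four: 6,293,750 Hz
for bus_hz in (4_000_000, 16_000_000):
    ratio = clock_ratio(bus_hz, cpu_hz)
    # A denominator of 1 would mean the two clocks line up every CPU cycle.
    print(bus_hz, ratio, ratio.denominator == 1)
```

Neither 4MHz nor 16MHz gives a whole-number ratio under these assumptions, so the clock edges drift against each other and every transfer needs a synchronizer.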

Still another approach is to just put an upper limit on the clock speed, and then just have the clock switch to the clock of whatever bus master is present. The generation of RDY then becomes much more like how a 68000-based system would generate DTACK.

But, there again, is this a safe thing to do? Especially in the presence of processors that are potentially dynamic logic?


 Post subject:
PostPosted: Mon Sep 12, 2005 3:26 am 
Offline

Joined: Wed Jul 20, 2005 11:08 pm
Posts: 53
Location: Hawaii
kc5tja wrote:
* Bit-banged I/O requires the aid of the computer's CPU. Unless I use a dedicated microcontroller just for the expansion backplane, there is no hope of me getting any kind of reasonable computation speed if it's constantly babysitting the bus. For example, let's pretend that we can drive our GPIB at at least 12.6MBps throughput somehow. That means, for a 320x480 display, I'd be using a solid 75% of the CPU's resources just to manually blast data to the screen.

Why not have a PIC babysit the expansion bus, hook up a VIA to the PIC, and instead of having the 6502 do the bit-banging, tell the PIC to do the bit-banging, and have the PIC programmed to send the 6502 an interrupt when it is done? Better yet, use the 'Interrupt-Serviced 256-byte Buffer' in the 6502.org software page.

An alternative would be to use a MOS 6502c (4 MHz), but take advantage of MOS's testing process by overclocking it at 8 MHz. With adequate cooling, you might even be able to run it faster than the 816. Then, use a COP instruction to interface with the 6502c. The only thing to watch out for is synchronization - during I/O to the 6502c, you should synchronize the clock speed between the two.

_________________
Sam

---
"OK, let's see, A0 on the 6502 goes to the ROM. Now where was that reset vector?"


 Post subject: Re: Backplane Design...
PostPosted: Mon Sep 12, 2005 10:31 am 
Offline
User avatar

Joined: Fri Dec 12, 2003 7:22 am
Posts: 259
Location: Heerlen, NL
kc5tja wrote:
But, let's *pretend* that we have a CPU running at 6MHz, but we need a VIA that is running at 1MHz rate, so that we can achieve some precise, desired amount of timing. What approaches would you folks take to achieve this goal?


I'm thinking about stretching the clock. The moment a circuit detects that a slow device is addressed, the clock is stretched until a rise and a fall of the clock of the slow device is detected.
Because the 65816 will read the data on the falling edge of its own clock and this can be 'much' later than the one of the slow device, we need to latch the data. IMHO two 74ALS573's (or equivalents) will do. One will buffer the data towards the slow device, the other towards the 65816. The clock of the slow device clocks the data into that last 573.
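A rough timing sketch of the stretching rule, under assumptions not in the post: the slow clock is a free-running square wave with a 50% duty cycle, its rising edges sit at whole multiples of its period, and times are in nanoseconds. The function name is hypothetical.

```python
def stretch_end_ns(access_start_ns, slow_period_ns):
    """Time at which the CPU clock may resume: the falling edge of the slow
    clock that follows the first rising edge at or after access_start_ns."""
    # Ceiling division finds the first rising edge at or after the access.
    periods_elapsed = -(-access_start_ns // slow_period_ns)
    next_rise = periods_elapsed * slow_period_ns
    return next_rise + slow_period_ns // 2  # 50% duty cycle assumed


# A 1MHz slow device has a 1000 ns period.
for start in (0, 250, 999):
    print(start, stretch_end_ns(start, 1000) - start)
```

With these assumptions the stretch lasts between half a slow period (access lands exactly on a rising edge) and just under one and a half slow periods (access lands just after one).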

I hope this idea helps a bit :)

_________________
Code:
    ___
   / __|__
  / /  |_/     Groetjes, Ruud
  \ \__|_\
   \___|       URL: www.baltissen.org



 Post subject:
PostPosted: Tue Sep 13, 2005 6:37 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
asmlang_6 wrote:
Why not have a PIC babysit the expansion bus, hook up a VIA to the PIC, and instead of having the 6502 do the bit-banging, tell the PIC to do the bit-banging, and have the PIC programmed to send the 6502 an interrupt when it is done? Better yet, use the 'Interrupt-Serviced 256-byte Buffer' in the 6502.org software page.


I already explained my rationale above, but over time, I've come to the conclusion that a high-speed network fabric, where data is passed on a backplane at least at 32Mbps (4MHz @ 8 bits/symbol), is the way to go. This keeps all the core logic local to each device, although it still does require each device to have local intelligence.

Quote:
An alternative would be to use a MOS 6502c (4 MHz), but take advantage of MOS's testing process by overclocking it at 8 MHz. With adequate cooling, you might even be able to run it faster than the 816.


They both run at the same speeds. A 14MHz 65816 can, without heatsinking, run up to 20MHz (the SuperCPU does this for the Commodore 64, for example). With heatsinking and/or active cooling, it might be possible to clock it higher still.


 Post subject:
PostPosted: Tue Sep 13, 2005 6:48 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Stretching the clock is something that I've considered, and personally felt "dirty" about. I never liked the idea of tinkering with a clock.

BUT...if one uses the CPU's clock *exclusively* for interacting with peripherals, and not for peripheral timing, then I suppose it can work. As Garth wrote above (which for some reason I didn't read the first time), use an external reference clock for devices. Therefore, the state of the address bus determines the length of the low/high cycle times.

Generalizing this out a bit, the address decoder can produce a few ENABLE outputs as it traditionally does. Then the enabled peripheral will be responsible for producing its own ph2 clock. Therefore, each peripheral card will require its own clock, OR, buffer/manipulate a system-supplied standard clock. For example, if a device like a 5MHz VIA chip were addressed, it could take the "system's 16MHz reference" clock and divide it by four.
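One way to picture this is a lookup from address to the transfer clock the selected device would generate from the shared reference. The address ranges, device names, and divisors below are hypothetical; only the 16MHz reference and the divide-by-four for the VIA come from the paragraph above.

```python
REFERENCE_HZ = 16_000_000  # the "system's 16MHz reference" clock

# Hypothetical address map: (first address, last address, divisor, name).
DEVICES = [
    (0x0000, 0x7FFF, 2, "RAM"),
    (0x8000, 0x800F, 4, "VIA"),
]


def transfer_clock(address):
    """Return the selected device and the clock rate it derives for this access."""
    for first, last, divisor, name in DEVICES:
        if first <= address <= last:
            return name, REFERENCE_HZ // divisor
    raise LookupError(f"no device decodes address {address:#06x}")


print(transfer_clock(0x8004))  # ('VIA', 4000000)
```

In hardware the divisor would live on each peripheral card rather than in a central table; the table here only stands in for the address decoder's ENABLE outputs.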

In fact, it's best to NOT think of ph2 as a clock; instead, it is best to think of it simply as a "Transfer Data Enable" signal. The 6502/65816 just so happens to be "asynchronous"-enough to run exclusively on this signal.

The only problem with doing this, of course, is that you now need to phase-match the generated clock with what the CPU currently expects. This is relatively easy to do, I think, but it's a detail that must not be missed. :)

There is an isomorphism with an asynchronous bus interface -- with an async interface, there are no clocks exchanged between the host and the peripheral anyway, so the peripheral must provide its own clocking.

OK, I'm starting to like this idea.

With respect to other processors, I strongly doubt they'll appreciate having their clocks adjusted on a cycle-by-cycle basis like this. A 68000-based CPU can be self-clocked, and a logic block would provide an interface between VPA/VDA/PHASE and AS/DS*/DTACK. Not sure how a fully synchronous processor like an ARM or embedded PowerPC would fit in though.

hmm....definitely food to think about. Looking at this problem orthogonally definitely gave me a few ideas...

