Massively parallel 6502 systems

BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Massively parallel 6502 systems

Post by BigEd »

Nearby, in the newbies forum, an off-topic excursion concerning using multiple 6502s to implement a complex graphics subsystem:
railsrust wrote:
So I got this email from "my little helper" yesterday:
Quote:
...what would you think of 4 or more 6502s, closely coupled but yet able to execute independently? If we think of the theme of throwing processors at the problem ...
...
Heh, and a bargraph led display that acts like a speedometer the more 6502s you use.

I know people have done multiprocessor '02s before. I wonder if there is anything in the open domain, namely the task dispatcher and manager. They have also done a bunch of 8051s, which are more self-contained.

I gotta believe one FPGA can manage a crapload of ‘02s.
Anyone know of a way to manage multiple 6502s like this?
GARTHWILSON wrote:
railsrust wrote:
Anyone know of a way to manage multiple 6502s like this?
It's getting off-topic, but that's ok. It's your own topic. :)

Here are some earlier topics that are very relevant, and that kc5tja made valuable contributions on:
WDC's W65C02S adds some more signals at the pins, namely ML\ (memory lock not) output (pin 5 on a DIP), BE (bus enable) input (pin 36 on a DIP), and VP\ (vector pull not) output (pin 1 on a DIP).
I have a couple more links to add to Garth's list of previous topics: but I'll very much second the idea noted in the initial quote: it's the software, the management of tasks and of data transfer, which is the major challenge here.
whartung
Posts: 1004
Joined: 13 Dec 2003

Re: Massively parallel 6502 systems

Post by whartung »

BigEd wrote:
but I'll very much second the idea noted in the initial quote: it's the software, the management of tasks and of data transfer, which is the major challenge here.
Depends on how tightly the CPUs are integrated. If they're essentially standalone machines (with independent CPU/ROM/RAM) connected through some networking, then, yeah, it's mostly a software issue.

But if they're sharing RAM and/or other devices, where handshaking is done at a hardware level, then it's a different animal.

Folks are struggling to get the 65816 address decoded properly.

Just having several CPUs fighting for a common bus can be an issue.
Druzyek
Posts: 367
Joined: 12 May 2014

Re: Massively parallel 6502 systems

Post by Druzyek »

I have thought about trying to run two CPUs through one CPLD. When the clock goes low and you are waiting 30ns for the processor's address lines to settle, you could have the CPLD switch to the address lines of a second processor that has already settled and let that access memory while you wait. I don't think you could run them both at full speed but you would at least be doing something useful during that 30ns.
drogon
Posts: 1671
Joined: 14 Feb 2018
Location: Scotland

Re: Massively parallel 6502 systems

Post by drogon »

Druzyek wrote:
I have thought about trying to run two CPUs through one CPLD. When the clock goes low and you are waiting 30ns for the processor's address lines to settle, you could have the CPLD switch to the address lines of a second processor that has already settled and let that access memory while you wait. I don't think you could run them both at full speed but you would at least be doing something useful during that 30ns.
Not sure why you even need a CPLD for just 2 x 6502's into a common memory system - after all, this is how video is done on some of the older systems - Apple II, etc. 6502 accesses RAM on one half cycle, video on the other - one reason the clock crystals seemed a bit weird then. (Exact multiples of NTSC or PAL scan frequency)

If you ran both 6502's off the same Ph2 clock, but one inverted, then we know that the 6502 only uses (less than) half a cycle to access RAM/ROM, so that leaves the other half cycle for the other processor.
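As a rough illustration of the half-cycle interleave described above, here is a minimal Python sketch. It assumes CPU A is clocked by Ph2 and CPU B by the inverted clock, and that each CPU needs the memory only during its own Ph2-high half. The names `bus_owner` and `interleave` are made up for this sketch and model no real hardware delays.

```python
def bus_owner(phi2: int) -> str:
    """Return which CPU owns the shared memory for a given Ph2 level.

    Assumption: CPU A runs from Ph2, CPU B from inverted Ph2, so each
    CPU's access half cycle falls in the other CPU's idle half.
    """
    return "A" if phi2 == 1 else "B"


def interleave(cycles: int) -> list[tuple[int, str]]:
    """List (half_cycle_index, owner) pairs for a number of full clock cycles."""
    schedule = []
    for half in range(2 * cycles):
        phi2 = half % 2  # 0 = Ph2 low, 1 = Ph2 high
        schedule.append((half, bus_owner(phi2)))
    return schedule


if __name__ == "__main__":
    for half, owner in interleave(3):
        print(f"half-cycle {half}: CPU {owner} owns the memory")
```

The owners simply alternate, so each CPU sees the memory once per full cycle and never in the same half cycle as the other.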

Some glue might be needed to toggle the BE pins (65C02) appropriately, and deal with R/W which I don't think is tri-stated with BE.

The '816 has the added complication of the upper 8 bits of address being latched on the "dead" half cycle, so you might need a separate tri-state buffer on the output of that latch.

Other than that... am I being too naive?

-Gordon
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California

Re: Massively parallel 6502 systems

Post by GARTHWILSON »

whartung wrote:
BigEd wrote:
but I'll very much second the idea noted in the initial quote: it's the software, the management of tasks and of data transfer, which is the major challenge here.
Depends on how tightly the CPUs are integrated. If they're essentially standalone machines (with independent CPU/ROM/RAM) connected through some networking, then, yeah, it's mostly a software issue.

But if they're sharing RAM and/or other devices, where handshaking is done at a hardware level, then it's a different animal.
I'm not concerned with the handshaking so much as the administration job of deciding how to distribute the work load to keep the various processors busy in a way that's productive overall.

Thanks for the additional links, Ed.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: Massively parallel 6502 systems

Post by GaBuZoMeu »

whartung wrote:
Just having several CPUs fighting for a common bus can be an issue.
Depends on how you attempt that and what you are willing to pay :)

Assume you really wish to use one (extraordinarily fast) memory to serve as a common RAM for n CPUs. Assume further you arrange a clock generator with n outputs, each output delayed by t ns, where t is the cycle time of the common RAM. You then have to mux each CPU's bus to that RAM, put/fetch_and_latch a byte, and then serve the next CPU. Really challenging, I think, and even with, say, 7 ns RAM and virtually no delay from the muxes, only 10 CPUs (yielding a total cycle time of 70 ns = 14 MHz) could interact this way.
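The arithmetic above can be checked with a small Python sketch. It assumes an idealized round robin in which each CPU gets exactly one RAM cycle per round and the muxes add no delay; the function name `shared_ram_budget` is invented for this example.

```python
def shared_ram_budget(ram_cycle_ns: float, n_cpus: int) -> tuple[float, float]:
    """Return (cpu_cycle_ns, cpu_clock_mhz) when n_cpus take turns on one RAM.

    Assumption: zero mux delay, one RAM cycle per CPU per round.
    """
    cpu_cycle_ns = ram_cycle_ns * n_cpus
    cpu_clock_mhz = 1000.0 / cpu_cycle_ns
    return cpu_cycle_ns, cpu_clock_mhz


if __name__ == "__main__":
    cycle_ns, clock_mhz = shared_ram_budget(ram_cycle_ns=7, n_cpus=10)
    print(f"CPU cycle: {cycle_ns} ns, CPU clock: {clock_mhz:.1f} MHz")
```

With 7 ns RAM and 10 CPUs this gives a 70 ns CPU cycle, or a little over 14 MHz per CPU. Any real mux or latch delay adds to the RAM cycle and lowers the result.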

But most likely only a fraction of the RAM needs to be "common", e.g. one KB or two. You could then use dual port RAM and use the "other" side to synchronize all DPRAMs. A 6502 can only write every fourth cycle (every third if zero page, but that would make less sense), so even at a 14 MHz clock one CPU could issue a byte only every 4 x 70 ns = 280 ns. The "other" side of the DPRAM could be operated by some logic at full speed, e.g. 14 ns. This logic could select one DPRAM to deliver its contents while all other DPRAMs are simultaneously written with that data. With no further delays, 280/14 = 20 quasi-simultaneous writes could be served. Here the problem is to fetch all CPU side accesses and queue them up for processing on the "other" side. :shock: Again challenging, I assume :)
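A quick Python sketch of that budget, under the same idealized assumptions as the post (no extra delays, one write per four CPU cycles). The name `broadcast_slots` is hypothetical and only used here.

```python
def broadcast_slots(cpu_cycle_ns: int, cycles_per_write: int, sync_cycle_ns: int) -> int:
    """Count how many sync-side transfers fit between two writes of one CPU.

    Assumption: a CPU can issue at most one write every cycles_per_write
    cycles, and the sync logic needs sync_cycle_ns per broadcast transfer.
    """
    write_interval_ns = cpu_cycle_ns * cycles_per_write
    return write_interval_ns // sync_cycle_ns


if __name__ == "__main__":
    slots = broadcast_slots(cpu_cycle_ns=70, cycles_per_write=4, sync_cycle_ns=14)
    print(f"broadcast slots per write interval: {slots}")
```

With 70 ns CPU cycles, absolute-addressed stores (4 cycles) and a 14 ns sync side, 20 transfers fit in each 280 ns write interval. Zero-page stores (3 cycles) would shrink the interval to 210 ns and the slot count to 15.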

Using an FPGA with its block RAMs inside might be easier. Inside the FPGA you can operate even faster. On the other hand: for each 6502 there are 20 pins (10 AB, 8 DB, PHI2, RWB) required....

Methinks a couple of loosely coupled autonomous computers exchanging highly condensed information at a low rate are much, much easier to manage. A W65C265S with its four serial ports could act like a poor man's Transputer.

my 2 cents :)
Dr Jefyll
Posts: 3526
Joined: 11 Dec 2009
Location: Ontario, Canada

Re: Massively parallel 6502 systems

Post by Dr Jefyll »

drogon wrote:
Not sure why you even need a CPLD for just 2 x 6502's into a common memory system - after all, this is how video is done on some of the older systems - Apple II, etc. 6502 accesses RAM on one half cycle, video on the other [...] Some glue might be needed to toggle the BE pins
Slightly OT, since 2 processors doesn't qualify as massively parallel, but since the subject was mentioned here is a vintage homebrew using two 6809's in that fashion. The two CPU's take turns accessing a shared RAM. And one of the CPU's is also the video system. This project predates CPLD's (although I did exploit programmable logic in the form of 32 x 8 TTL PROM's).

Because DRAM requires multiplexers anyway, this design doesn't do the trick of tying the two CPU address buses together and toggling the BE pins.

Using static RAM you *could* tie the buses together that way (ie, omit the multiplexer). Having each CPU tristated for the first half of every cycle won't affect the RAM because the RAM is fast enough to do the entire access in the remaining half of the cycle. But tristating for the first half of every cycle means extra delay before the address decoder for memory-mapped I/O can begin doing its job, and that might force a reduction in clock speed (as Druzyek mentioned).

-- Jeff
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
Martin A
Posts: 197
Joined: 02 Jan 2016

Re: Massively parallel 6502 systems

Post by Martin A »

One possible way to synchronise multiple CPUs is to alter the clock ratio.

I've done a quick and dirty test on a breadboard. It's got a 25.175 MHz can oscillator (as that's what I had!) driving a 74HC163 4-bit synchronous counter.

The B, C and D outputs from the counter feed the A, B and C inputs of a permanently enabled 74HC138 3-to-8 decoder to produce 8 non-overlapping clocks with a 7:1 high-to-low ratio. Sending the Y0 to Y7 outputs from the decoder through a 74HC240 inverts the signals to produce 8 "CPU clocks".

If the same Y0-Y7 outputs controlled the enable pins for a set of 74HC244 buffers for each CPU, then each CPU would only be connected to the target memory for 1/8 of the time but would appear to have full access.

The test board produced a high time of just under 80 ns, which is comfortably more than the access time of modern SRAM. A 32 MHz master clock would have produced 2 MHz CPU clocks and access periods of just over 60 ns.
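The counter-and-decoder scheme above can be modelled in a few lines of Python. This is only a logic-level sketch that assumes ideal parts with no propagation delay: the 4-bit counter wraps every 16 master ticks, its B, C and D outputs select one of 8 decoder outputs, and the inverter makes that output the active-high CPU clock. The function names are invented for this sketch.

```python
def cpu_clock_levels(count: int) -> list[int]:
    """Levels of the 8 inverted decoder outputs for a 4-bit counter value."""
    select = (count >> 1) & 0b111  # counter outputs B, C, D
    return [1 if y == select else 0 for y in range(8)]


def timing(master_mhz: float) -> tuple[float, float]:
    """Return (cpu_clock_mhz, high_time_ns) for a given master clock."""
    cpu_clock_mhz = master_mhz / 16          # counter wraps every 16 ticks
    high_time_ns = 2 * 1000.0 / master_mhz   # each select value lasts 2 ticks
    return cpu_clock_mhz, high_time_ns


if __name__ == "__main__":
    for count in range(16):
        print(count, cpu_clock_levels(count))
    for master in (25.175, 32.0):
        mhz, high_ns = timing(master)
        print(f"{master} MHz master: {mhz:.2f} MHz CPU clock, {high_ns:.1f} ns high")
```

Exactly one of the eight clocks is high on any master tick, and each is high for 2 of every 16 ticks. A 25.175 MHz master gives roughly 1.57 MHz per CPU with a high time a little under 80 ns, which agrees with the breadboard figures.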

The question is then whether 8 CPUs in the 1-2 MHz range is a worthwhile goal.
Attachments
Two CPU clocks
25.175 MHz master clock, 7:1 ratio CPU clock at 1.57 MHz
Can oscillator, 74HC163, 74HC138, 74HC240