Parallel Processing with 6502s
Is it possible to do parallel processing with 6502s?
- GARTHWILSON
- Forum Moderator
- Posts: 8775
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Alright, since nobody else has jumped in--
Sure. It seems like we've talked about that on the forum, but I can't find it now.
One way to do it would be to use dual-port memory for two processors. I've never heard of memory with more than two ports, though, if you wanted three or more processors sharing it.
There has been some talk of having two processors running on opposite phases of the system clock so they can each access things on the bus in turn, similar to how the Apple II did the video. (The processor accessed memory during the usual phase-2-high time and the video could access the same memory during the phase-2-low time, so both could access the same memory seemingly at the same time without slowing down and without using dual-port memory.) Actually you could have multiple non-overlapping clocks, each with a duty cycle a little less than the inverse of the number of processors you wanted to have sharing the same memory. Obviously there's a speed penalty though, since the memory has to respond within each processor's shorter slice of the cycle.
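To put rough numbers on the non-overlapping-clock idea, here is a minimal Python sketch of the arithmetic. The function names, the guard time between slots, and the example figures are all assumptions for illustration, not measurements from any real board.

```python
def access_window_ns(cycle_ns: float, n_cpus: int, guard_ns: float = 0.0) -> float:
    """Length of each processor's bus window when n_cpus share one memory.

    cycle_ns : period of the shared clock cycle (hypothetical figure)
    guard_ns : assumed dead time per slot so the clock phases never overlap
    """
    if n_cpus < 1:
        raise ValueError("need at least one processor")
    window = cycle_ns / n_cpus - guard_ns
    if window <= 0:
        raise ValueError("guard time leaves no usable window")
    return window


def duty_cycle(cycle_ns: float, n_cpus: int, guard_ns: float = 0.0) -> float:
    """Fraction of the cycle each processor's clock is high (a bit under 1/n_cpus)."""
    return access_window_ns(cycle_ns, n_cpus, guard_ns) / cycle_ns
```

With an assumed 1000 ns cycle and 10 ns of guard time, four processors would each get a 240 ns window, so the memory and decoding logic would have to be fast enough for that, or the cycle would have to be stretched.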
Of course one thing you can do is to offload some I/O jobs from the main processor onto microcontrollers that are fine for tiny jobs but can't handle the overall job. For example, if you had to do a ton of bit-banging to keep an SMBus serial interface busy with a 6502, it might be better to just interface to a PIC microcontroller and let the PIC babysit the SMBus interface while the 6502 does something else productive.
But if you really want several 6502's working in parallel, I think you'd be better off making separate computers, even if they're on the same board, and making some kind of miniature network to tie them together.
The one thing I don't know how to deal with is how to split up jobs between the various processors such that the "administrative" overhead and all the communicating between computers doesn't take up so much of the time that you lose the performance advantage you had hoped to gain in having parallel processors. I suppose it depends on what kind of work you want to do with the system. Maybe the administrative overhead would eat up a pretty small percentage of the gains if you want to do something like Fourier transforms on large arrays. Then each separate computer could take its job and go off in its corner and not bother the master computer for a long time. This is one subject I'd like to hear more of from someone experienced with multiprocessor programming. (Edit: Andre Fachat answered this in post #17 of this topic.)
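One way to get a feel for the overhead question is a toy model. The sketch below is only that: it assumes the work divides evenly and that the master spends a fixed, serialized cost per worker handing out and collecting each job. Both assumptions, and the name `speedup`, are mine rather than anything established in this thread.

```python
def speedup(work_s: float, n_workers: int, overhead_per_worker_s: float) -> float:
    """Estimated speedup over a single processor under the toy model.

    work_s                : time one processor would need for the whole job
    n_workers             : number of processors sharing the job
    overhead_per_worker_s : assumed master-side cost per worker (not overlapped)
    """
    parallel_time = work_s / n_workers + overhead_per_worker_s * n_workers
    return work_s / parallel_time
```

Under this model, a long job with small overhead (say 100 s of work, 4 workers, 1 s of overhead each) comes out around 3.4 times faster, while a short job (1 s of work with the same overhead) ends up slower than doing it on one processor. That matches the intuition that big, self-contained jobs are where parallel 6502s would pay off.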
Last edited by GARTHWILSON on Tue Nov 02, 2010 4:36 am, edited 1 time in total.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
I wrote some multi-processor code.. sort-of.
On an SNES program, I had 2 CPUs doing distinct parts of the same task. In this case, it was necessary. The main CPU (a 65816) was running music playing code that results in about 16 bytes of data (control data for a synth).
Then it would sync with the co-CPU that controls the sound in that system (it's very similar to a 6502, but with MOV instead of LDx), transfer those bytes, and I had the co-CPU decode that data and translate it to the right format for the soundchip.
I guess this way of doing it would be good if you're doing something that can be broken cleanly into distinct tasks like that. It's more like writing 2 separate programs.. maybe not what you have in mind.
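The sync-then-transfer step can be sketched as a simple flag handshake. This is a simulation in Python, not the actual SNES port protocol: the `Port` class, the single "full" flag, and the generator-based interleaving are all assumptions made for illustration.

```python
class Port:
    """One shared byte plus a 'full' flag, standing in for a hardware mailbox."""
    def __init__(self):
        self.data = 0
        self.full = False


def sender(port, payload):
    for byte in payload:
        while port.full:        # wait until the receiver has taken the last byte
            yield
        port.data = byte
        port.full = True
        yield


def receiver(port, count, out):
    while len(out) < count:
        while not port.full:    # wait for the sender to post a byte
            yield
        out.append(port.data)
        port.full = False
        yield


def run(payload):
    """Interleave the two 'CPUs' one step at a time until both finish."""
    port, out = Port(), []
    tasks = [sender(port, payload), receiver(port, len(payload), out)]
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)
    return out
```

Calling `run(list(range(16)))` moves a 16-byte block across, one byte per handshake, and returns the bytes in the order they were sent. On real hardware each side would be polling a port register instead of a Python attribute.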
BTW, about simulating dual-port RAM, there've been some interesting posts about that on the nesdev forum:
http://nesdev.parodius.com/cgi-bin/wwwt ... b=#Post802
-
Nightmaretony
- In Memoriam
- Posts: 618
- Joined: 27 Jun 2003
- Location: Meadowbrook
- Contact:
Common memory shared with multiple processors sort of urinates me off, as I have to deal with this arcade game called Galaga which uses 3 separate Z80s tied into a single 2K RAM that is shared by all of them.
The system is amazingly touchy and likes to use capacitor delays for critical RAM timing. I designed an adapter to help out with the one issue, but overall, I think of it as a fairly lousy design.
If one were to do multiple CPUs, I WOULD like to see each CPU have its own RAM and ROM system for its tasks, and the common RAM be used for commands and data rather than critical zero page and internal variables for the CPUs. This way also, if any of the processors drop out, it doesn't drive the entire system crazy. You can put in code to handle such events.
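As one possible shape for the "code to handle such events", here is a small Python sketch of a heartbeat check. It assumes a hypothetical layout in which each CPU increments its own counter in the common RAM and some supervisor compares the counters against the values it saw last time; none of these names come from a real design.

```python
class SharedRam:
    """Stand-in for the common RAM: one heartbeat counter per CPU (assumed layout)."""
    def __init__(self, n_cpus):
        self.heartbeat = [0] * n_cpus   # each CPU increments only its own slot


def find_stalled(ram, last_seen):
    """Return the indices of CPUs whose heartbeat has not moved since the last check.

    last_seen is updated in place so the next call compares against fresh values.
    """
    stalled = [i for i, (now, before) in enumerate(zip(ram.heartbeat, last_seen))
               if now == before]
    last_seen[:] = ram.heartbeat
    return stalled
```

Because each CPU keeps its working variables in private RAM, a stalled CPU only shows up as a frozen counter, and the others can carry on or restart it. On 8-bit hardware the counter would wrap at 256, which is harmless as long as the check runs more often than a full wrap.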
"My biggest dream in life? Building black plywood Habitrails"
-
Lyos Gemini Norezel
- Posts: 54
- Joined: 28 Dec 2003
Hey y'all... I had a thought on this... could the 6522s be used as the interface for multiple CPUs? Like just jerryrigging a connection between two or more 6522s, and it would almost be like connecting two computers via a serial port or the like, right?
Lyos Gemini Norezel
Mundus Vult Decipi et Decipiatur
- GARTHWILSON
- Forum Moderator
- Posts: 8775
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Right, but it's not necessarily jerryrigging at all.
-
Lyos Gemini Norezel
- Posts: 54
- Joined: 28 Dec 2003
Lyos Gemini Norezel wrote:
Hey y'all... I had a thought on this... could the 6522s be used as the interface for multiple CPUs? Like just jerryrigging a connection between two or more 6522s, and it would almost be like connecting two computers via a serial port or the like, right?
Lyos Gemini Norezel
--
Samuel A. Falvo II
Hallo Dave,
> Anybody know if these saw the light of day?
The old IEEE Commodore drives used either one 6502 and one 6504 or two 6502's, running in the way you described.
___
/ __|__
/ / |_/ Groetjes, Ruud
\ \__|_\
\___| URL: www.baltissen.org
-
smilingphoenix
- Posts: 43
- Joined: 20 May 2006
- Location: Brighton, England
Many years back, I added a second processor to an Acorn Atom, running on the opposite phase of the clock. It can work, but the timing becomes much more critical, even at 2 MHz! The two processors shared the ROMs and an 8k block of RAM used for inter-processor communications, while each had its own 32k of RAM for running programs. I managed to get both processors running separate BASIC programs using the same BASIC ROM.
I never really got into splitting a single task between the two processors, though. I mainly used the second CPU for calculating Mandelbrot images in the background while I used the computer for something else (mainly Space Invaders).
I wouldn't like to add more processors in this way, though. I suspect it would be better to add some sort of network if you want to use multiple processors.
Yeah, Commodore's PET-series disk drive units relied on twin processors. I'm not sure if the Commodore VIC-series (e.g., the 1540 through 1581) did or not though.
That being said, "SMP" (since that's basically what this is) is actually the simplest possible method of multiprocessing, since you don't need to invent a network system either. Realistically speaking, 4 CPUs is about as far as you can push a typical bus (at least with Intel and PowerPC processors); beyond that, performance starts to dwindle fast. I suspect 2 CPUs is the limit for the 6502/65816 because of its lack of cache memory and very high bus bandwidth requirements.
One can also combine the benefits of SMP and networked designs by building networked clusters of twin-processor nodes, if required.
-
Nightmaretony
- In Memoriam
- Posts: 618
- Joined: 27 Jun 2003
- Location: Meadowbrook
- Contact:
For me, if I were to go with multiple CPUs, I would go with common RAM: each CPU would have its own ROM/RAM and would access the common RAM in round-robin fashion. Data or programs can simply be passed back and forth with a token ID for the memory, or the RAM can be segmented for each CPU.
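The round-robin access could be sketched like this in Python. It is only a model of the arbitration rule, with a made-up class name; in hardware this would be a small counter plus some gating logic rather than software.

```python
class RoundRobinArbiter:
    """Grant the common RAM to one requesting CPU per slot, in rotating order."""
    def __init__(self, n_cpus):
        self.n = n_cpus
        self.last = n_cpus - 1      # so CPU 0 gets first priority

    def grant(self, requesting):
        """Return the next CPU after the last grant that wants the RAM, or None."""
        for offset in range(1, self.n + 1):
            cpu = (self.last + offset) % self.n
            if cpu in requesting:
                self.last = cpu
                return cpu
        return None
```

If all three CPUs of a three-CPU system keep requesting, the grants go 0, 1, 2, 0 and so on, so no CPU can starve the others. A CPU that is not requesting is simply skipped for that slot.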