Multiprocessing on FPGA using dual-port RAM (pipedream)

BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by BigEd »

This was a pipe-dream thought, but perhaps worth sharing: a 6502 core on FPGA is pretty small, so it's surely possible to fit 4 or even maybe 16 of them on a reasonably-priced FPGA. As the RAM blocks (at least on Xilinx FPGAs) are dual-ported, it would be simple enough to hook every RAM up to a pair of 6502s, and if every 6502 was hooked up to 2 or to 4 RAMs, it would be possible to make a pipeline or a mesh (or a torus) of processors. It would be easy to share code too, if that makes sense, but most crucially the processors could communicate by posting data to the shared memory and setting a flag to indicate that it's ready.

With 4 neighbours, each 6502 would see 4 blocks of 2k RAM, all of which would be shared, but each block shared with a different neighbour. By convention part of each RAM would be private to one side or the other, and part would be shared. The address map might look interesting, being a patchwork, and each patch appearing in a different part of the address map on each side of the shared block.


+------------------------------------------+
|                                          |
| +---+                                    |
| |   |                                    |
| | +-+--+ +---+ +----+ +---+ +----+ +---+ |
+---+6502| |RAM| |6502| |RAM| |6502| |RAM+-+
  | +----+ +---+ +----+ +---+ +----+ +---+  
  |                                         
  | +---+        +---+        +---+         
  | |RAM|        |RAM|        |RAM|         
  | +---+        +---+        +---+         
  |                                         
  | +----+ +---+ +----+ +---+ +----+ +---+  
  | |6502| |RAM| |6502| |RAM| |6502| |RAM|  
  | +----+ +---+ +----+ +---+ +----+ +---+  
  |                                         
  | +---+        +---+        +---+         
  | |RAM|        |RAM|        |RAM|         
  | +-+-+        +---+        +---+         
  |   |                                     
  +---+                                     

At the same time, this gives each processor more memory than it would otherwise have, and connects the processors together.
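
As a rough illustration of the patchwork address map, here is a sketch in Python. Everything specific in it is an assumption made up for the example: the 4x4 torus size, the N/E/S/W naming, and the window base addresses. The point it shows is that one link RAM turns up in a different window on each of the two CPUs that share it.

```python
# Hypothetical address map for one node of a 4x4 torus. Each node sees four
# 2 KB link RAMs, one per neighbour. Window bases are invented for illustration.
BLOCK_SIZE = 0x0800  # 2 KB per block RAM
WINDOWS = {"N": 0x2000, "E": 0x2800, "S": 0x3000, "W": 0x3800}
OPPOSITE = {"N": "S", "S": "N", "E": "W", "W": "E"}
STEP = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}
SIZE = 4  # 4x4 torus


def neighbour(node, direction):
    """Return the (row, col) of the neighbour in a direction, with wraparound."""
    row, col = node
    d_row, d_col = STEP[direction]
    return ((row + d_row) % SIZE, (col + d_col) % SIZE)


def link_id(node, direction):
    """Canonical name for the RAM on a link: (owner node, 'S' or 'E')."""
    if direction in ("S", "E"):
        return (node, direction)
    return (neighbour(node, direction), OPPOSITE[direction])


def decode(node, address):
    """Map a CPU address to (link RAM id, offset), or None outside the windows."""
    for direction, base in WINDOWS.items():
        if base <= address < base + BLOCK_SIZE:
            return link_id(node, direction), address - base
    return None
```

With these made-up windows, `decode((0, 0), 0x2805)` and `decode((0, 1), 0x3805)` both resolve to offset 5 of the same RAM: node (0,0) sees it through its east window and node (0,1) through its west window.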

Just maybe, the zero page and stack could be implemented as distributed RAM, and therefore be private. Or maybe there's enough block RAM to have a private block as well as the shared blocks - depends on how big the FPGA is, and how many CPUs to squeeze in. As we know from the Atari 2600 and other efforts, we don't need a full page 0 or page 1 to make a viable machine. Even 64 bytes mapped into both pages can be useful.
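
To make the "64 bytes mapped into both pages" idea concrete, here is a small Python model. It is only a sketch under my own assumptions: the class name is hypothetical, and the decode simply ignores every address bit above the low six, so the same 64 bytes repeat throughout page 0 and page 1.

```python
# Hypothetical decode: 64 bytes of private RAM mirrored across page 0
# ($0000-$00FF) and page 1 ($0100-$01FF), serving as both zero page and stack.
PRIVATE_SIZE = 64


class PrivateRam:
    def __init__(self):
        self.cells = bytearray(PRIVATE_SIZE)

    @staticmethod
    def selected(address):
        """True for any address in page 0 or page 1."""
        return 0x0000 <= address <= 0x01FF

    def read(self, address):
        # Only the low 6 address bits pick a byte; higher bits are ignored.
        return self.cells[address & (PRIVATE_SIZE - 1)]

    def write(self, address, value):
        self.cells[address & (PRIVATE_SIZE - 1)] = value & 0xFF
```

In this model a push to $01FF lands in the same cell as zero-page location $3F, so variables growing up from $00 and a stack growing down from $FF share the 64 bytes and collide once their combined use exceeds that.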

As for programming such a network, well that's a software problem!

(The transputer was all about local memory and synchronous communication with up to 4 neighbours over a byte-wide channel, but we don't have an FPGA model for the transputer. It would of course be possible to design a byte-wide channel as a peripheral, but shared memory comes for free.)
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by BigDumbDinosaur »

BigEd wrote:
This was a pipe-dream thought...a 6502 core on FPGA is pretty small, so it's surely possible to fit 4 or even maybe 16 of them on a reasonably-priced FPGA. As the RAM blocks (at least on Xilinx FPGAs) are dual-ported, it would be simple enough to hook every RAM up to a pair of 6502s, and if every 6502 was hooked up to 2 or to 4 RAMs, it would be possible to make a pipeline or a mesh (or a torus) of processors...
What you have described sounds somewhat like a 6502 analog of AMD's HyperTransport feature, which is used on Opteron motherboards to keep each processor core aware of what the other cores are doing. I suspect development of the VHDL to accomplish this would be a lengthy process over and above development of the 6502 cores themselves. Even more interesting would be the development of an assembler or compiler that could take advantage of multiple cores. With the ability to write threaded code we would have true multiprocessing with the 65xx family.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by BigEd »

Looks like HyperTransport is rather more sophisticated than what I was thinking of. The original transputer link protocol, although electrically serial, was byte-based at the logical level. Processes could exchange multi-byte messages, but it was up to the sender and receiver to agree on the number of bytes in each message, or to implement a variable-length message protocol on top of the byte-sized primitive.

In a shared-memory architecture, I think all that's needed is one or more buffers of agreed size (one byte or more) and for each buffer a control byte to indicate the status of the buffer (empty or full, owned by side A or side B, or whatever.) So, two bytes of memory space for each direction of each link would be minimal, and equivalent to the original transputer link.
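
A minimal sketch of that two-byte mailbox, modelled in Python with a `bytearray` standing in for the shared block RAM. The offsets, flag values and function names are my own placeholders, not anything fixed by the hardware. Each side changes the control byte in one direction only, so no atomic read-modify-write is needed.

```python
# Sketch of a one-byte mailbox in shared RAM: one data byte plus one control
# byte per direction. Offsets and flag values are hypothetical.
DATA, CTRL = 0, 1
EMPTY, FULL = 0x00, 0x01


def try_send(ram, base, value):
    """Sender side: post a byte if the buffer is empty. Returns True on success."""
    if ram[base + CTRL] != EMPTY:
        return False                  # receiver has not taken the last byte yet
    ram[base + DATA] = value & 0xFF   # write the data first...
    ram[base + CTRL] = FULL           # ...then set the flag
    return True


def try_receive(ram, base):
    """Receiver side: take a byte if one is waiting. Returns the byte or None."""
    if ram[base + CTRL] != FULL:
        return None
    value = ram[base + DATA]          # read the data first...
    ram[base + CTRL] = EMPTY          # ...then hand the buffer back
    return value
```

On a real 6502 each side would spin on the control byte. A link in the opposite direction would use a second pair of bytes at a different base.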

(There was a later transputer link protocol, called DSLink, which was packet based and implemented multiple virtual channels over each physical link.)

Cheers
Ed
nyef
Posts: 235
Joined: 28 Jul 2013

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by nyef »

You might not even need dual-port RAM for this. I could see doing this without the FPGA, with a '245 on the data bus between each CPU and RAM pair, something similar for the address lines, and adjacent CPUs operating on opposite sides of the phi2 cycle.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by ElEctric_EyE »

Fascinating idea BigEd!

It wouldn't be that hard to write the Verilog for the block RAMs.
In fact I would like to implement your idea in my project after I add in the HIDs (i.e. mouse, keyboard, touchscreen etc.). As you say, the software for both machines would be taking care of everything.

I picture a common address block as part of both CPU1's and CPU2's zero page. The remaining zero page and the stack would then be unique to each of CPU1 and CPU2.
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by BigEd »

It's a fair point, nyef, that you only need a shared byte (or two) to arrange for a synchronous communication channel. So you could apply the same principle in the FPGA, and make all the memory private.

The LX9 has 32 block RAMs of 2kbyte each, so for a 16-core array, that would be 4 kbyte per node private, or 16 kbyte if it's shared four ways. (In many cases, all the cores would be running the same code, so sharing memory for that purpose is a win.)
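
The arithmetic behind those figures, as a small Python sketch. The function name is hypothetical, and the reading of "shared four ways" as four cores pooling their 4 kbyte shares is an assumption. The last two entries are a separate check for the 4-neighbour torus from the first post, where each link gets one block RAM.

```python
# Back-of-envelope memory budget. Defaults are the figures quoted above;
# the "pooled four ways" interpretation is an assumption.
def memory_budget(block_rams=32, block_kbytes=2, cores=16, neighbours=4):
    total = block_rams * block_kbytes          # all block RAM on the chip
    private = total // cores                   # even split, nothing shared
    pooled = private * 4                       # four cores pooling their shares
    links = cores * neighbours // 2            # each link joins two cores
    visible_on_torus = neighbours * block_kbytes  # one RAM per link
    return {
        "total_kbytes": total,
        "private_per_core": private,
        "pooled_four_ways": pooled,
        "torus_links": links,
        "visible_per_core_on_torus": visible_on_torus,
    }
```

With the defaults, the torus needs 32 links, which happens to match the 32 block RAMs, and each core then sees 8 kbyte through its four windows.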

Indeed, EEye, if you make a small window of page 0 shared, that will be fast and natural. But the memory map might get messy in the everything-shared case.

I think one could get quite creative in how to map the rams into the memory spaces, and how to control write access for maximum safety. Which is to say, I think there's a wealth of different possible ways of doing it.
nyef
Posts: 235
Joined: 28 Jul 2013

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by nyef »

BigEd wrote:
It's a fair point, nyef, that you only need a shared byte (or two) to arrange for a synchronous communication channel. So you could apply the same principle in the FPGA, and make all the memory private.
Which wasn't where I was going with that. You can share a 32k RAM (or more!) between two 6502 CPUs if you run one of them on phi2 and the other on negated phi2, and isolate the RAM from whichever CPU is on the "down" part of its cycle so that the CPU on the "up" part of its cycle can do its thing. The Apple ][ uses basically this trick for video refresh, although there the "other CPU" is basically a state machine built out of discrete components.
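
A toy Python model of that arrangement, just to show the access ordering. The access-list format and function name are invented for the sketch, and it models nothing about timing: CPU A gets the RAM during one half of each cycle and CPU B during the other, so they never touch it at the same moment.

```python
# Toy model of two CPUs sharing one single-port RAM on opposite clock phases.
# The access lists stand in for real bus activity; names are hypothetical.
def run_interleaved(ram, accesses_a, accesses_b):
    """Each access is ('r', addr) or ('w', addr, value).

    CPU A owns the RAM in the first half of each cycle, CPU B in the second.
    Returns the values each CPU read, in order.
    """
    reads = {"A": [], "B": []}
    cycles = max(len(accesses_a), len(accesses_b))
    for cycle in range(cycles):
        for cpu, accesses in (("A", accesses_a), ("B", accesses_b)):
            if cycle >= len(accesses):
                continue  # this CPU has nothing to do in this half-cycle
            op = accesses[cycle]
            if op[0] == "w":
                ram[op[1]] = op[2] & 0xFF
            else:
                reads[cpu].append(ram[op[1]])
    return reads
```

In real hardware the RAM has to complete an access within half a cycle for this to work.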
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Multiprocessing on FPGA using dual-port RAM (pipedream)

Post by BigEd »

Oh, yes, understood - you were talking about discrete designs. I was just thinking that the same structure could be done on FPGA too.