Forgive me for not making this 6502/65816 specific; however, I don't know where else to post this to solicit design review feedback from people I actually trust on the matter.
This message is primarily a request for Garth Wilson and/or Andre Fachat to review my work; however, I know others who frequent this forum have relevant experience as well. I'd like to hear any feedback on the following design ideas.
As folks might already know, I've been working on building my own home computer for some time. It started with the 65C816-based Kestrel-1, then moved to a custom CPU design on an FPGA with the Kestrel-2, and I'm changing things up again for the Kestrel-3 (this time using a 64-bit RISC-V compatible CPU). Due to unfortunate circumstances combined with a dash of impatience, instead of using a COTS FPGA development board with millions of gates equivalent for the Kestrel-3, I've decided I want to try building a computer using a backplane, similar in spirit to the RC2014 Z-80-based computer (
https://www.tindie.com/products/Semacht ... puter-kit/), or Andre's own Caspaer (
http://6502.org/users/andre/csa/index.html#caspaer). This is motivated in part by the "Big FPGA" companies' failure to make reliable software that I can run without much headache, and by the fact that the open-source toolchain, yosys and IceStorm, currently works only with Lattice iCE40-based parts, which are significantly smaller than the usual Xilinx or Altera FPGAs found in dev boards. Thus, I need to decompose the Kestrel into two or three FPGA chips, each doing its own thing.
Ideally, I would place all two or three of these chips on a single circuit board; however, I determined that going that route will not support rapid turn-around or minimal financial expense while I'm still learning how to work with FPGAs at this level. I fully envision the case where I hack on an FPGA circuit, ruin the motherboard, and need to order another batch. Since this will be relatively expensive ($5/sq. in. for a 2-layer board, $10/sq. in. for a 4-layer board), I want to minimize the size of the PCB for each functional unit under development. This just screams backplane. I figure once I have a working constellation of cooperating PCBs, I can "cost reduce" the design into a single board with much greater confidence and probability of success.
My plan for the backplane is a 16 sq. in. PCB with room for four DIN 41612 sockets. The vast majority of the pins are bussed together; only a small number are not (which I explain below). The pin-out follows. Note that undocumented pins MAY NOT be bussed, and cards ARE NOT to be connected to them. That way, I can define uses for them later with reduced concern for backward compatibility.
Code:
        A       B     C
 1      D0      +5V   WE
 2      D1      +5V   A1
 3      D2      +5V   A2
 4      D3      +5V   A3
 5      D4      +5V   A4
 6      D5      +5V   A5
 7      D6      +5V   A6
 8      D7      +5V   A7
 9      D8      GND   A8
10      D9      GND   A9
11      D10     GND   A10
12      D11     GND   A11
13      D12     GND   A12
14      D13     GND   A13
15      D14     GND   A14
16      D15     GND   A15
17      50MHz   GND   A16
18      RESET   GND   A17
19      CDONE   GND   A18
20      --      GND   A19
21      --      GND   A20
22      --      GND   A21
23      --      GND   A22
24      SEL0    GND   A23
25      SEL1    +5V   A56
26      ACK     +5V   A57
27      STB     +5V   A58
28      CYC#    +5V   A59
29      CYCA    +5V   A60
30      BCL#    +5V   A61
31      BGO     +5V   A62
32      BGI     +5V   A63
Note that a 50MHz reference clock exists to synchronize bus transactions; specifically, all transitions on the bus happen on the rising edge of the 50MHz clock. Personally, I also intend to drive my FPGAs with this reference clock. On paper, this bus should be capable of 100MBps data transfer performance (16-bit data path at 50MHz); however, I doubt I'll ever see that in reality. I only need 25MBps of throughput to feed the video circuits fast enough. (Both the CPU and the video hardware compete for memory access using the bus arbitration mechanism.)
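For the curious, here's the back-of-the-envelope arithmetic behind those figures (a little Python sketch, purely illustrative; nothing here is measured):
Code:
# Where the 100MBps paper figure comes from -- illustrative only.
bus_clock_hz   = 50_000_000     # shared 50MHz reference clock
bytes_per_beat = 2              # D0-D15 is a 16-bit data path

peak_bandwidth = bus_clock_hz * bytes_per_beat   # 100,000,000 B/s = 100MBps
video_demand   = 25_000_000                      # stated video requirement

print(f"peak: {peak_bandwidth / 1e6:.0f} MB/s")
print(f"video uses {video_demand / peak_bandwidth:.0%} of peak")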
Pins which are NOT bussed are:
* CDONE -- driven high only when all FPGAs on the card have been configured. RESET is the logical NAND feedback of all CDONE pins. (I.e., if any one CDONE pin is low, RESET is high.)
* CYC# -- driven low only when the card wants to start a transfer cycle on the bus. Similarly, CYCA (Cycle Announce) is the NAND of all CYC# pins. Note that a card must drive CYC# low only when it has permission to. The reason this isn't open-drain is that I need CYCA to respond within a single 50MHz cycle.
* BGI, BGO -- these form a daisy-chained, decentralized, round-robin bus arbitration mechanism. When RESET is asserted, all cards must drive BGO low (the backplane will drive BGI of the left-most card high). A card "has permission" to drive CYC# if, and only if, BGI XOR BGO = 1. If a card doesn't want the bus (anymore), it just passes BGI to BGO for the benefit of the next card. The right-most plug's BGO pin is connected to the left-most plug's BGI pin through an inverter. Unoccupied slots require a jumper from BGI to BGO (like VMEbus). (A quick simulation sketch of this rotation follows after this list.)
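To sanity-check the ring for myself, here's a quick behavioral model in Python. Names and structure are purely illustrative -- the real thing is HDL in the FPGAs, and a real master would finish its in-flight cycle before yielding to BCL# -- but it captures the NAND feedback lines and the XOR grant rule:
Code:
# Behavioral sketch only -- names are illustrative, not part of the design.

def backplane_reset(cdone_pins):
    # RESET is the NAND of all CDONE pins: high while any FPGA is unconfigured.
    return 0 if all(cdone_pins) else 1

def cycle_announce(cyc_n_pins):
    # CYCA is the NAND of all CYC# pins: high while any card drives CYC# low.
    return 0 if all(cyc_n_pins) else 1

class Card:
    def __init__(self, name):
        self.name = name
        self.bgi = 0
        self.bgo = 0            # all cards drive BGO low at RESET
        self.wants_bus = False

    def has_grant(self):
        # A card may drive CYC# low only while BGI XOR BGO == 1.
        return (self.bgi ^ self.bgo) == 1

    def tick(self, bcl_asserted=False):
        # On each rising edge: if the card holds the grant but no longer
        # wants the bus (or is yielding to a BCL# request), it passes BGI
        # through to BGO, handing the grant to the next card downstream.
        if self.has_grant() and (not self.wants_bus or bcl_asserted):
            self.bgo = self.bgi

def propagate(cards):
    # Backplane wiring: each BGO feeds the next slot's BGI; the right-most
    # BGO returns to the left-most BGI through an inverter.  An unoccupied
    # slot would just be a BGI-to-BGO jumper here.
    cards[0].bgi = 1 - cards[-1].bgo
    for i in range(1, len(cards)):
        cards[i].bgi = cards[i - 1].bgo

cards = [Card(f"slot{i}") for i in range(4)]
for cycle in range(8):              # a handful of 50MHz ticks
    propagate(cards)
    holders = [c.name for c in cards if c.has_grant()]
    print(cycle, holders)           # exactly one holder per tick, rotating
    for c in cards:
        c.tick()
In this little model exactly one card holds the grant on each tick, and the grant rotates around the ring, which matches what I expect from the XOR rule.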
Only BCL# is open-drain/open-collector. It is used by a card that *really* wants the bus and prefers not to wait its usual turn. It's a polite request to the current bus master to cut its current tenure short if it can. All other pins are actively driven or are three-state in nature. This bus follows Wishbone bus semantics, modified as appropriate to support chip-to-chip and card-to-card interconnects. The Wishbone specs are fully open and easily accessible via Google.
You may be wondering what happened to A24-A55 -- those are not exposed on the bus, in part because I just don't need them. My goal is to support up to 16MB of video memory (hence A1-A23), and most other peripherals are substantially smaller than that. I do use the upper-most byte for address decoding, though. The goal of this project is two-fold: (1) to help me learn FPGA-based construction techniques, not so much to realize a finished, commercializable product; and (2) to get real hardware working by January 2017, in time for the next RISC-V Workshop.
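For concreteness, here's roughly how I picture a card decoding its addresses (Python sketch; the device-ID value and my reading of SEL0/SEL1 as byte-lane selects are working assumptions, not settled parts of the design):
Code:
# Illustrative decoding sketch -- the device ID and mask are placeholders.
MY_DEVICE_ID = 0x02          # example value assigned to this card

def card_selected(addr64):
    # Compare the upper-most address byte (A56-A63) against this card's ID.
    return ((addr64 >> 56) & 0xFF) == MY_DEVICE_ID

def local_offset(addr64):
    # A1-A23 select up to 16MB within the card; on a 16-bit bus there is
    # no A0 -- SEL0/SEL1 pick the byte lane(s) within the 16-bit word.
    return addr64 & 0x00FFFFFE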
Interrupts are handled through a "message-signalled interrupt" mechanism, where a controller takes control of the bus and issues a memory write to a special, well-known memory location. The CPU card would monitor the bus for these writes and, upon seeing one, issue a local interrupt.
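In other words, something like the following (Python sketch; the doorbell address and the meaning of the payload are placeholders I haven't pinned down yet):
Code:
# Sketch of the CPU card's bus monitor -- address and payload meaning are
# placeholders, not final values.
MSI_DOORBELL = 0xFF00000000000000    # hypothetical well-known write target

class MsiMonitor:
    def __init__(self):
        self.irq_pending = False
        self.last_message = None

    def observe_write(self, addr, data):
        # Called for every write cycle the CPU card sees on the backplane
        # (WE high during an acknowledged STB/CYC transfer).
        if addr == MSI_DOORBELL:
            self.last_message = data     # e.g., the interrupting device's ID
            self.irq_pending = True      # raise a local interrupt to the CPU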
Anyway, I have 16 positive supply rails and 16 grounds. The positive supply is 5V, but the rest of the logic in the system runs at 3.3V (FPGA cores run at 1.2V or less). Since an FPGA circuit might need up to four supplies (typically 3.3V, 1.8V, 1.2V, and/or 0.9V), I decided to stick with 5V and let each card regulate its own supplies. This is more power-hungry, BUT more flexible, and it helps R&D efforts. I placed the supply and ground pins in column B of the DIN connectors in an effort to help manage high-frequency signal aberrations.
Wow, that was a lot, and it may seem like rambling. But for those with experience designing multi-drop bus systems, I would love to be made aware of any pitfalls before moving forward with this design, particularly with respect to power and ground rails or signal integrity. Thanks in advance!