
All times are UTC




Posted: Wed Mar 06, 2019 7:05 am

Joined: Sat May 26, 2018 1:00 am
Posts: 26
Location: Riverton, UT
It's been a while since I've posted here and time changes many things, including my 65C816 computer design :)

I was going to use an FPGA for the address/data demuxing and address decoding, along with video, sound, UART, and PS/2 keyboard support. I have since decided that I might be biting off more than I can chew and way overcomplicating things. I still plan on using the FPGA for video, sound, keyboard, etc., but I have decided to move the address demuxing and decoding out of the FPGA. This will allow me to get up and running without the FPGA in the design and should make it easier for me to modularize things a bit. Trying to manage all of that in the FPGA was getting pretty unwieldy for a noob like me. I have also decided to move from 3.3V to 5V to take advantage of some additional timing headroom and will access the 3.3V FPGA through some level shifters (using the TI TXB0108PWR).

I have also settled on an approx. 7.14 MHz clock speed, giving me a 140ns clock period. Without the FPGA in the system I will use a 7 MHz oscillator. (I'll actually start with a 1 MHz oscillator and work my way up if things are looking good.) Once I add the FPGA into the system, it will generate a clock with a 140ns period derived from its internal 100 MHz clock.
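The clock figures above can be sanity-checked with a few lines of Python. This is only a sketch: it assumes the FPGA derives PHI2 by dividing its 100 MHz clock by a whole number, which the post does not actually state.

```python
# Clock arithmetic for the figures quoted above.
# Assumption: PHI2 is produced by an integer divide of the FPGA's 100 MHz clock.

FPGA_CLOCK_HZ = 100_000_000   # internal FPGA clock, from the post
TARGET_PERIOD_NS = 140        # desired PHI2 period, from the post

fpga_period_ns = 1e9 / FPGA_CLOCK_HZ                 # length of one FPGA clock in ns
divider = round(TARGET_PERIOD_NS / fpga_period_ns)   # whole FPGA clocks per PHI2 cycle
phi2_hz = FPGA_CLOCK_HZ / divider                    # resulting PHI2 frequency

print(f"divider = {divider}")
print(f"PHI2 = {phi2_hz / 1e6:.3f} MHz")
```

Under that assumption a 140ns period corresponds to a divide-by-14, which is where the "approx. 7.14 MHz" figure comes from.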

To do my address/data demuxing I am planning on using the circuit described in the '816 datasheet, using a 74AC245 and 74AC573. I have also planned out my address decoding and was wondering if somebody would be willing to give it a quick look and tell me if there are any gotchas in my design and if any of the signals might need some kind of gating with PHI2. This is just generating chip selects for the RAM, ROM, and I/O. It looks like the timing for RAM access will be really close, but I should have the RAM enable valid with about 2ns to spare. I have simulated the logic using Logisim and the logic itself does what I want, but Logisim does not simulate realistic propagation delays.
I am including an annotated screenshot from Logisim. I have added the max propagation times next to the components, and the signal valid times are given relative to the falling edge of PHI2. I am going to use SRAM and Flash with 55ns access times.
Attachment: AddrDecode.png

One thing to note is that when I add in the FPGA video functionality it will share the bus with the CPU for memory access by pulling RDY and BE low to halt the CPU when it wants to access the bus. The timing of that will be synchronized with PHI2 with RDY going low about 20ns before the falling edge of PHI2 and BE going low with the falling edge. None of this is in the screenshot, just providing some context. All of the address decoding is after the address/data demuxing and will be active while the CPU is halted with the exception of the I/O address space. Access to I/O is only allowed for the CPU which is why the BE signal is included in the screenshot.

I also have a couple of general questions. I plan on using 74ACxx logic wherever possible but may need to resort to HC parts when AC is not available. Is it OK to mix AC and HC? I have seen the discussions regarding TTL vs HC and HCT, but nothing specifically about mixing HC and AC. I would think it would be fine to do that since they are both CMOS parts.

Another question is in regard to the 65xx support chips (6522, etc.) and their CS signals. Garth Wilson has mentioned many times that they need to be asserted before the rising edge of PHI2 or they don't work (the datasheet for the 65C22 says 10ns setup time). My decoding does not handle that. What would be a recommendation for that? I need to qualify the I/O chip selects with the BANK0 signal to constrain I/O to the first 64 KB of memory and I just can't figure out a way to do that more quickly. There is a good possibility that it would work fine, since I am designing for max prop delays, and when I figure out timing based on typical prop delays my timing is all good.

Thanks Guys.


Posted: Wed Mar 06, 2019 10:47 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
You should consider using multi-input (3 or 4) gates, or an 8-bit comparator chip, instead of a tree of 2-OR gates, to generate your BANK0 signal. I don't think there's an 8-OR in the 74 series, but there is an 8-NAND and a 4-NAND; the latter could be used with a quartet of 2-NORs.

There are many other places where your logic tree could reasonably be shortened, too. To help with gate selection, consider changing the polarity of internal signals from active-high to active-low or vice versa. CMOS gates tend to be easier to build in inverting (NAND, NOR) than non-inverting (AND, OR) form; the latter requires adding an inverter to the output of a naturally inverting gate, which makes it slower.


Posted: Wed Mar 06, 2019 12:23 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Nice simplifying idea to keep the FPGA for only the complex and optional parts. It helps scale back the minimum viable design. Good step - hope to see updates as you make progress!


Posted: Wed Mar 06, 2019 1:16 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
bradleystach wrote:
I have decided to move the address demuxing and decoding out of the FPGA. This will allow me to get up and running without the FPGA in the design
Good call.

And yes, it's OK to mix HC and AC parts, but of course HC is not very fast. AC parts are fast but have rather abrupt rise and fall times on their outputs, and this can present problems, especially with solderless breadboards. I suggest you use AHC or AHCT series parts as much as possible; this is newer technology. They're about as fast as AC but without the noisy rise and fall times on the outputs.

I'm sure your schematic can be simplified, for example by using wider gates as Chromatix suggests.

On the topic of signal polarity, you might be better off if you used a '563 address latch (rather than '573). The '563 has inverting outputs, which means all the outputs will be high when Bank 0 is accessed, and this might make it easier to design the decode logic.

Alternatively, the '573 / '563 question won't matter if you use a '688 or similar comparator to detect Bank 0. And, if you use its /Enable input, a '688 will let you test the state of nine signals... for example, it could test for all the Bank Address bits being low and BE being high.

Have fun, and keep us posted!

Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Wed Mar 06, 2019 3:26 pm

Joined: Sat May 26, 2018 1:00 am
Posts: 26
Location: Riverton, UT
Chromatix wrote:
You should consider using multi-input (3 or 4) gates, or an 8-bit comparator chip, instead of a tree of 2-OR gates, to generate your BANK0 signal. I don't think there's an 8-OR in the 74 series, but there is an 8-NAND and a 4-NAND; the latter could be used with a quartet of 2-NORs

I did originally have 3- and 4-input gates in the design, but when I sat down and started figuring out my timing, I found that the multi-input gates had fairly high propagation delays. For instance, the OR tree generating the BANK0 signal was originally two 4-input NOR gates and a 2-input NAND. The delay through that was 27.5ns; with the individual 2-input gates it's 22.5ns. The '688 comparator is 29ns.
When you factor in the 33ns it takes for the bank address to show up on the bus, and the 55ns I need to access memory and my target speed of 140ns cycle times, I only get 52ns to generate my chip selects.
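The budget arithmetic above can be written out as a short Python sketch. It only mirrors the numbers quoted in this post; it deliberately leaves out anything not quoted here (CPU read-data setup time, delay through the '245 buffer), so treat the result as an optimistic upper bound rather than a full timing analysis.

```python
# Timing budget for chip-select generation, using only figures quoted in the post.
CYCLE_NS = 140.0        # target PHI2 period
BANK_VALID_NS = 33.0    # bank address valid on the bus after PHI2 falls
ACCESS_NS = 55.0        # SRAM / Flash access time

# Whatever is left over is available to the decode logic.
budget_ns = CYCLE_NS - BANK_VALID_NS - ACCESS_NS

# BANK0 detector options and their quoted worst-case delays (ns).
bank0_options = {
    "two 4-input NOR + 2-input NAND": 27.5,
    "tree of 2-input gates": 22.5,
    "'688 comparator": 29.0,
}

print(f"decode budget: {budget_ns} ns")
for name, delay_ns in bank0_options.items():
    remaining = budget_ns - delay_ns
    print(f"{name}: {delay_ns} ns, leaves {remaining} ns for the rest of the decode")
```

Each BANK0 option fits inside the 52ns on its own; the open question is how much is left for the gates that follow it.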

As it is now, I am still violating the timing on my VIAs. I can certainly slow things down a bit and go with a 160ns cycle time (approx. 6.25 MHz), but I would be bummed out and it would complicate some later plans.
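To see why the VIA selects are the tight spot, here is a rough Python sketch of that budget. It assumes a 50% PHI2 duty cycle (not stated anywhere in this thread) and reuses the 33ns bank-address figure and the 10ns 65C22 chip-select setup time quoted earlier; the function name is just for illustration.

```python
# Rough budget for getting a 65C22 chip select valid before PHI2 rises.
# Assumption: PHI2 is low for half of the cycle (50% duty), which is a guess.

def via_decode_budget_ns(cycle_ns, bank_valid_ns, cs_setup_ns, duty_high=0.5):
    """Time left for decode logic between bank-address-valid and the CS deadline."""
    phi2_low_ns = cycle_ns * (1.0 - duty_high)       # length of the PHI2-low phase
    return phi2_low_ns - bank_valid_ns - cs_setup_ns

budget_140 = via_decode_budget_ns(cycle_ns=140, bank_valid_ns=33, cs_setup_ns=10)
budget_160 = via_decode_budget_ns(cycle_ns=160, bank_valid_ns=33, cs_setup_ns=10)

print(f"140ns cycle: {budget_140} ns for the whole I/O decode path")
print(f"160ns cycle: {budget_160} ns for the whole I/O decode path")
```

Under those assumptions the 140ns cycle leaves 27ns for the entire I/O decode path, of which the 22.5ns BANK0 tree alone uses most, while a 160ns cycle leaves 37ns.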

Now, I am not an EE, as you can probably tell (software pays my bills :D and this is just a hobby), and reading the data sheets for these parts is a bit of black magic I have certainly not mastered. I am figuring out my timing by going off of max propagation times. Am I being overly conservative? I always thought planning for the worst case was the best practice.

Thanks guys.


Posted: Wed Mar 06, 2019 3:48 pm

Joined: Sat May 26, 2018 1:00 am
Posts: 26
Location: Riverton, UT
Dr Jefyll wrote:
Alternatively, the '573 / '563 question won't matter if you use a '688 or similar comparator to detect Bank 0. And, if you use its /Enable input, a '688 will let you test the state of nine signals... for example, it could test for all the Bank Address bits being low and BE being high.


Won’t I still need some kind of latch to hang on to the bank address during PHI2 high when data gets driven to the bus?

Also, as far as AC parts being noisy, there will be no breadboards in this design. I test some things out on breadboards at low speeds, but this will end up on 4 layer PCBs, with ample bypassing at each chip so I think the noise should be manageable. Although, using HCT/ACT would open up some possibilities as far as using some LS here and there in other parts of the computer. Will have to look into that. Although I’m not sure I want the added power budget for LS.


Posted: Wed Mar 06, 2019 8:16 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
bradleystach wrote:
Won’t I still need some kind of latch to hang on to the bank address during PHI2 high when data gets driven to the bus?
Yes. Maybe I didn't explain very well. Let's say you use a '573 to latch the Bank Address, which is the usual way of doing things. And we think of the latch outputs as being address lines A23 - A16.

I'm just saying that the '688 ('521 is another version) can be used as a nine-input gate. You can arrange a '688 or '521 so its output will go low only when BE is high and A23 - A16 are all low. It seems to me that might help simplify your circuit.
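As an illustration of the nine-input idea, here is a small Python model. The comparator function follows the usual '688 / '521 behaviour (active-low output asserted when /G is low and P equals Q). The wiring in `bank0_and_be` is only one hypothetical arrangement chosen for this sketch, not necessarily the one intended above: A16..A22 go to P0..P6 with Q0..Q6 tied low, BE goes to P7 with Q7 tied high, and A23 drives /G.

```python
def comparator_688(p, q, g_n):
    """Model of a '688-style 8-bit comparator; returns the active-low output.

    The output is 0 (asserted) only when /G is low and P equals Q bit-for-bit.
    """
    return 0 if (g_n == 0 and (p & 0xFF) == (q & 0xFF)) else 1


def bank0_and_be(bank, be):
    """Hypothetical wiring: A16..A22 -> P0..P6, BE -> P7, A23 -> /G, Q = 0x80."""
    p = (bank & 0x7F) | ((be & 1) << 7)   # low seven bank bits plus BE on P7
    q = 0x80                              # expect bank bits low and BE high
    g_n = (bank >> 7) & 1                 # A23 must be low to enable the compare
    return comparator_688(p, q, g_n)


# The output goes low only for bank $00 with BE high.
print(bank0_and_be(0x00, 1), bank0_and_be(0x00, 0), bank0_and_be(0x80, 1))
```

The point of the sketch is that all nine signals are tested in a single comparator delay instead of a multi-level gate tree.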

Quote:
using HCT/ACT would open up some possibilities as far as using some LS here and there
HCT, ACT and (better yet) AHCT are one way to handle the problem of accepting a signal which is at TTL levels. But IMO it's better to just avoid the problem in the first place -- IOW, don't use chips whose outputs produce TTL levels!

The only exceptions would be chips that have something special to offer, such as extreme speed, or a function which is not available in more modern logic families. (And LS chips generally don't offer anything special.)

A better example would be the comparator function I mentioned above. The fastest version in Digikey's database is the 74FCT521, whose max prop delay is 4.5 ns. :shock: But its output conforms to TTL voltage levels.



Posted: Wed Mar 06, 2019 11:25 pm

Joined: Sat May 26, 2018 1:00 am
Posts: 26
Location: Riverton, UT
Thanks for the suggestions everyone. I love the level of knowledge and experience on this forum and the willingness of everyone to help and be supportive, and to be patient with the dummies like me :)

I have one more question about figuring out my signal propagation delays. I have been adding up the max delays for each of the components along my signal paths to get an idea of what my times might be.

I understand that the times are determined primarily by temperature and capacitive loading for a given voltage and the max figures in the datasheets are generally for higher temperatures and higher capacitive loads, worst case scenario kinda stuff. Am I being too conservative in figuring out my signal delays? Most of my gates don't have extreme fan outs (well the bank0 signal might) and all of these components are going to be relatively close on the boards so I don't think trace lengths will be too big of an issue. Since this system is going to basically be a toy sitting on my desk and not sitting in some kind of industrial setting should I be a little less conservative?

The reason I'm asking is that I have looked at the schematics for the SXB '816 dev board sold by WDC and noticed that all of the components where they call out part numbers in the schematic are just 74HCxx parts, and I imagine that the other bits of glue logic are HC or AC as well. When figuring out what kind of delays they would see, based on the parts I know and a semi-educated guess as to what the others would be, it seems that they would have the same timing problems at 8 MHz that I think I have. Obviously their boards work, or I don't think they would be selling many, and I haven't heard anything about them being unstable. Any guidance on the best way to figure these things out? It sure would be great to be able to go back to the nice and easy three- and four-input gates I had ruled out earlier.


Posted: Thu Mar 07, 2019 6:56 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Ultimately, it's a judgement call. As you say, for a volume product and for someone with a reputation, using worst case figures is the safe thing to do. In most circumstances, it's too safe, and you can do more with less, at some risk.

The two things you can control are voltage and temperature. You're unlikely to be at worst case values for either, and for a hobby project, if you need to wait for the heatwave to pass or need to tweak your power supply, that's fine.

The other variable is risk and reliability. You're not selling, you don't have volume, you don't plan to run a life-critical system on your design. So again, you can take some margin for that.

I don't know how much difference the above makes. Perhaps a factor of two. One argument is that if you start by building an unsafe design, you give yourself a debugging challenge, you possibly end up with unreliability, and you might end up discouraged. Another argument is that you can learn more by failing.


Posted: Sat Mar 09, 2019 6:15 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
This is an interesting problem to play with, and this morning here's what I came up with. :)
Attachment: AddrDecode variation A.png

It helps to think of the entire 16MB space as being RAM by default. From that POV, the goal is to detect anything that isn't RAM.... and that's what the '688 / '521 comparator does. The comparator output is low (and the ROM /CS is active) for the region $8000-$FFFF.

I didn't draw any RAM chip-select logic, but you can use any scheme you like -- there's no requirement for the RAM chip-select logic to pay attention to the comparator. Yes there's still a need to avoid bus contention, but that's done by feeding the comparator output into the RAM output-enable logic. (It could also be fed into the write-enable logic, which would prevent the CPU from writing to the RAM "underneath" the ROM. But preventing those writes may be unimportant, as the CPU can't read from that area of RAM.)

To achieve a faster, simpler circuit I altered the original mapping specification. You still get chip-selects for three I/O devices occupying $400 bytes each and four I/O devices occupying $100 bytes each (as with "I/O memory map" in your diagram). But originally the I/O space was a segment taken from the top of the 0-$7FFF space otherwise occupied by RAM. Now it is a segment taken from the bottom of the $8000 - $FFFF space otherwise occupied by ROM. The amount of usable ROM space is decreased somewhat, but the payoff is simpler logic -- the comparator output gets used to enable both ROM and I/O.

Just as the RAM space has a portion removed to serve as ROM, so too the ROM space has a portion removed to serve as I/O. That's determined by a five-input gate fed with BE, A14, A13, A12 and the comparator output. The gate could be a suitably connected 74_138; or, a 74F260 may be somewhat faster -- this is shown in the alternate schematic (appended). A 74CBT3251 would be faster still, but that's tricky to explain. Do you really need to use BE in this logic? What would be the harm if I/O were visible while the FPGA has control of the address bus?
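The remapped scheme can be modelled in a few lines of Python as a way of checking the address ranges. This is a sketch built from the description above, not a transcription of the schematic. It assumes the comparator is asserted for bank 0 with A15 high ($008000-$00FFFF), that the I/O window is the bottom $1000 bytes of that range (A14, A13 and A12 all low, which matches 3 x $400 + 4 x $100), and that with BE low the window simply falls back to ROM. The last point is a guess.

```python
def decode(addr, be=1):
    """Classify a 24-bit address as "RAM", "ROM" or "IO" under the stated assumptions."""
    bank = (addr >> 16) & 0xFF
    a15 = (addr >> 15) & 1

    # Comparator: asserted only for bank 0 with A15 high ($008000-$00FFFF).
    comparator_hit = (bank == 0) and (a15 == 1)
    if not comparator_hit:
        return "RAM"                      # everything else is RAM by default

    # Five-input gate: BE high, A14..A12 low, comparator asserted.
    a14_12 = (addr >> 12) & 0b111
    if be and a14_12 == 0:
        return "IO"                       # $008000-$008FFF
    return "ROM"                          # rest of $008000-$00FFFF (assumed when BE is low too)


for a in (0x000000, 0x007FFF, 0x008000, 0x008FFF, 0x009000, 0x00FFFF, 0x018000):
    print(f"${a:06X} -> {decode(a)}")
```

Dropping BE from the gate, as the question above suggests, would correspond to calling `decode` with `be=1` at all times.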

I used two I/O decoders operating concurrently because it's faster than the other solution where the delays are consecutive:
Attachment: AddrDecode excerpt.png

cheers
Jeff


Attachment: AddrDecode variation B.png
