PostPosted: Thu Sep 05, 2024 6:46 pm 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8387
Location: Midwestern USA
——— POC V1.5 ———

In my never-ending quest to get my contraptions to run faster and do more, I’ve been mulling several ways of progressing.  One thing I’ve been wanting to do is build a unit with more than the 128KB of RAM that is in POC V1.3 and V1.4.  My immediate goal is to move up to 512KB, which can be done with a single SRAM, but with a little more complication in the glue logic.

In a 65C816 system with more than 64KB of RAM, a simplistic glue logic setup will likely result in ROM and I/O being mirrored in higher banks, which is usually undesirable.  In my designs, ROM and I/O appear in bank $00, starting at $00C000 (I/O), $00D000 (ROM), and ending at $00FFFF.  Lacking proper glue logic, those items would also appear at $01C000, $01D000, $02C000, $02D000, etc.  If mirroring is to be avoided, the logic has to know when A16-A23 is not $00.

In a 512KB system, bank $00 detection would be implemented with logic that performs the equivalent operation of BNK0 = !(A16 | A17 | A18), in which | represents logical OR.  BNK0 will be true only when the effective address is $00xxxx.  A triple-input NOR gate would have to be hooked up to the three address lines coming out of the transparent latch used to capture A16-A18, with the gate’s output fed back into the glue logic to tell it when the address is $00xxxx.  Simple enough...until one considers the timing.
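
As an illustration only, the bank $00 test can be modelled in a few lines of Python.  This is a behavioural sketch of the NOR equation above, not the actual glue logic; the function names are made up for the example.

```python
def bnk0(a16: int, a17: int, a18: int) -> bool:
    """High-true bank $00 detect: NOR of the three latched bank bits."""
    return not (a16 or a17 or a18)


def bnk0_from_address(addr: int) -> bool:
    """Apply the same test to a 19-bit effective address ($000000-$07FFFF)."""
    a16 = (addr >> 16) & 1
    a17 = (addr >> 17) & 1
    a18 = (addr >> 18) & 1
    return bnk0(a16, a17, a18)


if __name__ == "__main__":
    for addr in (0x00C000, 0x01C000, 0x02D000, 0x07FFFF):
        print(f"${addr:06X} -> BNK0 = {bnk0_from_address(addr)}")
```

Only addresses of the form $00xxxx return True, which is the condition the decoder needs in order to keep ROM and I/O out of the higher banks.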

The problem is the A16-A18 address component doesn’t immediately appear during Ø2 low, as the latch adds its propagation delay to the delay that occurs from the fall of the clock to when the 816 emits a valid address (the tADS spec).  Added to that will be the prop delay of the above-mentioned NOR gate before the glue logic actually knows if the address is, or is not, $00xxxx.

My timing analysis indicates that the “it won’t work” Ø2 threshold with this arrangement is around 18 MHz, using discrete 74AC or 74AHC logic.  Allowing for variations in prop delay that are inevitable from one part to another, I projected that 16 MHz would be the practical Ø2 “ceiling,” something that has been borne out by real-world testing.  So I needed to find another way.

A while back, I had mentioned the possibility of building POC V1.5 with discrete logic that could support 512 KB of RAM.  I decided it would take too many gates (equating to soldering a lot of SOIC parts) and might not perform any better than V1.3, which is stable at 16 MHz.  Given that, I have concocted a substantially different design that consolidates glue logic and bank bits latching in two GALs, appropriately named GAL1 and GAL2.  :D

Here’s the memory map this new contraption will set up:

    Code:
    000000-00BFFF — base RAM (48 KB)
    00C000-00C37F — Input/output (1 KB)
    00C380-00C3FF — not decoded
    00C400-00CFFF — RAM (3 KB)
    00D000-00FFFF — ROM (12 KB)
    010000-07FFFF — extended RAM (448 KB)

Although the “island” of RAM created at $00C400-$00CFFF will be globally accessible (there is no memory protection), my intended use for it will be to give the firmware and/or operating system a private storage area for its direct page, MPU stack and other needed work space.  By doing so, the entire 48 KB of base RAM will be available for user programs, with the firmware or OS transparently remapping direct page and the stack during system API calls.
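
For anyone who wants to sanity-check the map, here is a small Python sketch that looks up the region for a given address and verifies the stated sizes.  The table is copied from the map above; the names MEMORY_MAP and region() are just for this example and do not come from the design files.

```python
# (first address, last address, description), copied from the memory map above
MEMORY_MAP = [
    (0x000000, 0x00BFFF, "base RAM"),
    (0x00C000, 0x00C37F, "I/O"),
    (0x00C380, 0x00C3FF, "not decoded"),
    (0x00C400, 0x00CFFF, "RAM"),
    (0x00D000, 0x00FFFF, "ROM"),
    (0x010000, 0x07FFFF, "extended RAM"),
]


def region(addr: int) -> str:
    """Return the name of the region that contains addr."""
    for first, last, name in MEMORY_MAP:
        if first <= addr <= last:
            return name
    raise ValueError(f"${addr:06X} is outside the 512 KB map")


def size_kb(name: str) -> float:
    """Size of the named region in KB (1 KB = 1024 bytes)."""
    for first, last, entry in MEMORY_MAP:
        if entry == name:
            return (last - first + 1) / 1024
    raise KeyError(name)


if __name__ == "__main__":
    print(region(0x00C400), size_kb("RAM"))  # the private RAM "island"
    print(region(0x01C000))                  # no I/O mirror in bank $01
```

Note that $01C000 resolves to extended RAM rather than I/O, which is the no-mirroring behavior the glue logic is meant to enforce.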

The I/O block will be mapped as follows:

    Code:
    00C000 — serial I/O ports ‘A’ & ‘B’, timer ‘A’
    00C080 — serial I/O ports ‘C’ & ‘D’, timer ‘B’
    00C100 — serial I/O IRQ status*
    00C180 — real-time clock
    00C200 — expansion port chip select ‘A’*
    00C280 — expansion port chip select ‘B’
    00C300 — expansion port chip select ‘C’

    *Not wait-stated.

The above I/O map is like that of POCs V1.2, V1.3 and V1.4, except for different addresses.
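
The I/O block divides into 128-byte slots, one per chip select.  The following Python sketch models that decoding and the wait-state flag.  It is a reading of the table above, not the GAL1 equations; IO_SLOTS and io_select() are names invented for the example.

```python
IO_BASE = 0x00C000
SLOT_SIZE = 0x80  # each chip select in the table covers 128 bytes

# (device, wait_stated) in slot order, taken from the I/O map above
IO_SLOTS = [
    ("serial I/O ports A & B, timer A", True),
    ("serial I/O ports C & D, timer B", True),
    ("serial I/O IRQ status", False),
    ("real-time clock", True),
    ("expansion port chip select A", False),
    ("expansion port chip select B", True),
    ("expansion port chip select C", True),
]


def io_select(addr: int):
    """Return (device, wait_stated) for an I/O address, else None."""
    offset = addr - IO_BASE
    if offset < 0:
        return None
    slot = offset // SLOT_SIZE
    if slot >= len(IO_SLOTS):
        return None  # $00C380 and up: not decoded, RAM, ROM or another bank
    return IO_SLOTS[slot]


if __name__ == "__main__":
    for addr in (0x00C000, 0x00C100, 0x00C200, 0x00C380):
        print(f"${addr:06X} -> {io_select(addr)}")
```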

In order to make all this work, I’ve assigned the two GALs the following functions:

  • GAL1 contains the address decoding logic and generates corresponding chip selects.  It also tells the clock generator when to stretch Ø2 high for the purpose of wait-stating ROM and I/O accesses.

    GAL1’s inputs are A7-A15, the aforementioned BNK0 (a low-true signal emitted by GAL2), VDA and VPA.  This GAL uses only combinatorial logic, with no pins being used as feedback nodes.  Hence it will respond to all input combinations in no more than tPD nanoseconds, tPD being the advertised pin-to-pin prop time of the device.
     
  • GAL2 serves multiple purposes:

    • Latch A16-A18 during Ø2 low;
    • Produce fully-qualified /RD (read data) and /WD (write data) control signals;
    • Aggregate three separate IRQ inputs into a single IRQ output;
    • Generate the BNK0 signal needed by GAL1 to make its logic decisions.

    GAL2’s inputs are Ø1, D0-D2 (which are A16-A18 during Ø2 low), RWB, VDA, VPA, IRQA, IRQB and IRQC.  This GAL uses both combinatorial and registered logic, but should still perform at tPD due to no pin feedback nodes being used.
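
A rough behavioural model of GAL2’s jobs, in Python, may help make the list above concrete.  Several things here are assumptions rather than facts from the design files: the read/write qualification shown (Ø2 high, with VDA or VPA true) is only one common way of doing it, the IRQ inputs are assumed to be low-true, BNK0 is shown high-true for readability, and the model takes Ø2 directly instead of Ø1.

```python
class Gal2Model:
    """Behavioural sketch of GAL2; not the actual GAL equations."""

    def __init__(self):
        self.bank = 0  # latched A16-A18

    def step(self, phi2, d_bus, rwb, vda, vpa, irqa_n=1, irqb_n=1, irqc_n=1):
        # Transparent latch: follow D0-D2 while the clock is low,
        # hold the last value while it is high.
        if not phi2:
            self.bank = d_bus & 0b111

        valid = bool(vda or vpa)  # assumed qualification term
        return {
            "bank": self.bank,
            "bnk0": self.bank == 0,                         # true in bank $00
            "rd_n": not (phi2 and rwb and valid),           # low-true /RD
            "wd_n": not (phi2 and (not rwb) and valid),     # low-true /WD
            "irq_n": bool(irqa_n and irqb_n and irqc_n),    # low if any IRQ low
        }


if __name__ == "__main__":
    gal2 = Gal2Model()
    # Clock low: the 816 is emitting the bank address on the data bus.
    print(gal2.step(phi2=0, d_bus=0x03, rwb=1, vda=1, vpa=0))
    # Clock high: the bus now carries data, the latch holds bank 3.
    print(gal2.step(phi2=1, d_bus=0xEA, rwb=1, vda=1, vpa=0))
```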

As GAL1 depends on an output from GAL2 (its !BNK0 signal), both devices need to be fast in order to support high Ø2 rates.  My choices for the GAL are Microchip’s ATF22V10C-7PX, basically the venerable 22V10, but with better tPD, or their ATF750C-7PX, which may be described as a 22V10 on steroids: the 750C has about 40 percent more gates than the 22V10.  The two types are electrically interchangeable, but require somewhat different programming methods when using registered logic due to, among other things, the 750C’s more-flexible clocking.

Both parts have a 7.5ns tPD rating, which, in my implementation, should be achievable due to no pins being used as feedback nodes.  Hence the worst-case elapsed time from when the 816 emits an address to when a chip select has been generated will be 15ns (two GALs in series at 7.5ns each), a level of performance that I cannot consistently achieve with the fastest equivalent discrete logic.  15ns total prop delay should result in stable operation at 20 MHz.
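
The arithmetic behind that figure can be laid out as a short Python sketch.  It assumes a 50 percent duty cycle and treats the Ø2 low phase as the window in which the address and chip selects must settle, which is a simplification.  The tADS value is deliberately left as a parameter to be filled in from the 65C816 data sheet; the function names are invented for the example.

```python
def phi2_low_ns(f_mhz: float, duty_high: float = 0.5) -> float:
    """Length of the clock-low phase in ns (50% duty cycle assumed)."""
    period_ns = 1000.0 / f_mhz
    return period_ns * (1.0 - duty_high)


def decode_chain_ns(tpd_ns: float = 7.5, stages: int = 2) -> float:
    """Worst-case delay through GAL2 (BNK0) and then GAL1 (chip select)."""
    return tpd_ns * stages


def slack_ns(f_mhz: float, t_ads_ns: float,
             tpd_ns: float = 7.5, stages: int = 2) -> float:
    """Time left in the low phase after tADS and the decode chain."""
    return phi2_low_ns(f_mhz) - t_ads_ns - decode_chain_ns(tpd_ns, stages)


if __name__ == "__main__":
    print("decode chain:", decode_chain_ns(), "ns")
    for f in (16, 20, 25):
        print(f"{f} MHz: clock-low phase = {phi2_low_ns(f):.2f} ns")
```

At 20 MHz the low phase is 25ns, so the 15ns decode chain leaves 10ns to cover tADS and any other margin.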

V1.5 also incorporates my second-generation, stretchable Ø2 clock circuit that is capable of at least 40 MHz operation.  I’m shooting for a top speed of 20 MHz in V1.5, but may try to go faster if it works.

I should note that observations of a couple of ATF22V10Cs I have here indicate that these devices run about 25 percent faster than guaranteed.  While that isn’t something that should be relied upon (minor changes in voltage and/or temperature could slow things down), it might mean that I could get POC V1.5 running above 20 MHz.  The clock generator circuit is designed so an extended wait-state may be configured to handle above-20 MHz operation.

Attachments:
    poc_v1.5.pdf [338.42 KiB] (POC V1.5 Schematic)
    logic.zip [3.92 KiB] (Glue Logic Design Files)
    atf22v10c.pdf [1.87 MiB] (Microchip 22V10 GAL)
    atf750c.pdf [491.27 KiB] (Microchip ATF750C “Super GAL”)

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sun Sep 15, 2024 6:50 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8387
Location: Midwestern USA
I’ve got a PCB layout done for POC V1.5. The board dimensions are 6-1/2" × 4", and it is in four layers.  I found that by positioning the SRAM and the data bus transceiver east-west, it was easier to make the connections without an excessive number of vias.  I also tinkered a bit with the schematic to add numerous test points that I can use with the logic probe, scope and/or logic analyzer to observe circuit operation.

Attachments:
    poc_v1.5_sch.pdf [348.28 KiB] (POC V1.5 Schematic)
    pocV1.5_pcb.gif [721.15 KiB] (POC V1.5 Printed Circuit Board)


Last edited by BigDumbDinosaur on Sun Sep 15, 2024 7:25 pm, edited 1 time in total.

PostPosted: Sun Sep 15, 2024 7:10 am 
Joined: Mon Jan 19, 2004 12:49 pm
Posts: 840
Location: Potsdam, DE
Nice board layout, BDD. I assume the inner layers are the power.

But so many pullups? Doesn't the GAL pull to the rail? Or is it an open collector output?

And I'm unsure about the db[0..7] outputs after the buffer from the processor... high bits of the address? I'm only vaguely familiar with the 816.

Neil


PostPosted: Sun Sep 15, 2024 7:24 pm 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8387
Location: Midwestern USA
barnacle wrote:
Nice board layout, BDD. I assume the inner layers are the power.

Thanks!

The inner layers are power and ground.

Quote:
But so many pullups? Doesn't the GAL pull to the rail? Or is it an open collector output?

A GAL has TTL-level outputs, not CMOS, with a guaranteed VOH of 2.4 volts, well below what constitutes a valid CMOS logic 1 in a 5 volt system.

In my post about the GAL test rig I built, I mentioned that I did some output voltage checks.  I discovered that while a GAL could drive its unloaded outputs a little past 4 volts, the voltage quickly deteriorated with loading.  Assuming that an unambiguous CMOS logic 1 occurs at 70 percent of VCC, that would be 3.5 volts in a 5 volt system.  A GAL’s outputs barely stay in that range under load, so there is a risk of noise-sensitivity.  Of course, “under load” when driving a CMOS input mostly means charging parasitic capacitance, since CMOS devices draw virtually no input current when the voltage is steady.  Hence the use of the pullups is mainly to assist the GAL in charging the parasitic capacitance.
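
The numbers above can be checked with a trivial Python sketch.  It uses the 70 percent of VCC rule of thumb mentioned in the text (not a figure from any particular data sheet) and works in millivolts to keep the arithmetic exact; the function names are made up for the example.

```python
def cmos_vih_mv(vcc_mv: int = 5000, percent: int = 70) -> int:
    """Rule-of-thumb CMOS logic 1 threshold in millivolts."""
    return vcc_mv * percent // 100


def margin_mv(voh_mv: int, vcc_mv: int = 5000) -> int:
    """Margin between a driver's output high level and the threshold."""
    return voh_mv - cmos_vih_mv(vcc_mv)


if __name__ == "__main__":
    print("threshold:", cmos_vih_mv(), "mV")
    print("guaranteed VOH of 2.4 V:", margin_mv(2400), "mV")
    print("unloaded output near 4 V:", margin_mv(4000), "mV")
```

A negative margin at the guaranteed VOH is the reason the pullups are there, even though real parts do better than the guarantee when lightly loaded.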

Quote:
And I'm unsure about the db[0..7] outputs after the buffer from the processor... high bits of the address? I'm only vaguely familiar with the 816.

The 65C816 multiplexes the A16-A23 address bits on D0-D7 during Ø2 low.  A transparent latch, GAL2 in this design, is used to capture and latch those bits—actually, A16-A18—and drive them onto the corresponding inputs of the SRAM.  When Ø2 goes high, there is a small amount of overlap before the 816 stops emitting A16-A23 and starts treating the data bus as a data bus.  This overlap gives the latch enough time to close on the rise of the clock before the 816 “turns around” the bus.

However, the overlap period also creates a window of opportunity for bus contention.  During a read cycle, /RD will be asserted right after the rise of the clock and the selected device will start driving the data bus, possibly while the 816 is still emitting A16-A23.  The resulting contention may inject a lot of noise into the power and ground planes due to the momentarily-high current flow.  The transceiver, which will be in the high-Z state during Ø2 low, will remain in that state during the overlap period, thus closing the bus contention window.

The bus pullups prevent floating while the transceiver is in the high-Z state.  The transceiver also acts as a level converter, as both the SRAM and ROM have TTL-level outputs, whilst the 816’s inputs are CMOS.

PostPosted: Sun Sep 15, 2024 7:35 pm 
Joined: Mon Jan 19, 2004 12:49 pm
Posts: 840
Location: Potsdam, DE
Gotcha, thanks.

