——— POC V1.5 ———
In my never-ending quest to get my contraptions to run faster and do more, I’ve been mulling several ways of progressing. One thing I’ve been wanting to do is build a unit with more than the 128KB of RAM that is in POC V1.3 and V1.4. My immediate goal is to move up to 512KB, which can be done with a single SRAM, but with a little more complication in the glue logic.
In a 65C816 system with more than 64KB of RAM, a simplistic glue logic setup will likely result in ROM and I/O being mirrored in higher banks, which is usually undesirable. In my designs, ROM and I/O appear in bank $00, starting at $00C000 (I/O), $00D000 (ROM), and ending at $00FFFF. Lacking proper glue logic, those items would also appear at $01C000, $01D000, $02C000, $02D000, etc. If mirroring is to be avoided, the logic has to know when A16-A23 is not $00.
In a 512KB system, bank $00 detection would be implemented with logic that performs the equivalent operation of BNK0 = !(A16 | A17 | A18), in which | represents logical OR — BNK0 will be true only when the effective address is $00xxxx. A triple-input NOR gate would have to be hooked up to the three address lines coming out of the transparent latch used to capture A16-A18, with the gate’s output fed back into the glue logic to tell it when the address is $00xxxx. Simple enough...until one considers the timing.
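As a rough behavioral sketch of the BNK0 expression above (not the actual gate or GAL implementation), the following Python models the three-input NOR and shows how a decoder that ignores the bank bits mirrors ROM into higher banks. The function names are made up for illustration; the ROM range of $D000-$FFFF is taken from the text.

```python
def bnk0(addr: int) -> bool:
    """Model of BNK0 = !(A16 | A17 | A18): true only for $00xxxx in a 512KB map."""
    a16 = (addr >> 16) & 1
    a17 = (addr >> 17) & 1
    a18 = (addr >> 18) & 1
    return not (a16 | a17 | a18)


def rom_select_naive(addr: int) -> bool:
    """Hypothetical decoder that looks only at A0-A15, so ROM mirrors in every bank."""
    return (addr & 0xFFFF) >= 0xD000


def rom_select(addr: int) -> bool:
    """Hypothetical decoder qualified by BNK0, so ROM appears only in bank $00."""
    return bnk0(addr) and (addr & 0xFFFF) >= 0xD000


if __name__ == "__main__":
    for a in (0x00D000, 0x01D000, 0x02D000):
        print(f"${a:06X}  naive={rom_select_naive(a)}  qualified={rom_select(a)}")
```

Only A16-A18 are examined here, matching the 512KB case described above; a larger map would need more bank bits in the NOR.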
The problem is the A16-A18 address component doesn’t immediately appear during Ø2 low, as the latch adds its propagation delay to the delay that occurs from the fall of the clock to when the 816 emits a valid address (the tADS spec). Added to that will be the prop delay of the above-mentioned NOR gate before the glue logic actually knows if the address is, or is not, $00xxxx.
My timing analysis indicates that the “it won’t work” Ø2 threshold with this arrangement is around 18 MHz, using discrete 74AC or 74AHC logic. Allowing for variations in prop delay that are inevitable from one part to another, I projected that 16 MHz would be the practical Ø2 “ceiling,” something that has been borne out by real-world testing. So I needed to find another way.
A while back, I had mentioned the possibility of building POC V1.5 with discrete logic that could support 512 KB of RAM. I decided it would take too many gates (equating to soldering a lot of SOIC parts) and might not perform any better than V1.3, which is stable at 16 MHz. Given that, I have concocted a substantially different design that consolidates glue logic and bank bits latching in two GALs, appropriately named GAL1 and GAL2.
Here’s the memory map this new contraption will set up:
Although the “island” of RAM created at $00C400-$00CFFF will be globally accessible (there is no memory protection), my intended use for it will be to give the firmware and/or operating system a private storage area for its direct page, MPU stack and other needed work space. By doing so, the entire 48 KB of base RAM will be available for user programs, with the firmware or OS transparently remapping direct page and the stack during system API calls.
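For readers who want to poke at the layout, here is a minimal sketch that classifies an address using only the boundaries quoted in this post: 48 KB of base RAM, I/O starting at $00C000, the RAM island at $00C400-$00CFFF, ROM at $00D000-$00FFFF, and 512KB total. The $00C3FF upper bound for I/O and the $07FFFF top of RAM are my inferences from those figures, and the region names are invented for illustration.

```python
def region(addr: int) -> str:
    """Classify a 24-bit address against the map described in the text.

    Assumptions: I/O spans $00C000-$00C3FF (inferred from the island starting
    at $00C400) and RAM tops out at $07FFFF (512KB).
    """
    if not 0 <= addr <= 0x07FFFF:
        raise ValueError(f"${addr:06X} is outside the assumed 512KB map")
    if addr >= 0x010000:
        return "extended RAM"   # banks $01-$07, no ROM or I/O mirrors
    if addr < 0x00C000:
        return "base RAM"       # 48 KB for user programs
    if addr < 0x00C400:
        return "I/O"
    if addr < 0x00D000:
        return "private RAM"    # the firmware/OS "island"
    return "ROM"


if __name__ == "__main__":
    for a in (0x000200, 0x00C000, 0x00C400, 0x00D000, 0x01C000):
        print(f"${a:06X} -> {region(a)}")
```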
The I/O block will be mapped as follows:
Code:
00C000  serial I/O ports ‘A’ & ‘B’, timer ‘A’
00C080  serial I/O ports ‘C’ & ‘D’, timer ‘B’
00C100  serial I/O IRQ status*
00C180  real-time clock
00C200  expansion port chip select ‘A’*
00C280  expansion port chip select ‘B’
00C300  expansion port chip select ‘C’
*Not wait-stated.
The above I/O map is like that of POCs V1.2, V1.3 and V1.4, except for the different addresses.
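The I/O map can be sketched as a lookup keyed on the slot base address. This assumes each device occupies a 128-byte ($80) slot, which I am inferring from the spacing of the addresses, and it treats the asterisked entries as the ones that skip the wait-state. The table and function names are illustrative only and are not taken from the GAL source.

```python
# base address -> (device, wait_stated); the wait-state flag follows the
# asterisks in the I/O map (asterisk = not wait-stated).
IO_SLOTS = {
    0x00C000: ("serial I/O ports A & B, timer A", True),
    0x00C080: ("serial I/O ports C & D, timer B", True),
    0x00C100: ("serial I/O IRQ status", False),
    0x00C180: ("real-time clock", True),
    0x00C200: ("expansion port chip select A", False),
    0x00C280: ("expansion port chip select B", True),
    0x00C300: ("expansion port chip select C", True),
}


def io_select(addr: int):
    """Return (device, wait_stated) for an I/O address, or None if nothing is selected.

    Assumes 128-byte slots: the low seven address bits are dropped before lookup.
    """
    return IO_SLOTS.get(addr & 0xFFFF80)


if __name__ == "__main__":
    for a in (0x00C005, 0x00C185, 0x00C100, 0x01C000):
        print(f"${a:06X} -> {io_select(a)}")
```

Because the lookup key keeps all of A7-A23, an access to $01C000 selects nothing, which is the no-mirroring behavior the design is after.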
In order to make all this work, I’ve assigned the two GALs the following functions:
- GAL1 contains the address decoding logic and generates corresponding chip selects. It also tells the clock generator when to stretch Ø2 high for the purpose of wait-stating ROM and I/O accesses.
GAL1’s inputs are A7-A15, the aforementioned BNK0 (a low-true signal emitted by GAL2), VDA and VPA. This GAL uses only combinatorial logic, with no pins being used as feedback nodes. Hence it will respond to all input combinations in no more than tPD nanoseconds, tPD being the advertised pin-to-pin prop time of the device.
- GAL2 serves multiple purposes:
- Latch A16-A18 during Ø2 low;
- Produce fully-qualified /RD (read data) and /WD (write data) control signals;
- Aggregate three separate IRQ inputs into a single IRQ output;
- Generate the BNK0 signal needed by GAL1 to make its logic decisions.
GAL2’s inputs are Ø1, D0-D2 (which are A16-A18 during Ø2 low), RWB, VDA, VPA, IRQA, IRQB and IRQC. This GAL uses both combinatorial and registered logic, but should still perform at tPD due to no pin feedback nodes being used.
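To make GAL2's bank-capture role a bit more concrete, here is a purely behavioral sketch: the bank bits on D0-D2 are captured while Ø2 is low and held while Ø2 is high, and BNK0 is derived from the held value. This says nothing about how the GAL's registered logic is actually clocked, and the class, method and function names are hypothetical. The IRQ helper works on "asserted" flags, so it makes no claim about signal polarity.

```python
class BankLatch:
    """Behavioral model: follow D0-D2 while phi2 is low, hold while phi2 is high."""

    def __init__(self) -> None:
        self.bank = 0  # held A16-A18 value

    def step(self, phi2_high: bool, data_bus: int) -> bool:
        """Advance one half-cycle and return True when the held bank is $00."""
        if not phi2_high:
            # During phi2 low the 816 presents the bank address on the data bus;
            # only D0-D2 (A16-A18) matter in a 512KB system.
            self.bank = data_bus & 0b111
        return self.bank == 0


def irq_out(irqa: bool, irqb: bool, irqc: bool) -> bool:
    """Aggregate three IRQ sources: asserted if any input is asserted."""
    return irqa or irqb or irqc


if __name__ == "__main__":
    latch = BankLatch()
    print(latch.step(False, 0x01))  # bank $01 captured during phi2 low
    print(latch.step(True, 0xEA))   # phi2 high: data on the bus, bank held
    print(irq_out(False, True, False))
```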
As GAL1 depends on an output from GAL2 (its !BNK0 signal), both devices need to be fast in order to support high Ø2 rates. My choices for the GAL are Microchip’s ATF22V10C-7PX, basically the venerable 22V10 but with better tPD, or their ATF750C-7PX, which may be described as a 22V10 on steroids: the 750C has about 40 percent more gates than the 22V10. The two types are electrically interchangeable, but require somewhat different programming methods when using registered logic due to, among other things, the 750C’s more-flexible clocking.
Both parts have a 7.5ns tPD rating, which, in my implementation, should be achievable due to no pins being used as feedback nodes. Hence the worst-case elapsed time from when the 816 emits an address to when a chip select has been generated will be 15ns (two GALs in series at 7.5ns each), a level of performance that I cannot consistently achieve with the fastest equivalent discrete logic. 15ns total prop delay should result in stable operation at 20 MHz.
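The arithmetic behind the 15ns figure can be sketched as follows. The 7.5ns tPD and the two-GAL chain come from the text; the 50 percent Ø2 duty cycle and the idea that the chip select should settle before Ø2 rises are my simplifying assumptions. The 816's address delay (tADS) is deliberately left as a parameter to be filled in from the datasheet rather than guessed here.

```python
GAL_TPD_NS = 7.5  # advertised pin-to-pin delay of the -7 parts


def chip_select_delay_ns(stages: int = 2, tpd_ns: float = GAL_TPD_NS) -> float:
    """Worst-case decode delay: GAL2 (BNK0) feeding GAL1 (chip selects)."""
    return stages * tpd_ns


def phi2_low_ns(mhz: float, duty_high: float = 0.5) -> float:
    """Length of the phi2-low half cycle, assuming the given duty cycle."""
    period_ns = 1000.0 / mhz
    return period_ns * (1.0 - duty_high)


def slack_ns(mhz: float, t_ads_ns: float) -> float:
    """Time left in phi2 low after tADS and the GAL chain (negative = too slow)."""
    return phi2_low_ns(mhz) - t_ads_ns - chip_select_delay_ns()


if __name__ == "__main__":
    print(chip_select_delay_ns())        # two GALs in series
    for f in (16, 20):
        print(f, "MHz:", phi2_low_ns(f), "ns of phi2 low")
```

Calling `slack_ns()` with the tADS value appropriate to the operating voltage gives a quick feel for how much margin a given Ø2 rate leaves.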
V1.5 also incorporates my second-generation, stretchable Ø2 clock circuit that is capable of at least 40 MHz operation. I’m shooting for a top speed of 20 MHz in V1.5, but may try to go faster if it works.
I should note that observations of a couple of ATF22V10Cs I have here indicate that these devices run about 25 percent faster than guaranteed. While that isn’t something that should be relied upon (minor changes in voltage and/or temperature could slow things down), it might mean that I could get POC V1.5 running above 20 MHz. The clock generator circuit is designed so an extended wait-state may be configured to handle above-20 MHz operation.
Attachment:
File comment: POC V1.5 Schematic
poc_v1.5.pdf [338.42 KiB]
Attachment:
File comment: Glue Logic Design Files
logic.zip [3.92 KiB]
Attachment:
File comment: Microchip 22V10 GAL
atf22v10c.pdf [1.87 MiB]
Attachment:
File comment: Microchip ATF750C “Super GAL”
atf750c.pdf [491.27 KiB]