Speeding up the 65C02

For discussing the 65xx hardware itself or electronics projects.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Post by ElEctric_EyE »

I didn't realize it at the time, but this is where (as far as this thread is concerned) address decoding met a fork in the road. Refresh here: viewtopic.php?t=1503&postdays=0&postorder=asc&start=33 .

Does one use Phase 2 with the address lines (A0-A15) or Phase 2 with R/W to qualify a valid memory access? I bring it up now because there's a link to schematics of older arcade machines here: ( viewtopic.php?t=1609&postdays=0&postorder=asc&start=7 )...

I was young in the days of the original Vic-20/C-64. I had a friend. His father and mine worked at SSS (Solid State Scientific, now Allegro), both EEs. We were always exchanging ideas... Anyway, I see now where I/we got the idea to use Phase 2 with address decoding, and it has persisted to this day...
The original (version e) Vic-20 schematic uses A0-A15 & Phase 2 to qualify memory decoding.

Compare to Battlezone for example. Focus on A0-A15, Phase 2 & R/W...

Most of the arcade games were based on the 80XX/Z80, some on the 68K, and a few on the 6502/A. ALL of the pros (e.g. Atari) incorporating 6502s did in fact qualify R/W with Phase 2.
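To make the "qualify R/W with Phase 2" idea concrete, here is a minimal truth-table sketch. It assumes a memory device with active-low output-enable and write-enable pins; the function name `strobes` and the signal names `oe_n`/`we_n` are my own for illustration and are not taken from any of the schematics mentioned above.

```python
def strobes(phi2: int, rw: int) -> tuple[int, int]:
    """Return (oe_n, we_n), both active low, for given PHI2 and R/W levels.

    Assumption: reads and writes are only allowed while PHI2 is high,
    so neither strobe can assert during PHI2 low, when R/W and the
    address lines may still be changing.
    """
    oe_n = 0 if (phi2 and rw) else 1        # read strobe: PHI2 high, R/W high
    we_n = 0 if (phi2 and not rw) else 1    # write strobe: PHI2 high, R/W low
    return oe_n, we_n


# Print the full truth table.
for phi2 in (0, 1):
    for rw in (0, 1):
        oe_n, we_n = strobes(phi2, rw)
        print(f"PHI2={phi2} R/W={rw} -> /OE={oe_n} /WE={we_n}")
```

With the strobes gated this way, the chip select itself can come from the address lines alone, without Phase 2 in the address decoder.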

Edit: links, weak links edited out
Last edited by ElEctric_EyE on Sat Aug 14, 2010 4:41 am, edited 4 times in total.
kc5tja
Posts: 1706
Joined: 04 Jan 2003

Post by kc5tja »

I think it depends on the needs of the bus.

The 6502 is perfectly happy qualifying R/W with PH2, providing you don't have any other peripherals using the bus during PH1. :-)

The 65816 isn't so nice about this, regrettably, since A16-A23 is only available during PH1. Hence, your address isn't complete until PH2, and all address decoding must be done during PH2 if you intend to use more than 64K of address space.
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Post by BigDumbDinosaur »

kc5tja wrote:
The 65816 isn't so nice about this, regrettably, since A16-A23 is only available during PH1. Hence, your address isn't complete until PH2, and all address decoding must be done during PH2 if you intend to use more than 64K of address space.
I'd disagree with that. The '816 timing diagram (page 29 in the data sheet) clearly shows that a full address is valid before the rise of Ø2. The bank address is present on D0-D7 no more than 3ns after A0-A15 become valid (at 14 MHz). Assuming sufficiently fast silicon is used to latch the bank address, you will have a fully qualified address before the rise of Ø2. If you haven't latched A16-A23 by the time Ø2 rises you will probably end up with a big boo-boo when the MPU tries to access memory or I/O.

In my POC design, I don't use A16-A23, but I have carefully studied the relationship of A0-A15 relative to Ø2 and can see (on my dual trace scope) that the address bus is valid well before Ø2 goes high.
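For anyone who does want A16-A23, the latching described above can be modelled roughly as below. This is only a behavioural sketch, assuming a transparent latch (74xx573 style) that is open while Ø2 is low and holds while Ø2 is high; the class and function names are hypothetical, and real timing (setup, hold, propagation delay) is not modelled.

```python
class BankLatch:
    """Behavioural model of a transparent latch capturing the bank byte."""

    def __init__(self) -> None:
        self.bank = 0

    def update(self, phi2: int, data_bus: int) -> int:
        # Transparent while PHI2 is low: D0-D7 carry the bank address.
        # Once PHI2 is high the latch holds, since D0-D7 now carry data.
        if not phi2:
            self.bank = data_bus & 0xFF
        return self.bank


def full_address(bank: int, addr16: int) -> int:
    """Combine the latched bank byte with A0-A15 into a 24-bit address."""
    return ((bank & 0xFF) << 16) | (addr16 & 0xFFFF)


latch = BankLatch()
latch.update(phi2=0, data_bus=0x12)         # PHI2 low: bank $12 is on D0-D7
bank = latch.update(phi2=1, data_bus=0xEA)  # PHI2 high: data on the bus, latch holds
print(hex(full_address(bank, 0x3456)))      # prints 0x123456
```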
x86?  We ain't got no x86.  We don't NEED no stinking x86!
kc5tja
Posts: 1706
Joined: 04 Jan 2003

Post by kc5tja »

BigDumbDinosaur wrote:
I'd disagree with that. The '816 timing diagram (page 29 in the data sheet) clearly shows that a full address is valid before the rise of Ø2.
I have an old data sheet then. Good to know that it's more aggressive about putting the address on the bus. There is still that 3ns opportunity for a glitch though, so address-triggered I/O (as is used in the Amiga's AGNUS chip to launch the blitter) still needs PH2 qualification. Otherwise, it sounds like you can still get away with plain old R/W qualification.
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Speeding up the 65C02

Post by BigDumbDinosaur »

kc5tja wrote:
BigDumbDinosaur wrote:
I'd disagree with that. The '816 timing diagram (page 29 in the data sheet) clearly shows that a full address is valid before the rise of Ø2.
I have an old data sheet then. Good to know that it's more aggressive about putting the address on the bus. There is still that 3ns opportunity for a glitch though, so address-triggered I/O (as is used in the Amiga's AGNUS chip to launch the blitter) still needs PH2 qualification. Otherwise, it sounds like you can still get away with plain old R/W qualification.
In my design, all read/write ops are qualified by Ø2, so the glitch you refer to wouldn't happen. Also, to prevent glitches with I/O hardware (the DUART, in particular, which runs asynchronously to the MPU clock), I use the MPU's VDA and VPA outputs to qualify address decoding. Address decoding occurs as soon as the MPU indicates (via the aforementioned signals) that the address bus is stable. This occurs during the final cycle of an instruction while Ø2 is low.

I plan in the next iteration to generalize the VDA/VPA qualification to all hardware, although testing with the current design doesn't indicate there's any particular problem in that regard with RAM and ROM access. BTW, when both of these signals are high, they reflect what SYNC means on the 65C02, making it possible to set up single step circuitry if desired.
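As a rough illustration of the VDA/VPA qualification, here is the decode as I understand it from the 65C816 data sheet, written as a small sketch. The function names are made up for the example, and the cycle descriptions are paraphrased rather than quoted.

```python
# (VDA, VPA) -> kind of bus cycle, paraphrased from the 65C816 data sheet.
CYCLE_TYPES = {
    (0, 0): "internal operation (address bus not valid)",
    (0, 1): "valid program address",
    (1, 0): "valid data address",
    (1, 1): "opcode fetch",
}


def bus_cycle(vda: int, vpa: int) -> str:
    """Describe the current bus cycle from the VDA and VPA levels."""
    return CYCLE_TYPES[(vda, vpa)]


def address_valid(vda: int, vpa: int) -> bool:
    """Enable address decoding only when the MPU says the address is valid."""
    return bool(vda or vpa)


def sync_equivalent(vda: int, vpa: int) -> bool:
    """Both high marks an opcode fetch, like SYNC on the 65C02."""
    return bool(vda and vpa)


for vda in (0, 1):
    for vpa in (0, 1):
        print(vda, vpa, bus_cycle(vda, vpa), address_valid(vda, vpa))
```

In hardware, `address_valid` is just an OR gate feeding the decoder's enable input, and `sync_equivalent` is an AND gate.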

Another signal I'm going to play around with is *ABORT. Owing to the way the MPU behaves in response to *ABORT, I believe it should be possible to implement hardware-based memory protection.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
fachat
Posts: 1124
Joined: 05 Jul 2005
Location: near Heidelberg, Germany

Re: Speeding up the 65C02

Post by fachat »

BigDumbDinosaur wrote:
Another signal I'm going to play around with is *ABORT. Owing to the way the MPU behaves in response to *ABORT, I believe it should be possible to implement hardware-based memory protection.
Yes, I understand that's what it has been designed for. For my 6502 hardware-based memory protection I needed a separate 6502 to take over the bus and correct the memory situation (like remapping some memory pages in the MMU) before the main CPU could be started again.

See http://www.6502.org/users/andre/csa/auxcpu/index.html I'm still pretty proud that this one worked in a 1.0 version :-)

André
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Re: Speeding up the 65C02

Post by BigDumbDinosaur »

fachat wrote:
See http://www.6502.org/users/andre/csa/auxcpu/index.html I'm still pretty proud that this one worked in a 1.0 version :-)

André
I'm pretty amazed that it actually worked at all. It's not something you'd expect from a 6502. :)
x86?  We ain't got no x86.  We don't NEED no stinking x86!
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Lattice PLDs

Post by BigDumbDinosaur »

8BIT wrote:
The Lattice brand GAL's program fine in my programmer.

Mouser has 10ns, 24 pin GAL22V10D PDIP's for $5.50

Part # is 842-GAL22V10D10LPN

Daryl
Mouser still has these in stock, but all Lattice GALs are EOL, so grab 'em while you can.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California

Post by GARTHWILSON »

Regarding the fear that the addressed device's output data will disappear too soon for the processor to read it after the device is disabled when φ2 goes low, I wrote:
Quote:
For one, his glue logic takes most of that 10ns, and for another, bus capacitance will easily hold the data for the rest of the time, as nothing else is driving the data bus yet. The small CMOS leakage along with the capacitive loading on the lines would hold the logic state for a good dozen microseconds (and possibly much longer) if you were to stop the clock immediately after phase 2 goes down.
In testing my tester for the 4Mx8 10ns 5V SRAM module a couple of weeks ago (data sheet available here), I found, by accident, that the bus capacitance held the data reliably for over a millisecond (a million nanoseconds) in the absence of anything driving the bus. I did not experiment to find out how much longer it could go, like a whole second or what. Anyway, it definitely was not collapsing after 10ns, or even 10µs.

http://WilsonMinesCo.com/
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Bumping Somewhat Dormant Topic

Post by BigDumbDinosaur »

GARTHWILSON wrote:
In testing my tester for the 4Mx8 10ns 5V SRAM module a couple of weeks ago (data sheet available here), I found, by accident, that the bus capacitance held the data reliably for over a millisecond (a million nanoseconds) in the absence of anything driving the bus. I did not experiment to find out how much longer it could go, like a whole second or what. Anyway, it definitely was not collapsing after 10ns, or even 10µs.

http://wilsonminesco.com
Yes, but where is the bulk of this bus capacitance coming from, the memory module itself or the test rig?
x86?  We ain't got no x86.  We don't NEED no stinking x86!
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California

Post by GARTHWILSON »

Quote:
Yes, but where is the bulk of this bus capacitance coming from, the memory module itself or the test rig?
It shouldn't matter, but the majority of it would be on the memory module. The tester has shift-register outputs (plus a shift-register input for reading data) feeding the memory-module socket plus three other empty sockets, by way of short wire-wrap wires. The longest wires might be as much as 2".
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Bumping Somewhat Dormant Topic

Post by BigDumbDinosaur »

GARTHWILSON wrote:
Quote:
Yes, but where is the bulk of this bus capacitance coming from, the memory module itself or the test rig?
It shouldn't matter, but the majority of it would be on the memory module. The tester has shift-register outputs (plus a shift-register input for reading data) feeding the memory-module socket plus three other empty sockets, by way of short wire-wrap wires. The longest wires might be as much as 2".
Not to belabor the subject or anything...it would be interesting to see how much bus capacitance would vanish with an edge-connected PCB. All those pins in close proximity to each other have to be adding mucho capacitance (plus a wee bit of inductance). The socket itself would have some capacitance, aided and abetted by the (probably high) dielectric constant of the socket material. As I mentioned in another post, the industry got away from SIP memory modules many years ago. While the lower cost of edge-connected PCBs had to have been a significant factor in this change, reduced stray capacitance was probably considered as well, especially as memory speeds were being jacked up.

Something to think about is this: Garth's memory module has 4M x 8 of RAM (4 MB). A maximized 65C816 system in which A16-A23 is selected via the MPU's multiplexed bank address can use 16 MB, which would require the installation of four modules. Ergo capacitive bus loading will increase by at least a factor of 4, most likely more due to the added length of the motherboard address and data bus traces needed to connect the sockets. The module uses Cypress CY7C1049D SRAMs, each of which has an 8 pF loading spec; Cypress doesn't state if this number varies when the device is selected. There are eight SRAMs per module, so aggregate loading of a maximized system of four modules would be 256 pF. Would the '816 even be able to drive that loading, regardless of the Ø2 rate?
x86?  We ain't got no x86.  We don't NEED no stinking x86!
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California

Re: Bumping Somewhat Dormant Topic

Post by GARTHWILSON »

BigDumbDinosaur wrote:
it would be interesting to see how much bus capacitance would vanish with an edge-connected PCB. All those pins in close proximity to each other have to be adding mucho capacitance
A twisted pair of wire-wrap wires has 3/4 of a pF per inch. These connector pins are much farther apart than WW wire's insulation puts the WW wires, and the ratio of pin separation to pin diameter is much greater, so I am confident the pins and socket are contributing well under a pF. Compare that to 64pF per board on each data, address, or control line, and you see that the connector capacitance is totally insignificant.

Quote:
(plus a wee bit of inductance).
That will depend mostly on the length of the connections, which is almost exactly the same whether with the pins or with a board-edge connector of the C64 type.

Quote:
As I mentioned in another post, the industry got away from SIP memory modules many years ago. While the lower cost of edge-connected PCBs had to have been a significant factor in this change, reduced stray capacitance was probably considered as well, especially as memory speeds were being jacked up.
I'm sure that besides avoiding the cost of installing a pin header, a big part of the goal was to get a shorter connection, getting the module's parts closer to the mother-board PCB; and the resulting socket is not perfboard-friendly like it has to be for hobbyists who are not getting PC boards made.

Quote:
Something to think about is this: Garth's memory module has 4M x 8 of RAM (4 MB). A maximized 65C816 system in which A16-A23 is selected via the MPU's multiplexed bank address can use 16 MB, which would require the installation of four modules. Ergo capacitive bus loading will increase by at least a factor of 4, most likely more due to the added length of the motherboard address and data bus traces needed to connect the sockets. The module uses Cypress CY7C1049D SRAMs, each of which has an 8 pF loading spec; Cypress doesn't state if this number varies when the device is selected. There are eight SRAMs per module, so aggregate loading of a maximized system of four modules would be 256 pF. Would the '816 even be able to drive that loading, regardless of the Ø2 rate?
If the 65816's drivers are the same ones they put on the 65C22, they would look about like a 43-ohm resistor before going into current-limiting at about 50mA, and the time constant of 43 ohms times 256pF is 11ns. A single module, however, gives 64 times as much RAM as the C64 had and 32 times as much as the original Mac had, and I really don't expect anyone to be using more than one or two. The time constant for one would be less than 3ns.
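The arithmetic behind those time constants can be sketched as follows. It uses only the figures quoted in this thread (8 pF per SRAM, eight SRAMs per module, a 43-ohm driver); the constant and function names are mine, and trace and connector capacitance are ignored, so treat the results as a lower bound rather than a measurement.

```python
PF_PER_SRAM = 8e-12       # farads; per-pin loading quoted above for each SRAM
SRAMS_PER_MODULE = 8      # eight SRAMs share each bus line on one module
DRIVER_OHMS = 43.0        # assumed equivalent driver resistance (65C22-style)


def bus_load(modules: int) -> float:
    """Capacitive load per bus line, in farads, for a number of modules."""
    return modules * SRAMS_PER_MODULE * PF_PER_SRAM


def time_constant(modules: int) -> float:
    """RC time constant, in seconds, of the driver into that load."""
    return DRIVER_OHMS * bus_load(modules)


for n in (1, 2, 4):
    print(f"{n} module(s): {bus_load(n) * 1e12:.0f} pF, "
          f"tau = {time_constant(n) * 1e9:.2f} ns")
```

One module comes out at 64 pF and about 2.75 ns, and four modules at 256 pF and about 11 ns, matching the numbers above.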

Daryl had no trouble running it at 12MHz with a barefoot '816 (i.e., no bus transceivers), driving this module and three daughter boards at the same time, which I was pleased to hear.

Edit: More info on measured capacitance of the board and connectors at viewtopic.php?f=4&t=2172&start=6 . (Take the "&start=6" off the end to see the posts leading up to it.)
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?