Re: W65C816 overclock experiment

Posted: Mon Jul 19, 2021 10:40 pm
by GARTHWILSON
Today's 816s are made with a finer geometry than they were back when the SuperCPU was being made, and I fully expect they will run faster, and WDC could increase the speed rating if they wanted to. But again, Bill Mensch asked me why hobbyists aren't picking up the 65265 much, and I said it was undoubtedly largely because the speed rating is so much lower. He responded, "Well, it may very well go a lot faster, but we just never tried it." I don't understand that logic, since hiding virtues is not the way to make sales.

Re: W65C816 overclock experiment

Posted: Mon Jul 19, 2021 10:49 pm
by Dr Jefyll
Dr Jefyll wrote:
Maybe [a memory that needs protection from dummy cycles] exists and I just haven't heard about it yet.
BigDumbDinosaur wrote:
Slow memory might have a problem with the effects of a momentarily "bad" address on the bus that causes selection of one cell, followed in very rapid succession by the selection of a different cell
Momentarily bad addresses are routine; they can occur early in any Phi2-low period. So, if the memory you mention has trouble with dummy cycles then it'll also have trouble with non-dummy cycles, whose timing is just the same. As I noted above, "The dummy aka dead aka invalid cycles are proper, fully formed bus cycles. It's just that their actions are not productive."

So, thanks for the remark but I don't feel it sheds any light on WDC's recommendation that VDA and VPA should be used to qualify all memory cycles. I'll bet that recommendation is just an instance of sloppy or at least overly-cautious, CYA doc. But caution isn't a bad thing, and I hate losing bets :) so I'll stop short (but just barely) of saying, "All memories are immune to dead cycles."

[Edited to insert a more detailed elaboration:]

Alright, so, are all memories immune to dead cycles? If any exceptions exist, here's the identifying feature. Earlier I mentioned that some I/O devices don't simply take a read or write at face value. For example, when you read the Interrupt Flag Register of a 6522 VIA, the chip does read the IFR. But, as an "extra," it also goes one step further and resets any IFR bits which were set. That's why havoc can result if a VIA's IFR should happen to get touched by an unintended read such as that which might result from a dummy cycle. Reading the IFR carries a meaning beyond its face value. The datasheet mentions this.

In the realm of memory, is there anything comparable? ie, is there any kind of memory which, as a legitimate feature, attaches extra, implicit significance to being accessed? I've pondered at length; but, try as I might, I only came up with two candidates, as follows:

Some EEPROMs require a certain period to complete a write operation, and they include a feature that allows the CPU to determine when the period has completed. If the CPU reads back the data just written, the EEPROM deliberately returns incorrect data if the time delay is still in progress. All the CPU has to do is repeatedly read the device until correct data appears. As memory behavior goes, this is kinda wonky, but the scheme is effective. But can it tolerate an extra, unintended read resulting from a dummy cycle? With a moment's thought it becomes clear the answer is yes.
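The polling argument above can be checked with a toy model. This is only a sketch under stated assumptions: the class `PollingEeprom`, the "wrong data" being the written byte with bit 7 inverted, and the write time being counted in accesses rather than in real time are all illustrative choices, not taken from any datasheet.

```python
class PollingEeprom:
    """Toy EEPROM that returns wrong data while a write is still in progress.

    Assumptions (illustrative only): the busy indication is the written byte
    with bit 7 inverted, and the write time is counted in read accesses.
    """

    def __init__(self, busy_reads=5):
        self.cells = {}
        self.busy_reads = busy_reads  # reads that still see the write as busy
        self.pending = 0
        self.last_addr = None

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF
        self.last_addr = addr
        self.pending = self.busy_reads

    def read(self, addr):
        value = self.cells.get(addr, 0xFF)
        if self.pending > 0:
            self.pending -= 1
            if addr == self.last_addr:
                return value ^ 0x80  # deliberately incorrect while busy
        return value


def write_and_poll(eeprom, addr, value, dummy_read_every=0):
    """Write a byte, then read it back until it matches; return the poll count.

    dummy_read_every > 0 injects an extra, unintended read (standing in for a
    dead cycle) before every Nth poll. Its result is simply discarded.
    """
    eeprom.write(addr, value)
    polls = 0
    while True:
        polls += 1
        if dummy_read_every and polls % dummy_read_every == 0:
            eeprom.read(addr)
        if eeprom.read(addr) == value:
            return polls


if __name__ == "__main__":
    print("polls without dummy reads:", write_and_poll(PollingEeprom(), 0x10, 0xA5))
    print("polls with dummy reads:   ", write_and_poll(PollingEeprom(), 0x10, 0xA5, 1))
```

In both runs the loop ends with the correct byte. The extra reads are thrown away, so the only thing the CPU acts on is the result of its own intended read.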

Less clear is the case of a certain category of specialized memories which have paging features built in. I haven't used these, but it's my understanding that manipulation of the features involves a specific "knock" sequence. That is, the memory performs normally but also is designed to "listen" for a certain sequence of accesses which, if detected, will update the paging feature. The question arises, could an unintended access from a dummy cycle corrupt the sequence and prevent its recognition? I don't know, but certainly this device can't accidentally be mistaken for an ordinary, everyday memory. If you read the datasheet you'll surely realize there's some "more than face value" significance to an access.

Now I've given the background that explains my (cautious) advice, already posted upthread: Just do a quick sanity check. "Will a fully formed but unexpected access as described here cause trouble with the memory I'm using?" If the memory you're using has any unusual, "more than face value" features they'll surely be detailed at length in the datasheet.

Now. Surprisingly, perhaps, the "should I protect" determination is the same for I/O as it is for memory -- it's just that the "more than face value" features are comparatively common in I/O devices... far more so than in memory devices, where they're very rare (and, as we've seen, may not constitute a vulnerability anyway). Here are the cases in which you needn't worry about protecting a memory or I/O device from dummy accesses during dead cycles:

- the device has no features that attach extra, implicit significance to an access
- it does, but you've been able to satisfy yourself that no problem will result.

In all other cases you'll want to protect the device from dummy accesses, and the question becomes "how can I protect"? It's worth mentioning that problems can be avoided simply by observing certain coding practices (and, by necessity, legions of 6502 users got by in this fashion). But, protection based on the 65816's VPA and VDA signals involves keeping the vulnerable device disabled during dead cycles (while of course also ensuring it is enabled during cycles which require it to respond).

Presumably any access to an I/O device will pertain to data (not code), and that makes protection pretty simple. Just arrange things so the device is only enabled when VDA is high. (See the diagram and The Truth Table here.) And if the I/O device doesn't require protection, just ignore VDA.

But a memory device can contain both data and code, which means it must be enabled if VDA goes high or if VPA goes high, or both. Again, refer to the Truth Table. And if the memory device doesn't require protection (this will be all of them except perhaps a tiny minority with wonky features) just ignore VDA and VPA.
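The enable rules in the last two paragraphs can be written out as a small sketch. The function names and the `selected` input (standing in for whatever address decoding the board already does) are hypothetical; only the VDA/VPA conditions come from the text above.

```python
def io_enable(selected, vda, protect=True):
    """I/O device: accesses are data only, so qualify with VDA when protecting."""
    if not protect:
        return bool(selected)        # device is immune: ignore VDA
    return bool(selected and vda)


def memory_enable(selected, vda, vpa, protect=True):
    """Memory: may hold code or data, so enable if VDA or VPA (or both) is high."""
    if not protect:
        return bool(selected)        # device is immune: ignore VDA and VPA
    return bool(selected and (vda or vpa))


# 65816 cycle types, keyed by (VDA, VPA)
CYCLE_NAMES = {
    (0, 0): "internal operation (dead cycle)",
    (0, 1): "valid program address",
    (1, 0): "valid data address",
    (1, 1): "opcode fetch",
}

if __name__ == "__main__":
    print("VDA VPA  io_en  mem_en  cycle")
    for (vda, vpa), name in sorted(CYCLE_NAMES.items()):
        print(f" {vda}   {vpa}   {io_enable(True, vda)!s:5}  "
              f"{memory_enable(True, vda, vpa)!s:5}   {name}")
```

With protection on, the only cycle in which a selected memory stays disabled is the one where both VDA and VPA are low, which is exactly the dead cycle.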


-- Jeff

Re: W65C816 overclock experiment

Posted: Mon Jul 19, 2021 11:10 pm
by ThisWayUp
GARTHWILSON wrote:
Bill Mensch asked me why hobbyists aren't picking up the 65265 much, and I said it was undoubtedly largely because the speed rating is so much lower. He responded, "Well, it may very well go a lot faster, but we just never tried it." I don't understand that logic, since hiding virtues is not the way to make sales.
I just looked over the datasheet for that chip and as the definition of a hobbyist (unlike most of you big brained experts) it seemed like they just dumped the info onto paper and hit send. It sounds like a great chip and the dev board is cool but that datasheet.... good lord. The 6502 datasheet is brilliant.

Re: W65C816 overclock experiment

Posted: Mon Jul 19, 2021 11:32 pm
by BigDumbDinosaur
GARTHWILSON wrote:
rpiguy2 wrote:
65C816 accelerators for the C64 and Apple IIGS seemed to top out at around 18-20mhz
The SuperCPU, which used the '816 and plugged into the back of the C64, ran at 20MHz (capital M, ie, megaHertz, not milliHertz), and apparently CMD was able to get consistent-enough performance to run a business on it. If they had to reject any because they wouldn't run reliably at that speed, it must have been very few. I haven't heard; but regardless, it's impressive.
Historical note: the SuperCPU came out after WDC had converted the 65C02 and 65C816 to static cores.

Re: W65C816 overclock experiment

Posted: Tue Jul 20, 2021 1:50 am
by plasmo
Powered up the overclock board with a W65C02 and a simple ROM program that runs 64K delay loops, then puts a character out to the serial port and repeats. I can execute it successfully to 33MHz at nominal 5V. It also runs correctly down to 4.5V (reset supervisor threshold is 4.5V) and up to 5.5V at room temperature (actually my lab is warmish in the afternoon). It will not run correctly at 34.5MHz. The W65C02 is from the same batch of 10 from Mouser where other parts ran reliably at 29.5MHz in CRC65. The part number is W65C02S6TPG-14, so I assume it is 0.6u technology? The 65816 I ordered from Mouser is W65C816S6PG-14 so I assume it is also 0.6u technology. We'll see how fast it can go.
Bill

Re: W65C816 overclock experiment

Posted: Tue Jul 20, 2021 5:36 am
by BigDumbDinosaur
plasmo wrote:
I can execute it successfully to 33MHz at nominal 5V.

That's quite impressive. I recall when a 33 MHz 80386 was considered to be the ne plus ultra in the computing world. :D Now we have a modest 65C02 running as fast.

Quote:
The part number is W65C02S6TPG-14, so I assume it is 0.6u technology?

Correct, and with a TSMC die, which is the most recent iteration.

Quote:
The 65816 I ordered from Mouser is W65C816S6PG-14 so I assume it is also 0.6u technology.

Also correct. It has a Sanyo die, which predates the TSMC product. I was able to run an 816 with a Sanyo 0.6µ die at 20 MHz in POC V1.2. I didn't attempt to go faster, so it should be interesting to see what your results will be.

Re: W65C816 overclock experiment

Posted: Wed Jul 21, 2021 6:18 pm
by plasmo
I've done more CPLD redesign and ROM programming to check out the W65C02. I'm pretty sure the current set of hardware can run reliably at 33MHz.

DHL will deliver several lots of PC boards from JLCPCB today so I'm setting aside the overclock experiment for a few weeks. Here is the OVRCLK65 homepage as a placeholder for the current design files: https://www.retrobrewcomputers.org/doku ... o:ovrclk65

Bill

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 2:47 am
by J64C
I know it's not the W65C816, but I presently have my W65C02 running in front of me right now at 31.65 MHz at 5V!

Certainly makes me more comfortable with designing my proper boards later on. If I can run it at 32 MHz on this debacle of a layout, it has got to be almost impossible to fail on my 12.5 MHz target.

[Edit]
33.78MHz now! But I think it's the driving 74HC00 that is causing the limit here. The occasional droop in the clock wave at that speed.

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 3:21 am
by ThePhysicist
what type of ROM are you using at 33MHz?

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 3:28 am
by J64C
ThePhysicist wrote:
what type of ROM are you using at 33MHz?
I'm not. :D

I'm preloading the RAM (12ns), which takes about a quarter of a second for the full 64K, resetting it, and letting it run from there.

Makes updating ROMs a literal 30 second task. Makes for extremely fast changes in desired operation.

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 10:45 am
by plasmo
Thank you for the independent confirmation that W65C02 can run to 33+MHz. Could you share the source and marking of your W65C02 part?

The "ROM" of my overclocking experiment board is a lookup table in CPLD. It is only 64 bytes in size. The board only has 2 parts: W65C02/816 and CPLD. One or more RAMs are needed for it to be useful.
Bill

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 11:17 am
by J64C
Got two through Mouser which are marked as W65C02S6TPG-14.

Presently the core of my testbed is the W65C02 and the RAM which is a 128K IS61C1024AL-12JLI. Top address pin is held low, so it's only seeing the lower 64K.

The RAM is getting pre-populated by a Raspberry Pi Pico, which then goes High-Z after the transfer essentially just leaving the Clock, CPU, and RAM.

For the clock, I was running another Pi Pico firing at whatever frequency I programmed it to. For low-ish clock speeds (maybe ~15 to 20 MHz - can't remember), driving the PHI2 pin was fine, but anything higher than that, I had to route through a 74HC04, because the Pico's GPIO pins are only 3.3v and weren't cutting it anymore. The 74HC04 gave it a nice 5v wave again.

At the end of the day I was able to squeeze it to 34MHz. As long as my little homemade analyser was sitting in the low $D000 area, I knew I was good. But, look at it sideways at that clock rate and it crashes and starts trying to execute code from all over the address space. :lol:
[Attached image: E7bvHcdVgAYsHNY.png]
That's my personal best right there. At that rate I could see the occasional droop in the wave and more often than not, that's when the program would crash.

So my belief is that the clock is letting it down at that speed. I reckon the CPU still has more in it. And this is all still at 5V. 40MHz might realistically be in reach.

When you see the un-optimised stripboard layout and wires I have hanging from this thing, it really makes me wonder how fast these things could scream along in a decent environment.

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 11:39 am
by drogon
J64C wrote:
I know it's not the W65C816, but I presently have my W65C02 running in front of me right now at 31.65 MHz at 5V!

Certainly makes me more comfortable with designing my proper boards later on. If I can run it at 32 MHz on this debacle of a layout, it has got to be almost impossible to fail on my 12.5 MHz target.

[Edit]
33.78MHz now! But I think it's the driving 74HC00 that is causing the limit here. The occasional droop in the clock wave at that speed.
This is great to read about.

My Ruby boards (both 65C02 and 65C816) have been running at 16MHz for a very long time now (years) without any issues, so 12.5MHz for VGA timing ought to be trivial (especially since the chip is 'rated' 14MHz!). The limit for me is the RAM chip: it's a 55ns part, so very much overclocked at 16MHz; however, I have tried 4 different ones of the same type and all "just worked".

My boards are just double sided and I sort of tried to keep the clock lines short, but wasn't overly worried about it (I had it going on stripboard at 16MHz too, with the clock wire literally flying all over the board at one point and the same can osc. feeding both the 65C02 and the ATmega).
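To put the clock rates mentioned in this thread next to the RAM speed grades, here is a quick period calculation. It is a rough yardstick only: the half-period column assumes a 50% duty cycle, and the real timing budget depends on datasheet setup and hold figures that are not reproduced here.

```python
def period_ns(clock_mhz):
    """Length of one full clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def half_period_ns(clock_mhz):
    """Length of one clock phase, assuming a 50% duty cycle (an assumption)."""
    return period_ns(clock_mhz) / 2.0


if __name__ == "__main__":
    # Clock rates quoted in the thread, in MHz
    for mhz in (12.5, 14, 16, 20, 33, 34):
        print(f"{mhz:5} MHz  period {period_ns(mhz):6.2f} ns  "
              f"half period {half_period_ns(mhz):6.2f} ns")
```

At 16MHz the whole cycle is 62.5ns, so a 55ns RAM leaves very little room for address setup and decoding, which fits the "very much overclocked" description.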

But if you can get it reliable at VGA 25.xMHz then that would work well for transparent odd/even clock interleaving for VGA - the only issue then may be RAM vs. resolution which has always been my concern - essentially the limit being the clock cycles it takes to simply plot a pixel, and making life "easy" (ie. no bit masking) with one byte per pixel still needs 76KB of RAM for 320x240 pixels and clearing that at 7 cycles per byte takes half a million cycles - which is noticeable even at 25MHz... So it's easy to see why there was a move to hardware assist (sprites, 'blit', etc.) back "in the day" - even new designs today, e.g. the Foenix 816 board, are using FPGAs to manage the video...

At 640x480 it's 300KB of RAM for 8bpp, or 38KB for 1 bpp.
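The figures above can be reproduced with a few lines of arithmetic. The helper names are made up for illustration; the 7 cycles per byte and the 25MHz clock are the numbers quoted in the post.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """RAM needed for one frame, in bytes."""
    return width * height * bits_per_pixel // 8


def clear_cycles(n_bytes, cycles_per_byte):
    """CPU cycles needed to touch every byte once."""
    return n_bytes * cycles_per_byte


def clear_time_ms(n_bytes, cycles_per_byte, clock_mhz):
    """Time to clear the buffer, in milliseconds."""
    return clear_cycles(n_bytes, cycles_per_byte) / (clock_mhz * 1000.0)


if __name__ == "__main__":
    for w, h, bpp in ((320, 240, 8), (640, 480, 8), (640, 480, 1)):
        size = framebuffer_bytes(w, h, bpp)
        print(f"{w}x{h} at {bpp} bpp: {size} bytes ({size / 1024:.1f} KiB)")

    size = framebuffer_bytes(320, 240, 8)
    print("clear cycles at 7 cycles/byte:", clear_cycles(size, 7))
    print(f"clear time at 25 MHz: {clear_time_ms(size, 7, 25):.1f} ms")
```

Clearing the 320x240 buffer comes to 537,600 cycles, or about 21.5ms at 25MHz, which is longer than one 60Hz frame (about 16.7ms).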

Quite an ambitious project though.

Cheers,

-Gordon

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 12:20 pm
by BigEd
J64C wrote:
That's my personal best right there. At that rate I could see the occasional droop in the wave and more often than not, that's when the program would crash.
What kind of test program is it? Two things I'd be inclined to try: Klaus' test suite on the one hand, and a Basic interpreter running a benchmark on the other hand. (When working with PiTubeDirect, we often use the attract mode of the game ELITE as a test. We also use the WOOLBAL or SPHERE demo, which is a Basic program which does trig and draws lines. Both of those need a graphics display, of course.)

Re: W65C816 overclock experiment

Posted: Thu Jul 29, 2021 6:19 pm
by BigDumbDinosaur
J64C wrote:
Got two through Mouser which are marked as W65C02S6TPG-14.

That is the most recent production. The 6T in the part number means it has a die made from a 0.6µ geometry TSMC wafer.

Quote:
For low-ish clock speeds (maybe ~15 to 20 MHz - can't remember), driving the PHI2 pin was fine, but anything higher than that, I had to route through a 74HC04, because the Pico's GPIO pins are only 3.3v and weren't cutting it anymore. The 74HC04 gave it a nice 5v wave again.

The 74HC04 is likely not helping you as much as you'd like. Try a 74ACT04 and see if you gain some stability at the high speeds. The HC04's output transition speed isn't up to the WDC specs of 5ns or less.

If you do try the 74ACT04 make the connection from its output to the 65C02's Ø2 input as short and direct as physically possible. If ringing seems excessive try inserting some resistance in series with the connection, with the resistor (metal film) also as physically close to the 74ACT04 as possible. 100 ohms is a good starting point.

—————————————————————————————
EDIT: For some reason, I had ±5ns. 5ns is 5ns. :D