I can't see anything wrong with that three-counter circuit at the moment, and I'll build it out when I have time. It should be relatively easy to change my current one over.
For reference, here are some waveforms from my current build (the two-counter divide-by-4 circuit):
In these captures the bottom yellow line is PHI2, next up in blue is the slow clock (5 MHz), then in red the /WSE signal, then at the top in green is the ROM's /CS signal. The input clock (not shown) is 40 MHz.
Attachment: 20231128_232810.jpg (File comment: ROM and RAM operations)
The above shows a mixture of ROM and RAM operations - the code is in ROM here. In the normal case of running from RAM, almost all the cycles would be fast and the red line would be high nearly all the time.
From the left there's a ROM cycle in progress which ends when PHI2 dips low, in sync with the slow clock falling. Next is another ROM cycle - note that it waits for a fresh falling edge on the slow clock (second trace up) before activating the ROM (top trace going low). Then there are two fast RAM cycles in a row, then another two ROM cycles. The first of these didn't need to wait as long as the second one. Then there are three fast cycles, etc.
Attachment: 20231128_232847.jpg (File comment: ROM and VIA operations)
In this second capture we see only slow operations. Note that overall they only run at half of the slow clock rate, i.e. 2.5 MHz - I don't know why the oscilloscope reported it as 5 MHz. The halving happens because the requirement to wait for a falling edge of the 5 MHz slow clock is rather wasteful: we lose up to a whole slow-clock cycle every time. Some additional logic could let us not wait for quite as long, which would allow code in ROM to run twice as quickly and generally almost halve the cost of slow cycles. It's not a big deal though, as they should be rare in practice.
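As a rough sanity check on the 2.5 MHz figure, here is a minimal Python sketch of the timing. It is only a model, not the actual logic: time is counted in 40 MHz input-clock ticks, and it assumes that slow-clock falling edges land on multiples of 8 ticks, that a slow cycle must wait for a falling edge strictly after it starts, and that the access then ends on the following falling edge. The function names are made up for illustration.

```python
INPUT_MHZ = 40
SLOW_MHZ = 5
SLOW_PERIOD = INPUT_MHZ // SLOW_MHZ  # 8 input-clock ticks per slow-clock period


def next_falling_edge(t):
    """First slow-clock falling edge strictly after tick t.

    Assumption: falling edges sit at multiples of SLOW_PERIOD.
    """
    return (t // SLOW_PERIOD + 1) * SLOW_PERIOD


def slow_cycle(t):
    """Model one slow cycle that starts at tick t.

    Returns (ticks spent waiting, tick at which the cycle ends).
    Assumption: the device is selected on a fresh falling edge and the
    cycle ends on the falling edge after that.
    """
    select = next_falling_edge(t)
    end = select + SLOW_PERIOD
    return select - t, end


def back_to_back_rate_mhz(n=4):
    """Effective rate of n consecutive slow cycles, in MHz."""
    t = 0
    for _ in range(n):
        _, t = slow_cycle(t)
    return INPUT_MHZ * n / t


print(back_to_back_rate_mhz())  # consecutive slow cycles
print(slow_cycle(3))            # starts mid-period: shorter wait
print(slow_cycle(8))            # starts right on an edge: waits a full period
```

In this model, each back-to-back slow cycle starts exactly on a falling edge, so it always pays the full 8-tick wait and takes 16 ticks in total, which is 2.5 MHz. A slow cycle that follows fast cycles can start mid-period and wait less, which would explain why the waits in the first capture differ in length.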
The code running here was polling for I/O, so there are some slow cycles where the ROM's /CS wasn't asserted - in these, the VIA's CS would have been asserted instead, but I didn't think to capture that.