I've posted some of this circuit before but I've now refreshed and fixed it up. It's not working 100% but is getting there and I thought it'd be good to give it its own thread, recap some of the background, show the circuit, and explain how it works and where some problems still seem to lie.
The Cypress FX2 EZ-USB logic analyser combined with hoglet's 6502 decoder (
https://github.com/hoglet67/6502Decoder ... -Analyzers) is great for capturing what a 6502 system is doing and debugging it. However at clock frequencies above maybe 12-16MHz, especially combined with clock stretching or use of RDY (including WAI and STP), it can be hard to get it to work, and I've been iterating on this adaptor circuit to help out with that.
It works fairly well at the moment but I still have some problems at 32MHz clock speed, which look like they're due to data loss between the logic analyser and the PC. About 75% of my captures seem to be fine, but maybe 25% of them have some form of error during the course of 200 million samples, that often turns out to be a correct-looking data stream but with a sequence of bytes missing. It's possible that some OS priority tweaks might resolve it, but equally this may just be the limit of what's possible over USB2 - I haven't fully diagnosed that yet.
Nonetheless, here's the circuit that I'm now using, with some explanation:
Attachment:
File comment: Logic analyser hat schematic (work in progress)
logicanalyserhat-schematic-partial.png [ 43.25 KiB | Viewed 3427 times ]
Dave has documented various ways to configure the FX2 that are appropriate for different clock rates. In particular, I'm focusing on clock speeds above about 16MHz, which require you to run the logic analyser in "Slave FIFO Synchronous Write" mode, driven by an external clock (e.g. PHI2), with 8-bit data captures (just the data bus, no control signals). This works because the logic analyser only captures one data bus sample per 6502 clock cycle, timed at the right point in the clock cycle, so there's no need to capture the clock. The decoder can reconstruct signals like RWB and SYNC, along with the whole address bus, in most cases.
There are further problems though if the 6502-based system uses clock stretching, or RDY - and note that WAI and STP rely on RDY. In the case of clock stretching, the issue is that the FX2 requires that the externally-provided clock runs at a rate of at least something like 5MHz, and it's very easy for a stretched clock to have either the high or low period of the clock be longer than what the FX2 can support. In the case of use of RDY the problem is that while the clock keeps ticking, the CPU is not progressing to the next cycle, and so the logic analyser captures unwanted duplicate data bus values during the wait period.
The solution to both problems is to provide the logic analyser with a consistent, unstretched, fast enough clock, but supply an extra signal to inhibit data bus capture on clock cycles where the CPU is paused - either because the CPU's clock is being stretched (held high) or because the CPU's RDY pin is low. I also blend in the CPU's RESET signal because it is tidier to also not capture data bus samples during the reset period - it means your decoded captures start straight from the reset with no garbage before it.
To do this we need to feed a signal into the FX2's SLWR (SLave WRite) pin (also known as RDY1). In the inverted clock mode we're using (--ifclk=xi) this appears to be active-low, though that conflicts with the documentation. SLWR has setup and hold time requirements (12.1ns and 3.6ns) compared to the IFCLK signal, so this decision needs to be made early enough to meet those. To simplify this, I am supplying a delayed or inverted clock signal to IFCLK, so that the sample is taken at a much later time (around the rising edge of PHI2) and I'm using an 8-bit D flipflop register to capture the data bus state until this point. In addition I'm using a D flipflop to capture the combination of RDY, RESET, and any clock-stretching signal, to provide the logic analyser's SLWR input, telling it whether to capture data this cycle or not. The delayed IFCLK is meant to allow this to meet the 12.1ns setup requirement of the FX2.
I used a 138 decoder to provide the input for that D flipflop, allowing a fairly arbitrary combination of the inputs to be chosen with different polarities. I want to only capture when RESET is high, RDY is high, and IO is low. IO is a signal used in my 6502 computer circuit to trigger clock stretching, and it changes while the input clock is high, making it work well for this purpose. Other 6502 computer designs will need to provide different inputs there, so the flexibility of the 138 decoder could be useful for them.
This is a timing diagram for my current setup, with IFCLK and DFFCLK driven by the closed jumpers shown in the schematic:
Attachment:
File comment: Timing diagram
logicanalyserhat-timing.png [ 4.5 KiB | Viewed 3427 times ]
Note that with my 6502 system PHI2 lags the input clock slightly (it's the 74AHCT32-OR of CLK with IO), and IO changes shortly after the input clock rises. The logic analyser's IFCLK is one inverter away from the input clock, so it is roughly 180 degrees out of phase with PHI2 in general. The D flipflops are being triggered three inverter delays after the input clock falls - this seems to work well in my case. This leads to LA_SLWR and LA_PB[0..7] changing around that point, and then it's important for IFCLK's falling edge (since we're using it inverted) to fall a good distance away from there. At 32MHz, half a clock cycle is 15ns which is just about enough to satisfy the 12.1ns setup time, but still a bit borderline.
So from the logic analyser's perspective, on the falling edge of LA_IFCLK, if LA_SLWR is low, then it samples LA_PB[0..7] and queues a byte to send over USB.
Now aside from the bandwidth issue, there are a few things that don't seem right to me, so I'm still investigating them. Firstly, SLWR is meant to be active-high if IFCLK is inverted, not active-low - the diagrams in the datasheet show this clearly. Actually I strongly suspect that --ifclk=xi is causing the sampling to take place on the rising edge of the IFCLK I'm supplying, not the falling edge - i.e. the sense of the inversion in the --ifclk parameter is backwards. If I swap it for --ifclk=x, i.e. not inverted, then I need to also invert SLWR (to be active-high) for it to work. Also, with --ifclk=xi, having IFCLK and DFFCLK driven by the same inverter - so they rise at the same time as each other - gives very bad results, when that ought to be more or less ideal as with =xi it's meant to sample on the falling edge. Switching to --ifclk=x in that same connfiguration gives very good results, when it should be as bad as it gets as it should then be sampling on the rising edge of IFCLK and violating all the setup/hold times.
On the bandwidth issue, I'm considering rigging up a test case that's not built around a 6502 computer just to test the capture rate and fidelity - this worked well for me before, and it's quicker to find errors in a simpler data stream, and understand where they came from. I would really like this to work at least up to 40MHz, so that's probably the next priority for me!