Hi, guys, and welcome, Dominic.
I like Ed's idea of a short list of hypotheses. Here's my (half-baked) notion of how that'd work.
Starting at a random position in a data stream, we know nothing... except that no instruction exceeds 7 cycles, and therefore a Sync cycle is guaranteed to appear within 7 cycles of wherever we start. That means as many as 7 hypotheses might conceivably be under consideration simultaneously.
So, when we capture the first frame of data (the address, data & control bits Dave listed) we launch a hypothesis that this cycle (call it Cycle A) is an opcode fetch. This becomes a thread (so to speak) that persists. On all subsequent cycles our software attempts to DIS-prove the notion that Cycle A was an opcode fetch. If it does get disproven then the thread dies; otherwise it continues to exist (albeit in limbo, as it hasn't been proven, either). On the next cycle we capture another frame of data and launch a hypothesis that
this cycle is an opcode fetch. We also examine all other surviving threads/hypotheses and attempt to disprove them, based on the new data just input.
And so it goes. It'll initially be the case that several hypotheses are viable -- and that'll be the best we can do until more data arrives. But with luck we'll eventually have only one surviving hypothesis, and of course that's the goal.
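To make the bookkeeping concrete, here's a rough Python sketch of the loop I have in mind. Everything in it (Frame, Hypothesis, consistent_with) is a made-up name for illustration, not anyone's actual code; consistent_with is where all the disproof rules would live:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Frame:
        a6_0: int    # visible address bits A[6:0]
        data: int    # data-bus byte
        # ...plus the control bits Dave listed

    @dataclass
    class Hypothesis:
        sync_index: int    # which captured cycle we assume was a Sync

        def consistent_with(self, history):
            # Placeholder: apply every disproof rule we know (page-crossing
            # checks, interrupt checks, per-opcode cycle counts, ...) and
            # return False as soon as any rule contradicts the capture.
            return True

    def analyze(frames):
        hypotheses = []
        history = []
        for frame in frames:
            history.append(frame)
            # Every new cycle might itself be an opcode fetch: launch a thread.
            hypotheses.append(Hypothesis(sync_index=len(history) - 1))
            # Attempt to DIS-prove each surviving thread against the new data.
            hypotheses = [h for h in hypotheses if h.consistent_with(history)]
            if len(hypotheses) == 1:
                return hypotheses[0]    # lock: only one explanation survives
        return None                     # still ambiguous when capture ended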
Assuming I haven't missed anything,
the challenge reduces to finding all the ways that a hypothesis can be disproven. For example, 65xx behavior dictates that if indexing causes a page crossing then the extra cycle will re-use the previous cycle's address except with $100 added (the high byte is incremented). We only have access to A[6:0], but if any of those are seen to change then whatever just happened wasn't a page crossing, and any hypothesis expecting a page crossing on that cycle has been disproven. That's just one example. Hopefully we'll find a LOT of ways to disprove hypotheses.
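As a sketch (re-using the hypothetical Frame record from above), that particular disproof is a one-liner:

    def disproves_page_crossing(prev: Frame, curr: Frame) -> bool:
        # The fix-up cycle re-uses the previous address with $100 added, so
        # A[6:0] must be identical across the two cycles; any change in the
        # visible bits disproves a hypothesis expecting a page crossing here.
        return prev.a6_0 != curr.a6_0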
So, yes, I'm suggesting an exhaustive simulation of all the hypotheses -- which is hardly impossible, but it's something you'd want to plan carefully before you start. As an aside, performance needn't be badly impacted overall since most of the simulations will soon be deleted when their hypotheses are disproven.
Regarding interrupts: they launch an extra hypothesis for every cycle that might be a Sync cycle. Luckily these extra hypotheses will tend to be short-lived. As I mentioned in my previous post, the address bus will NOT increment on the 2nd cycle of interrupt recognition; thus if address line A0 (or any other address line) is seen to change, or if the byte on the data bus changes, that kills the idea that we're in the 2nd cycle of interrupt recognition.
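That rule is just as easy to encode (again using the hypothetical Frame record from the earlier sketch):

    def disproves_irq_cycle2(prev: Frame, curr: Frame) -> bool:
        # On the 2nd cycle of interrupt recognition the address bus does not
        # increment and the same byte is re-read, so any change in the visible
        # address bits or the data byte kills this hypothesis.
        return prev.a6_0 != curr.a6_0 or prev.data != curr.data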
Anyone see a flaw with the details I've laid out? Or maybe the whole approach can be replaced with something better. Anyway, I've said my bit, so I'll sign off for now.
-- Jeff
ETA: An indexed data fetch might or might not have a page crossing. And a conditional branch has
three possible outcomes (no branch, branch, and branch with page crossing). So, should these opcodes launch more than one hypothesis? That's the most logical solution, but it pushes up the maximum number of hypotheses that might simultaneously be in flight. Maybe that's acceptable.
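If we do fork, it might look something like this (the outcome names are illustrative; each child commits to one timing and gets disproven on its own):

    BRANCH_OUTCOMES = ("not taken", "taken", "taken with page crossing")
    INDEXED_OUTCOMES = ("no page crossing", "page crossing")

    def fork(parent_sync_index, outcomes):
        # Children inherit the parent's assumed Sync cycle but each commits
        # to a different outcome, so later cycles can kill them independently.
        return [(parent_sync_index, outcome) for outcome in outcomes]

With 7 base hypotheses and a fan-out of at most 3, the worst case still looks manageable.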