I've decided to stick with the name "base-page". For one, it seems to me like a natural naming evolution from "zero-page", but I also found that the 65CE02 datasheet refers to this style of addressing mode as "base-page" as well! (http://archive.6502.org/datasheets/mos_65ce02_mpu.pdf) I feel pretty good about this addition. Though I now know more about ISA layout and what would make better instructions to add, I'm going to lock into my current design and deal with what I've made. I think it'll be a fun experiment, and I don't think I'd be particularly limited.
Some new thoughts that I'd like to seek feedback on, though:
Different Functionality for WAI Instruction
I changed how WAI works (I originally named this concept WFI, for "Wait For Interrupt", but renamed it to WAI to match the WDC 65C02). As I understand it, WAI on something like the 65C02 effectively blocks at the point where the instruction would otherwise be considered complete (the SYNC point) and waits for an interrupt so that it can be handled immediately. This avoids the case where an interrupt arrives while we are not at a SYNC point, for example in the middle of an infinitely branching loop used to simulate a wait. One feature of this is that a maskable IRQ can still end the wait: if the IRQ is masked, we just continue from where we were in the code (aka a "super-fast interrupt"). The way I have constructed WAI, it's basically BRK, but it pauses right before sampling the interrupt vector until one of the internal interrupt-pending flip-flops is brought high. At that point we stop waiting and continue executing the WAI, which shortens the interrupt response time.
A side effect of this change is that the 1-cycle "super fast" masked IRQ is not possible. A masked IRQ in this model is literally just a masked IRQ; only an actual interrupt that we expect to be handled allows execution to continue. The benefit of this model is that it speeds up the path for actually handling an interrupt, by doing everything we can while we wait for one to come in. For example, we can push PC and PS onto the stack, and then hold off on fetching from the interrupt vector until we know which vector we want to fetch.
The logic looks something like this:
1. Prepare ADH (0x01)
2. Move S into BI and 0xFF into AI to process stack decrement, use current S as ABL to write PCH.
3. Move ALU result back into AI to process next stack decrement, use new S as ABL to write PCL.
4. Move ALU result back into AI to process next stack decrement, use new S as ABL to write PS.
At this point we block on PHI2 (NRDY high) until an interrupt passes through to the internal pending flip-flops. Then:
5. One cycle is taken to let the pending flip-flop propagate from the pin input; pending interrupts are promoted to processing interrupts and we continue.
6. Receive Interrupt Vector Low (depending on which flip-flop is promoted to processing).
7. Receive Interrupt Vector High (depending on which flip-flop is promoted to processing). At this point we can construct the full address that PC/AB should be updated to in order to read the next instruction.
8. AB read propagates from DB to PD, we are in a good state to continue executing now.
This means we handle the interrupt immediately, and responding to an interrupt only takes 4 cycles (as opposed to 7 cycles plus N cycles of SYNC offset for NMI/IRQ, since we cannot interrupt mid-instruction for those). The cost, as noted above, is that "super fast" masked IRQs are not possible with this mechanism.
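The cycle accounting above can be sketched as a small trace model. This is only an illustration built from the step list in this post, not K65 code: the names `PRE_WAIT`, `POST_WAIT` and `run_wai` are made up for the example, and it assumes each numbered step takes exactly one cycle.

```python
# Hypothetical per-cycle model of the WAI sequence described above.
# Assumption: each numbered step costs exactly one cycle.

PRE_WAIT = ["prepare ADH", "push PCH", "push PCL", "push PS"]          # steps 1-4
POST_WAIT = ["promote pending", "vector low", "vector high", "fetch"]  # steps 5-8
BLOCKED = "blocked (NRDY high)"


def run_wai(interrupt_at):
    """Return a per-cycle trace of WAI when the interrupt arrives at cycle `interrupt_at`."""
    trace = list(PRE_WAIT)
    # Block until the pending flip-flop goes high. If the interrupt arrived
    # during the pushes, no blocked cycles are needed.
    while len(trace) < interrupt_at:
        trace.append(BLOCKED)
    trace.extend(POST_WAIT)
    return trace


trace = run_wai(interrupt_at=10)
response = len(trace) - 10  # cycles spent after the interrupt arrived
print(response)             # 4
```

With the stack pushes already done before blocking, only the four post-wait steps remain once the interrupt arrives, which is where the 4-cycle figure comes from.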
My main worries:
+ Really, I kind of like this logic - I think it's very clean, and the code for it is very compact from what I can tell. I didn't really have to "hack" anything in, because the K65 supports an NRDY pin which can be brought high to say that the processor is not ready to continue for one reason or another. External chips can also bring NRDY high, so it is basically two-way: the processor can tell you it is not ready because it needs an interrupt, or a memory controller external to the CPU can signal that it is not ready because a read/write has not finished.
+ HOWEVER, I do worry that, since it has the same name as WAI, people who are familiar with WAI might be confused. Really this is more like a "Block For Interrupt", because we literally have to wait for an actual interrupt to continue. Should I rename this BLK or something like that to differentiate it from WAI? Or should I restructure WAI to work like traditional WAI instead of doing my own thing here?
Different Functionality for Reset
On RES, it became painfully obvious that I'd really like the registers and PS to be set to some expected state (especially with the addition of the BP register). So one of the things I changed about BRK is that in the last cycle we process an addition of 0x00 and 0x00, and then, depending on whether the processing interrupt is RES, we signal a bunch of register sets (AC, IX, IY, BP, S) and update all of the PS flags (New State: cZI---vs) to get the processor into a known-good state. It does all this using existing busses and signal wires: after the addition is processed the value 0x00 is on SB, and we conditionally make all registers sample SB if RES is the interrupt being handled.
Here are my concerns with where this might be confusing:
+ Right now, all registers get set to 0x00 - including S (stack pointer). I think I can fix this without increasing cycle count (I have a spot where 0xFF is on SB where we could conditionally set S without adding a bunch of new connections). This would add a new signal wire though - think it's worth fixing? (I'm leaning heavily towards yes on this, actually - but would like opinions.) One of the things I don't like about this change is that it distributes the reset logic over two separate cycles. With the current implementation there is exactly one half-cycle during PHI2 in which we sample whether RES is processing to signal all the other important wires, so it's very logically (and physically?) compact as it stands.
+ Because we set the processor status flags based on the final addition of 0x00 and 0x00, all PS flags are low except the zero flag. I want to say this actually makes sense, because all registers have zeros in them - but I could see an argument against it, namely that the zero flag being set did not originate from any actual known operation of the program. I could fix this, but again this would require another signal wire to handle the special case of setting PSZ low instead of just re-using the ADDZxPSZ signal wire like I do now.
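To make the two concerns above concrete, here is a minimal sketch of the post-reset state. It is not taken from the K65 source: `reset_state` and its `stack_fix` parameter are hypothetical, and it assumes that in "cZI---vs" a lowercase letter means the flag is cleared, an uppercase letter means it is set, and "-" means the bit is left alone.

```python
# Hypothetical model of the state left behind by RES as described above.
# Assumption: "cZI---vs" reads as carry clear, zero set, interrupt-disable
# set, three bits untouched, overflow clear, sign clear.

def reset_state(stack_fix=False):
    sb = (0x00 + 0x00) & 0xFF  # final addition leaves 0x00 on SB
    # Every register samples SB when RES is the processing interrupt.
    regs = {name: sb for name in ("AC", "IX", "IY", "BP", "S")}
    if stack_fix:
        # Proposed change: load S from the cycle where 0xFF is on SB.
        regs["S"] = 0xFF
    flags = {
        "C": 0,
        "Z": int(sb == 0),  # follows the adder's zero output, so it ends up set
        "I": 1,
        "V": 0,
        "S": 0,
    }
    return regs, flags


regs, flags = reset_state()
print(hex(regs["S"]), flags["Z"])   # 0x0 1
regs, flags = reset_state(stack_fix=True)
print(hex(regs["S"]), flags["Z"])   # 0xff 1
```

The sketch shows why the two questions are independent: the stack pointer fix only changes what S samples, while the zero flag comes straight from the adder result either way.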
Interrupt Handling Worries
For convenience of the simulation, interrupts are scheduled by pulling bits high on the PIN bitfield in the CPU structure. The PIN bitfield is sampled during PHI1, and if an interrupt should be scheduled, a per-interrupt "pending flip-flop" is set to show that the interrupt is pending for the next available interrupt time (for NMI/IRQ this is during SYNC; for RES a pending interrupt can stop mid-instruction and reset the whole processor).
After these PIN values are sampled and persisted in the "interrupt-pending flip-flops", when we reach a SYNC point the pending flip-flops are checked to see if we ought to start processing any interrupts. At this point, any pending flip-flops are sampled into another set of flip-flops called the "interrupt-processing flip-flops", and we drain IR so that a BRK instruction is forced into execution.
How BRK works is, when it goes to sample the interrupt vector, the constants it constructs in ADL are based on the "processing flip-flops". Mostly this is to ensure that you can't trip up the interrupt handling by signalling IRQ one cycle, then NMI the next causing one byte from IRQ and one byte from NMI to be read (which would be invalid). Of the processing flip-flops there is an order of precedence - RES, NMI, IRQ, BRK. That is, a RES interrupt will take precedence over an NMI interrupt, and so on. At the end of handling the interrupt, we pull all processing and pending flip-flops low, marking the interrupt as complete.
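The pending/processing arrangement described above can be sketched roughly as follows. This is a simplified model, not the simulator's actual code: the class and method names are invented, IRQ masking is left out, and the vector addresses are the standard 6502 locations, which is an assumption about the K65.

```python
# Hypothetical sketch of the two-stage interrupt latching described above.
# Assumption: standard 6502 vector addresses; the K65 may place them elsewhere.

PRIORITY = ("RES", "NMI", "IRQ", "BRK")
VECTORS = {"NMI": 0xFFFA, "RES": 0xFFFC, "IRQ": 0xFFFE, "BRK": 0xFFFE}


class InterruptLogic:
    def __init__(self):
        self.pending = set()      # "interrupt-pending flip-flops"
        self.processing = set()   # "interrupt-processing flip-flops"

    def sample_pins(self, pins):
        """PHI1: latch any asserted pins into the pending flip-flops."""
        self.pending |= set(pins)

    def promote_at_sync(self):
        """SYNC: copy pending into processing; True means force BRK into IR."""
        self.processing = set(self.pending)
        return bool(self.processing)

    def vector_address(self):
        """Address of the vector low byte, chosen by priority."""
        for source in PRIORITY:
            if source in self.processing:
                return VECTORS[source]
        return VECTORS["BRK"]     # nothing latched: a software BRK

    def finish(self):
        """End of the sequence: pull every flip-flop low."""
        self.pending.clear()
        self.processing.clear()


logic = InterruptLogic()
logic.sample_pins({"IRQ"})
logic.promote_at_sync()
logic.sample_pins({"NMI"})               # arrives after promotion
lo = logic.vector_address()
hi = logic.vector_address() + 1
print(hex(lo), hex(hi))                  # 0xfffe 0xffff
```

Because the vector is chosen from the processing set rather than from the live pins, an NMI that arrives after promotion cannot change which vector the low and high byte reads come from.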
This logic worries me for the following reasons:
+ Say NMI and IRQ happen at the same time: with this logic we will "swallow" IRQ and never handle it. I could, without adding new signal wires, allow both to be scheduled so that we don't lose an interrupt, but should we? A RES should definitely swallow NMI/IRQ since we are resetting the system, but if we get NMI and IRQ, should we process NMI and then process IRQ, or just process the more "important" signal and swallow the other one?
+ Does the priority I have assigned make sense? Specifically I'm worried about NMI vs. IRQ - if IRQ is unmasked and scheduled along with NMI, and I support handling multiple interrupts in priority order, which one should we handle first, and which second? Basically, if I handle the NMI interrupt first, then IRQ, we will be in a state where we are executing the IRQ handler, and after we RTI we'll go back to the NMI handler. Does this sound appropriate, or should I want to be left in a state where we are handling NMI first, and then RTI to handle IRQ?
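The difference between swallowing and queueing can be sketched like this. It is a hypothetical illustration, not the current implementation: `service_order` and its `swallow` flag are invented names, and it only models which interrupt sequences get started and in what order, not how the handlers nest once they are running.

```python
# Hypothetical comparison of the two policies discussed above.
# Only the order in which interrupt sequences start is modelled here.

PRIORITY = ("RES", "NMI", "IRQ")


def service_order(pending, swallow):
    """Return the order in which simultaneously pending interrupts are started."""
    pending = set(pending)
    order = []
    while pending:
        # Pick the highest-priority source that is still pending.
        winner = next(source for source in PRIORITY if source in pending)
        order.append(winner)
        if swallow or winner == "RES":
            pending.clear()          # current behaviour: everything is pulled low
        else:
            pending.discard(winner)  # alternative: only clear what was serviced
    return order


print(service_order({"NMI", "IRQ"}, swallow=True))          # ['NMI']
print(service_order({"NMI", "IRQ"}, swallow=False))         # ['NMI', 'IRQ']
print(service_order({"RES", "NMI", "IRQ"}, swallow=False))  # ['RES']
```

In the queueing variant RES still clears everything, matching the point that a reset should swallow NMI and IRQ.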