Another productive weekend!
I’m very excited about where things are on the project. Most of the work at this stage is on refining the working model, and I could not be happier with the improvements. The package count is dropping and the circuits are getting simpler and faster at the same time - awesome!
A rich source of improvements came from overhauling the interrupt handling mechanism. Dieter convinced me that I should abandon the approach I was using and revert to the more traditional method of a single microcode sequence to handle all interrupts - RESET, NMI, IRQ and BRK. The differences between these are in fact minor: (1) the address of the jump vector, (2) the fact that RESET inhibits writes to stack memory, and (3) the fact that BRK advances the Program Counter and sets the B flag when pushing P onto the stack. I originally thought that creating separate microcode sequences to handle these differences would be easiest, but this turned out not to be the case at all. Having made the switch, I was able to remove all kinds of special circuitry and things became quite uniform - including that pesky Constant Generator, which can now deliver the right value directly to the low-byte of the address bus based on the “currently active interrupt”. And doing that meant that I no longer needed to load IR from the W bus at all and could therefore drop a pair of MUXes used only for that purpose.
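As a rough illustration of how small those three differences are, here is a Python sketch of what a single shared sequence has to vary per interrupt type. The vector addresses are the standard 6502 ones; the function and dictionary names are made up for this example and do not correspond to signals in the schematic.

```python
# Illustrative only: names are hypothetical, not taken from the schematic.
# Low bytes of the standard 6502 vectors ($FFFA NMI, $FFFC RESET, $FFFE IRQ/BRK).
VECTOR_LOW = {"NMI": 0xFA, "RESET": 0xFC, "IRQ": 0xFE, "BRK": 0xFE}

def interrupt_behaviour(kind):
    """Return the per-interrupt differences one shared sequence must handle."""
    return {
        # (1) jump vector: only the low byte changes between interrupt types
        "vector": 0xFF00 | VECTOR_LOW[kind],
        # (2) RESET inhibits writes to stack memory
        "stack_writes": kind != "RESET",
        # (3) BRK advances the Program Counter ...
        "advance_pc": kind == "BRK",
        # ... and sets B in the copy of P pushed onto the stack
        "b_flag_pushed": kind == "BRK",
    }
```

Everything else in the sequence (pushing PC and P, loading the vector) is common to all four cases, which is why one microcode routine is enough.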
All-in-all, these changes were well worth the effort and the new rev of the Registers schematic is much better for them. But the really exciting news is that I can finally share the ALU schematic as well!!! And so, after much work and testing, I am very pleased to post the Card B: ALU & CU schematic below:
Attachment: Card B-ALU & CU.png
As I mentioned, the ALU borrows heavily from Dieter Mueller's (ttlworks) Multiplexer ALU (http://6502.org/users/dieter/a1/a1_6.htm). I have come to admire this design a lot, and I especially like it because I found it just about when I was at a loss as to how to get a practical ALU built. Just like my TTL sequencer, a straightforward gate-based ALU would have been totally unwieldy and slow. This one is very elegant, in my view. The overall structure is of a Logical Unit (LU) followed by an Adder and a Shifter.
In terms of how I've implemented it here, the ALU has inputs A and B coming into it from the R Bus and the Data Bus respectively. B is latched on every cycle simply to split a memory read and a subsequent ALU operation over two cycles. The B input may alternatively be generated by the Sign Extension circuit, which saves the sign bit of the branch offset as it is loaded from memory so that it can be used to calculate the high-byte of the branch target address if a page-boundary is crossed.
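The role of the saved sign bit can be sketched in Python as follows. This is a behavioural model only, under the assumption that the low byte is added first (producing an internal carry) and the high byte is then adjusted with the sign-extended byte plus that carry; the function name and argument layout are invented for the example.

```python
def branch_target(pc, offset):
    """pc: 16-bit address following the branch; offset: raw 8-bit offset byte."""
    pcl, pch = pc & 0xFF, pc >> 8

    # First cycle: add the offset to the low byte and keep the carry out.
    low_sum = pcl + offset
    adl = low_sum & 0xFF
    internal_carry = low_sum >> 8

    # Sign extension: the saved sign bit becomes 0xFF or 0x00 on the B input.
    sign_ext = 0xFF if offset & 0x80 else 0x00

    # Second cycle (only needed on a page crossing): adjust the high byte.
    adh = (pch + sign_ext + internal_carry) & 0xFF
    page_crossed = adh != pch
    return (adh << 8) | adl, page_crossed
```

When the sign byte and the carry cancel out (a forward branch with no carry, or a backward branch with a carry), the high byte is unchanged and the second cycle can be skipped.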
The Logical Unit (LU) is fed by these inputs and is comprised of 8 dual-multiplexers (74153s) which implement a lookup table for A and B - one multiplexer dedicated to each pair of bits in A and B. The function of the LU is driven by an LUOP consisting of four bits input to the data pins of the multiplexers. The net effect is that a given LUOP can cause a multiplexer to output a “1”, a “0”, the value of the A input, the value of the B input, Not A, Not B, A OR B, A AND B as well as A XOR B. Linking 8 multiplexers together therefore yields an 8 bit Logical Unit.
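The lookup-table idea can be modelled in a few lines of Python. The sketch assumes that each bit pair (A, B) drives the multiplexer select lines with A as the high select bit, and that LUOP bit n is wired to data input n; the actual pin ordering in the schematic may differ, in which case the LUOP constants below would simply be permuted.

```python
def lu(luop, a, b):
    """Apply a 4-bit LUOP truth table to each bit pair of two 8-bit values."""
    result = 0
    for bit in range(8):
        a_bit = (a >> bit) & 1
        b_bit = (b >> bit) & 1
        select = (a_bit << 1) | b_bit          # assumed select wiring
        result |= ((luop >> select) & 1) << bit  # mux passes the chosen LUOP bit
    return result

# LUOP values under the assumed wiring (bit n = output when select == n)
LU_ZERO, LU_ONE = 0b0000, 0b1111
LU_A, LU_B = 0b1100, 0b1010
LU_NOT_A, LU_NOT_B = 0b0011, 0b0101
LU_AND, LU_OR, LU_XOR = 0b1000, 0b1110, 0b0110
```

For example, `lu(LU_XOR, 0xCC, 0xAA)` gives `0x66`, since the four LUOP bits are simply the XOR truth table read off for the select values 0 to 3.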
The LU is followed by a pair of adders which double up as a subtractor circuit when fed “A” and “NOT B” from the LU. The result of the adder then goes to a Shifter for right-shift operations (left-shift is accomplished by having the LU produce “A” on both inputs to the adder to add the value to itself). Each component can act as a pass-through under the control of various control lines to select the desired ALU operation (or combination thereof).
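A small Python model shows why one adder covers addition, subtraction and left shifts once the LU chooses what to feed it. The function names are illustrative, and the carry convention assumed is the usual 6502 one (carry set means no borrow).

```python
def add8(x, y, carry_in):
    """8-bit add returning (result, carry_out)."""
    total = x + y + carry_in
    return total & 0xFF, total >> 8

def subtract(a, b, carry_in):
    """Subtraction: the LU supplies A and NOT B to the adder."""
    return add8(a, ~b & 0xFF, carry_in)

def shift_left(a, carry_in=0):
    """Left shift: the LU supplies A on both adder inputs (A + A)."""
    return add8(a, a, carry_in)

def shift_right(value, carry_in=0):
    """Right shift happens after the adder, in the Shifter stage."""
    return (carry_in << 7) | (value >> 1), value & 1
```

With `carry_in=1`, `subtract(0x50, 0x10, 1)` yields `(0x40, 1)`; passing the carry flag into `shift_left` or `shift_right` turns the shifts into rotates.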
The BCD adjustment circuit taps the result of the adders and uses a 74151 multiplexer to detect values outside the 0 - 9 range which require adjustment. A pair of BCD adders then add 6 or 10 to perform the adjustment itself depending on whether we are adding or subtracting (10 being the four-bit two’s complement of 6). The BCD Latch is used to latch an intermediate result and thereby spread the BCD operation over two cycles.
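The add-6 / add-10 correction can be illustrated on a single decimal digit. This is the textbook correction rule written in Python, not a transcription of the circuit: it assumes the binary adder has already produced a 4-bit sum and a carry out, and that carry set means "no borrow" for subtraction.

```python
def bcd_adjust_digit(binary_sum, carry_out, subtract):
    """Correct one 4-bit digit after a binary add; returns (digit, decimal_carry)."""
    if subtract:
        if not carry_out:
            # A borrow occurred: add 10, i.e. subtract 6 modulo 16.
            return (binary_sum + 10) & 0xF, 0
        return binary_sum, 1
    if carry_out or binary_sum > 9:
        # Result exceeds 9: add 6 to skip the six unused codes A-F.
        return (binary_sum + 6) & 0xF, 1
    return binary_sum, 0
```

For instance, 7 + 5 gives a binary sum of 0xC with no carry, which is adjusted to digit 2 with a decimal carry, i.e. 12.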
Rounding out the ALU are a number of flag “detection” and “selection” circuits to calculate the appropriate values for the “C”, “Z”, “V” and “N” flags. These take the result from the ALU (including from the B input directly in the case of a BIT operation) or in some cases a value directly from the Opcode (e.g., SEC, CLC - flag value is in bit 5 of the Opcode while bits 6 and 7 select which flag is affected). Finally, the P register takes the output of these flag circuits and latches them according to individual write-enable signals for each instruction.
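The opcode-driven flag selection can be checked against the standard 6502 flag opcodes with a short Python sketch. The mapping of bits 7 and 6 to flags follows from those opcodes; the special case for V is this example's own assumption, needed because CLV ($B8) has bit 5 set even though it clears the flag, and the post does not say how the circuit deals with that.

```python
# Bits 7..6 of the opcode select the flag; bit 5 carries the new value.
FLAG_BY_SELECT = {0b00: "C", 0b01: "I", 0b10: "V", 0b11: "D"}

def decode_flag_opcode(opcode):
    """Return (flag, value) for CLC/SEC/CLI/SEI/CLV/CLD/SED."""
    flag = FLAG_BY_SELECT[(opcode >> 6) & 0b11]
    value = (opcode >> 5) & 1
    if flag == "V":
        value = 0  # assumption: V can only be cleared (there is no SEV)
    return flag, value
```

So `decode_flag_opcode(0x38)` (SEC) returns `("C", 1)` and `decode_flag_opcode(0xD8)` (CLD) returns `("D", 0)`.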
As in the Registers Card, all the control logic is driven by a pair of ROMs which output signals into several decoders as follows:
A.MX is the simplest of these and is used only to manage the A input of the ALU (“0”, A, B).
C.MX selects the Carry to use for various operations (IC being an Internal Carry used for address calculation) as well as controlling the Shifter and Sign-Extension circuits (LSR, ROR, ADS - Add Sign Ext, ADSIC - Add Sign-Ext using the Internal Carry).
The SF.MX decoder enables the write signals for the various flags in the P register, including writing to all of them during a PLP instruction. SF.MX can also directly implement setting the I flag (SEI) and clearing the D flag (CLD), although the latter is no longer used by the microcode. In addition, the OPCODE option tells the circuitry to set the appropriate flag based on the opcode itself (as described above). Finally, this decoder also enables the ALU itself (ALU.EN) and the BCD adjustment circuit (BCD.EN). When both these are disabled, data can bypass the ALU entirely, going directly from Data Bus to W Bus (ALU.BYPASS - used typically for register loads and transfers).
The NX.MX decoder drives the microcode sequencer and State Register (Q). As I’ve mentioned before, the State (Q) is used as the lower three bits of the address to the control ROMs to index into one of eight potential micro-instructions for every CPU instruction. Generally Q simply increments by 1 on each cycle and is set to 0 to begin the fetch-execute cycle. The microcode itself contains minimal logic facilities, with all functions handled in three NX.MX bits as follows:
Code:
0 - NEXT: Increment Q by 1
1 - END: Reset Q to zero on the next cycle. Ends the current instruction and will fetch a new opcode on the next cycle.
2 - END.C: for branches - set Q to 0 if no page boundary is crossed, otherwise increment Q by 1 to adjust ADH
3 - END.BTF: for branches - set Q to zero if the Branch Test Fails (branch not taken), otherwise increment Q by 1 to do the branch
4 - END1.DF: for decimal mode - set Q to 0 if the D flag is false, otherwise increment Q by 1 to perform a BCD operation
5 - END1: set Q = 1 to skip the fetch-opcode cycle for the next instruction when the opcode has been pre-fetched
6 - SKIP.ICC: skip the next micro-instruction if the Internal Carry is clear. Will skip unnecessary ADH calculations.
Most of the circuitry in NX.MX works to determine when the Q register should be set to 0 (the relevant signal is labeled /RESETQ in the schematic and is input to the /CLR pin of the Q 74161 counter). The NX.MX decoder also derives the SKIP.TO.1 signal used for END1 and END1.DF, the SKIP.ICC signal, the BCD1 signal during the first cycle of a BCD operation, and the INH.PF (Inhibit Pre-Fetch) signal, which indicates that an interrupt is pending or that one is active in cycles 0 and 1 of the interrupt sequence. INH.PF inhibits the incrementing of the Program Counter in those circumstances and will also inhibit a SKIP.TO.1 operation to ensure an interrupt is triggered in state Q=0.
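The next-state behaviour of Q described in the list above can be summarised as a small Python function. It follows the list literally and is only a behavioural sketch: the argument names are invented, INH.PF and the interrupt logic are not modelled, and END1.DF is taken to reset Q to 0 as the list states, even though the SKIP.TO.1 signal may mean the hardware lands on Q=1 in that case.

```python
NEXT, END, END_C, END_BTF, END1_DF, END1, SKIP_ICC = range(7)

def next_q(q, nx, page_crossed=False, branch_taken=False,
           d_flag=False, internal_carry=False):
    """Return the next value of the 3-bit state register Q for an NX.MX code."""
    inc = (q + 1) & 0b111
    if nx == NEXT:
        return inc
    if nx == END:
        return 0
    if nx == END_C:                 # branches: extra cycle only to fix ADH
        return inc if page_crossed else 0
    if nx == END_BTF:               # branches: continue only if branch is taken
        return inc if branch_taken else 0
    if nx == END1_DF:               # decimal mode: continue into the BCD step
        return inc if d_flag else 0
    if nx == END1:                  # opcode already pre-fetched: skip state 0
        return 1
    if nx == SKIP_ICC:              # skip one micro-instruction if IC is clear
        return inc if internal_carry else (q + 2) & 0b111
    raise ValueError("unknown NX.MX code")
```

A taken branch that stays within the page would, for example, step through END.BTF (Q increments) and then END.C (Q returns to 0).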
The Interrupt Detection circuit to the right of and below the State Register (Q) is primarily concerned with setting the INT flag (Interrupt in progress), the /RST.INP flag (RESET In Progress) and the /NMI.INP flag (NMI In Progress). INT triggers the /IR.CLR signal to set the Instruction Register to zero (one of Dr. Jefyll’s suggestions) which in effect executes an implied BRK instruction. The /RST.INP and /NMI.INP flags then control the behaviour of the interrupt microcode sequence for each interrupt type (including selecting the correct interrupt vector address). These "In-Progress" flags are cleared by the /RESETQ signal which marks the end of every interrupt service sequence.
And that, I’m happy to say, is about it for the overall design. These two cards, Registers and ALU & CU, together comprise the bulk of the design. There is a lot of finicky boolean logic to get the decoders working right, but that is to be expected, I think, given the complexity of the instruction set. It’s true that the decoders could be radically simplified by extending the microcode to a wider control word, but this would contravene one of the specific design goals of the project - namely to keep the use of ROMs to a minimum. I know of a couple of optimizations I’ve yet to complete, and without a doubt there will be others yet to be discovered, but I dare say the design is starting to feel more complete at this stage. I have tested these circuits as best I can using Logisim and can say that by and large they are working - very pleased about that!
This continues to be an amazing learning experience - probably more like a crash course at this stage - and I’m loving picking up all the knowledge folks are so generously sharing. Many thanks to all once again. It feels great to get both these schematics posted and I’m looking forward to all the feedback.
My best to everyone,
Drass