Like Ed, I've been pondering this subject on and off for quite a while. At one point I was considering using the solution devised by Daryl (8BIT) for his SBC-3 unit. However, I've since changed course and as I've scribed in various posts in the past, am not only looking to support a lot of RAM (using Garth's 4 MB DIMM) but also thinking in terms of being able to set up a protected environment that can support true multitasking. The 65C816 poses some challenges in that regard, to which I've alluded in the past. In this post, I'm going to try to untangle my thoughts so it all makes a modicum of sense.
For the benefit of those who haven't read some of my other posts on this topic (the posts are somewhat fragmented, as they represent the electronic form of thinking out loud while chugging beer and eating pretzels), I'll reiterate what hardware should be able to do to create a protected environment that is capable of supporting preemptive multitasking:
- Differentiate between user and supervisor modes
In modern systems that are capable of running a true preemptive multitasking operating environment (e.g., Linux and UNIX), it is possible for the hardware to distinguish whether the system is under the control of a user mode or supervisor mode program, e.g., an operating system kernel. While it is possible to run a preemptive environment without such differentiation, overall system stability then rests on the notion that individual processes will obey certain execution rules and never stray from them.
- Enforce memory protection
The system logic must be able to terminate a process that improperly attempts to access memory or hardware before such an access can cause errors or outright fatality. The decision to terminate a process due to illegal accesses is influenced by whether the hardware is operating in user or supervisor mode.
- Limit use of privileged machine instructions
There are some machine instructions that should never be executed in a multitasking environment and others that should only be executed while in supervisor mode. The system logic must be able to detect the attempted use of a privileged instruction while in user mode and terminate the errant program before the instruction can be executed.
The 65C816 lacks the ability to implement any of the above functions, as it was never intended to natively support a preemptive multitasking environment. Additionally, it has some wired-in characteristics that add complication:
- Banked memory
The 65C816's internal architecture, despite being capable of generating a 24 bit address, is really tied to the 16 bit address bus of the 65C02, causing the MPU to see memory in 64 kilobyte (KB) segments or banks, not as contiguous space. Specifically, program code is limited to a contiguous 16 bit address range—long branches and stack relative offsets are limited to ±32KB—and when the program counter (PC) reaches $kkFFFF it will wrap rather than increment the bank address (kk). In practice, this limitation is not all that onerous, as few programs are large enough to use up an entire bank. However, it complicates the use of smaller blocks of RAM, which can lead to memory fragmentation—a 1KB program could conceivably use up an entire bank, which may not be a significant problem, given the 65C816's ability to address 16MB of RAM. Something else to consider is that if a program uses a large part of a bank then it may have to look elsewhere in memory for data workspace.
Mention of data workspace brings up another addressing characteristic that may complicate memory management. Indexed instructions can cross bank boundaries by merely incrementing the index register, assuming that the base address is not $0000. That is, the effective address during loads and stores does not wrap but temporarily increments the MPU's data bank (DB) register—the "visible" value of DB doesn't change. This effect can be produced with the following code:
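Code:
         lda #0
         pha
         plb                   ;set data bank to $00
         rep #%00010000        ;16 bit index registers
         ldx #$FFFE            ;maximum possible index -1
         lda $0001,X           ;absolute,X: loads from $00FFFF
         inx                   ;.X = $FFFF
         sta $0001,X           ;absolute,X: stores to $010000

The operand is written as $0001 so the assembler selects absolute indexed addressing; direct page indexed addressing would wrap within bank $00 instead of crossing into bank $01. Although the programmer may not have intended to cross banks during an indexed load or store, a logic error or unexpected data is all it would take for it to occur and possibly intrude on another process' workspace.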
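To make the difference between the two wrapping behaviors concrete, here is a rough Python model. It is only a sketch of the address arithmetic described above, not a cycle-accurate emulation, and the function names are my own inventions for illustration.

```python
# Simplified model of 65C816 address formation (illustrative names only).

def next_pc(pb: int, pc: int) -> tuple[int, int]:
    """Advance the program counter by one byte.

    PC wraps inside the current bank; the program bank (PB) never changes.
    """
    return pb, (pc + 1) & 0xFFFF


def absolute_indexed(db: int, base: int, index: int) -> int:
    """Effective 24-bit address of an absolute,X load or store.

    The index is added to the full 24-bit base address, so a carry out of
    bit 15 spills into the bank byte even though DB itself is unchanged.
    """
    return (((db & 0xFF) << 16) + (base & 0xFFFF) + (index & 0xFFFF)) & 0xFFFFFF


print(next_pc(0x01, 0xFFFF))                        # (1, 0)
print(hex(absolute_indexed(0x00, 0x0001, 0xFFFE)))  # 0xffff
print(hex(absolute_indexed(0x00, 0x0001, 0xFFFF)))  # 0x10000
```

The last two lines correspond to the load and store in the assembly example: the same instruction lands in bank $00 or bank $01 depending only on the value in .X.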
The 65C816 doesn't have separate outputs that correspond to the bank address and instead multiplexes A16-A23 (the bank address) on to the data bus during a valid memory cycle when Ø2 is low. As soon as Ø2 goes high the data bus is switched to data mode, which means external logic is required to capture and latch the bank address during the Ø2 low cycle. As the Ø2 rate is increased the available time in which to latch the bank address becomes very short, falling to around 10ns. Doing it with discrete logic is very difficult.
- Use of bank $00 for direct page accesses
No matter what is loaded into DB or the program bank (PB) register, an instruction that acts upon direct page will always force the effective bank address to $00, an artifact of the 65C816's ability to emulate a 65C02. While it is possible to change the direct page (DP) register so each process has its own direct page, the access remains "hard wired" to bank $00 and there is nothing to prevent a process from loading DP with an address in use by another process.
- Use of bank $00 for hardware stack accesses
The 65C816's 16 bit stack pointer (SP) is a significant improvement over the 65C02's 8 bit SP, giving the programmer considerable flexibility. However, no matter what address is loaded into SP, any instruction that implicitly addresses the stack (e.g., PHA) will always force the effective bank address to $00, again an artifact of 65C02 emulation mode. While it is possible to change SP so each process has its own stack, the access remains "hard wired" to bank $00 and there is nothing to prevent a process from loading SP with an address in use by another process.
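The bank $00 behavior of direct page and stack accesses can be sketched in Python as follows. This is a simplified native-mode model with made-up function names and made-up process values, intended only to show why two processes can collide no matter what is in their bank registers.

```python
# Direct page and stack accesses ignore DB and PB: the bank byte is always $00.

def direct_page_address(dp: int, offset: int) -> int:
    """24-bit address of a direct page access (native mode, simplified)."""
    return (dp + (offset & 0xFF)) & 0xFFFF      # bank byte implicitly $00


def push_address(sp: int) -> int:
    """24-bit address written by a one-byte push such as PHA."""
    return sp & 0xFFFF                          # bank byte implicitly $00


# Two hypothetical processes living in different banks but using the same SP.
proc_a = {"pb": 0x02, "db": 0x02, "sp": 0x01FF}
proc_b = {"pb": 0x05, "db": 0x05, "sp": 0x01FF}

collision = push_address(proc_a["sp"]) == push_address(proc_b["sp"])
print(collision)  # True
```

Nothing in PB or DB takes part in either calculation, which is the whole problem: separation of stacks and direct pages depends entirely on each process loading sensible values into SP and DP.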
- Use of bank $00 for interrupt vectors
When an interrupt occurs the 65C816 will push PB, PC and the status register (SR) to the stack and then set PB to $00 before loading PC with the relevant vector. Hence the front ends of interrupt service routines (ISRs) must be in bank $00, although execution can of course be transferred to another bank with JML or JSL if desired. This is also true for the reset vector, as toggling RESETB reverts the 65C816 to emulation mode, which has no real concept of banks. This characteristic means that some ROM must appear at the top of bank $00 from which a reset program can execute, and at least part of the kernel's ISRs must be loaded into bank $00 as well.
The 65C816 doesn't change DB to bank $00 when an interrupt occurs, which may not necessarily be a desirable behavior in all cases. However, it does mean that the ISR can access data in the interrupted process' bank without having to specifically know what that bank was at time of the interrupt. This has important implications for a kernel whose services are called by pushing a stack frame and executing BRK or COP.
- No supervisor mode
The 65C816 doesn't change its behavior when an interrupt occurs, nor does it produce any output signal that specifically informs the rest of the system that it is processing an interrupt. Therefore, the '816 can't tell the system logic when to relax or enforce memory access or privileged instruction execution rules, which leads to...
- No privileged instructions
MPUs such as the x86 and Freescale (Motorola) 68000 (68K) series have a so-called "rings of privilege" feature, in which the MPU can be set up to generate a processing exception when certain instructions are executed. This feature may be used to prevent the execution of instructions in user mode programs that should be restricted to supervisor mode. The 65C816 has no such capabilities, which opens the door to system fatality or the appearance of fatality due to the use of certain instructions at inopportune moments. 65C816 instructions that would fall into this group include:
- STP, which will cause the MPU to cease all processing until a hard reset occurs.
- XCE, which will switch the MPU from native to emulation mode if carry is set.
- RTI, which in addition to possibly unbalancing the stack, may load garbage values into PB and PC, causing a loss of control.
- SEI or any instruction that can set the I bit in the status register and disable IRQs. In this category are SEP, PLP and (again) RTI. In a preemptive multitasking system, a jiffy IRQ is used to trigger task switching, which means that disabling IRQs will disable multitasking and eventually cause deadlock.
- WAI, which halts the MPU until a hardware interrupt occurs. WAI in itself is "harmless" if a device causes an IRQ shortly after WAI is executed, as the MPU will resume execution following WAI upon receipt of the next hardware interrupt. However, preceding WAI with SEI will produce markedly different behavior upon receipt of an interrupt than what would occur if SEI hadn't been executed: the normal interrupt vector will not be taken.
At this point you might be thinking that setting up a protected environment with the 65C816 is a lost cause or will require so much support hardware that doing so is impractical. So I thought at first. However, after quite a bit of cogitation I decided that if sufficiently powerful system logic (CPLD or FPGA) is used all of the above items can be addressed in various ways. Let's look at each item and see how a solution might be applied. The following won't be in the order presented above but the reason for that will become obvious as you read:
- Implementation of execution modes
As noted, when an operating system (kernel) call occurs via a software interrupt, or when a hardware interrupt occurs, it is useful to be able to establish a supervisor mode. The 65C816 has no output signal that indicates when it is processing an interrupt or that it has completed interrupt processing. However, it does have the vector pull (VPB) output, which is asserted (negated) during cycles seven and eight of interrupt processing, at which time the relevant interrupt vector is loaded into PC. WDC actually intended for designers to use VPB to implement hardware interrupt steering by altering the vector on the fly in a system that includes an interrupt priority encoding function.
It is possible to (mis)use VPB to indicate to system logic that an interrupt is in process by having VPB drive a latch whose state indicates whether the system is in user or supervisor mode. A latch is needed because VPB's state is ephemeral—see page 17 in the 65C816 data sheet for more information.
The complication to this is in figuring out how to tell the system logic to switch back to user mode upon completion of interrupt processing. As noted before, the 65C816 has no output that can indicate such a status change. This missing feature can be simulated by having the system logic toggle the user/supervisor mode latch when the MPU fetches an RTI instruction. Conveniently, this procedure ties in with the need to "sniff" the data bus during the MPU's opcode fetch cycle to block the use of privileged instructions by user mode processes (discussed in the next section). A logic equation of the following pseudo-code form could be implemented in the CPLD or FPGA to switch back to user mode:
Code:
RTI         = $40                            /* RTI opcode */
OpcodeFetch = VDA & VPA                      /* detect opcode fetch cycle */
UserMode    = OpcodeFetch & (D0...7 = RTI)   /* set user mode if executing RTI */

Logical AND is represented by the & symbol. The 65C816 indicates that it is fetching an opcode when the expression VDA (valid data address) & VPA (valid program address) is true.
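For anyone who wants to play with the idea before committing it to a CPLD, here is a small behavioral model in Python. It is a sketch under my own assumptions: the class and method names are invented, the latch is assumed to come out of reset in supervisor mode, and VPB is passed in as "asserted" rather than as a raw pin level.

```python
RTI = 0x40  # RTI opcode


class ModeLatch:
    """Behavioral model of the user/supervisor mode latch."""

    def __init__(self) -> None:
        # Assumption: the kernel owns the machine after reset.
        self.supervisor = True

    def cycle(self, vda: bool, vpa: bool, vpb_asserted: bool, data: int) -> bool:
        """Evaluate one bus cycle and return True while in supervisor mode."""
        opcode_fetch = vda and vpa
        if vpb_asserted:
            # Vector pull: an interrupt is being taken.
            self.supervisor = True
        elif opcode_fetch and data == RTI:
            # RTI opcode fetched: drop back to user mode.
            self.supervisor = False
        return self.supervisor


latch = ModeLatch()
print(latch.cycle(True, True, False, RTI))    # False (RTI fetched, now user mode)
print(latch.cycle(True, False, True, 0x00))   # True  (vector pull)
print(latch.cycle(False, True, False, RTI))   # True  ($40 as an operand is ignored)
```

Note that a latch this simple knows nothing about nesting: an RTI executed by a nested ISR would drop the model into user mode while the outer ISR is still running, so nested interrupts would need extra state.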
- Implementation of privileged instructions
As noted above, switching from supervisor to user mode is accomplished by watching for the execution of an RTI instruction. As this requires the presence of a mechanism to read the data bus during the opcode fetch cycle we can use the same mechanism to police general instruction usage and trigger an abort if an illegal instruction is fetched. To do so requires a list of instruction opcodes that are to be watched and some rules under which instruction usage is or is not allowed. Execution of the following instructions should be completely prohibited:
- STP
- XCE
Execution of the following instructions should be prohibited in user mode:
- RTI
- SEI
- WAI
As the CPLD or FPGA responsible for implementing system logic would "know" which mode is in use, the additional logic required to enforce the above rules would be straightforward:
Code:
/* "never execute" instructions... */

STP = $DB
XCE = $FB

/* "supervisor mode only" instructions... */

RTI = $40
SEI = $78
WAI = $CB

/* trigger an abort if illegal instruction is executed... */

ABORT = ((STP | XCE) | ((SEI | WAI) & UserMode)) & OpcodeFetch & Ø2

Logical OR is represented by the | (UNIX pipe) symbol.
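The ABORT equation can be exercised in software before it goes anywhere near silicon. The following Python function is a sketch that mirrors the equation term for term; the function name and the set names are mine, and as in the equation, RTI is defined in the opcode list but does not appear in the abort term.

```python
STP, XCE = 0xDB, 0xFB          # "never execute" opcodes
SEI, WAI = 0x78, 0xCB          # "supervisor mode only" opcodes in the ABORT term

NEVER_EXECUTE = {STP, XCE}
SUPERVISOR_ONLY = {SEI, WAI}


def abort(opcode: int, user_mode: bool, vda: bool, vpa: bool, phi2: bool) -> bool:
    """Return True when the policing logic would assert ABORT."""
    opcode_fetch = vda and vpa
    illegal = opcode in NEVER_EXECUTE or (opcode in SUPERVISOR_ONLY and user_mode)
    return bool(illegal and opcode_fetch and phi2)


print(abort(STP, False, True, True, True))   # True:  STP is never allowed
print(abort(SEI, True, True, True, True))    # True:  SEI in user mode
print(abort(SEI, False, True, True, True))   # False: SEI in supervisor mode
print(abort(STP, True, False, True, True))   # False: $DB seen as an operand byte
```

The last case is the reason for the OpcodeFetch qualifier: a data or operand byte that happens to equal a policed opcode must not trigger anything.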
A note on aborting instructions. ABORTB is a level-sensitive input, which implies that faulty logic timing could cause the aborted instruction to not abort and thus modify registers or memory. Bad timing could also cause a double abort, that is, the abort ISR itself could be aborted, with undefined consequences. Therefore the above abort logic is qualified by the OpcodeFetch intermediate value that was defined in an earlier logic statement, as well as by the Ø2 clock. OpcodeFetch will be true only during the first cycle of instruction execution, which means ABORT will automatically return to the false state when the MPU moves to the next step in the instruction sequence. As all instructions require a minimum of two clock cycles to execute, it is feasible for really fast logic (10ns pin-to-pin, which many CPLDs can manage) to abort during the first cycle, since the MPU's operation will always complete an instruction before acknowledging an interrupt.
For example, consider the case of an attempt to execute STP, which is forbidden under any circumstances. With instruction policing logic, the MPU's actions would be as follows:
Code:
CYCLE  VDA  VPA  OpcodeFetch  MPU ACTION               ABORTB
----------------------------------------------------------------
  1     1    1   true         opcode fetch ($DB)       true
  2     0    0   false        internal operation       false
  3     0    0   false        halts pending a reset    false
  1     1    1   true         start of interrupt       false
----------------------------------------------------------------
The above won't work, however—I'll explain in a bit.
VDA and VPA both go true during the first cycle, indicating that the opcode is being fetched, this occurring approximately 20ns after the fall of Ø2 at 20 MHz. Assuming system logic has a 10ns propagation time, ABORTB would be asserted (negated) no more than approximately 10ns after the rise of Ø2 (since ABORT is qualified by Ø2 high), which satisfies the 65C816's requirements for ABORTB setup timing. VDA and VPA will go false shortly after the fall of Ø2, and within 10ns of the fall of Ø2 ABORTB will be deasserted, which would be during the beginning of cycle two of the instruction, again satisfying the data sheet's recommendations. In theory, following the completion of cycle three, which is when the 65C816 would normally stop its internal clock and halt processing, the effect of the abort would occur, and instead of halting, the MPU would process the abort interrupt.
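The arithmetic behind those numbers is easy to lose track of, so here it is as a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a 50% duty cycle on Ø2, treats the cycle as starting at the fall of Ø2, and uses a single lumped logic delay. The function and key names are invented for the illustration and none of the figures come from the data sheet.

```python
def abort_timing(phi2_mhz: float, logic_delay_ns: float) -> dict:
    """Rough ABORTB timing for one opcode fetch cycle, in nanoseconds."""
    period = 1000.0 / phi2_mhz                   # one full Ø2 cycle
    phi2_rise = period / 2                       # assumes 50% duty cycle
    asserted_at = phi2_rise + logic_delay_ns     # ABORT is qualified by Ø2 high
    released_at = period + logic_delay_ns        # Ø2 falls, logic delay, release
    return {
        "period": period,
        "asserted_at": asserted_at,
        "asserted_during_phi2_high": period - asserted_at,
        "released_into_next_cycle": released_at - period,
    }


timing = abort_timing(20, 10)
print(timing["period"])                      # 50.0
print(timing["asserted_at"])                 # 35.0
print(timing["asserted_during_phi2_high"])   # 15.0
print(timing["released_into_next_cycle"])    # 10.0
```

Under these assumptions, ABORTB would be held for roughly 15ns of Ø2 high time at 20 MHz and released about 10ns into the following cycle.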
Unfortunately, an abort interrupt doesn't actually abort an instruction, as was determined by simulation at WDC. It merely causes any computed results from that instruction to be discarded. Hence an abort interrupt will not abort STP or WAI, which implies that blocking the execution of these instructions isn't possible by any apparent means. There is a way, however, to trick the MPU into thinking that the instruction is something other than STP...
- Generating a bank address
As previously noted, the A16-A23 address component has to be generated by external logic, since the 65C816 uses the data bus (D0-D7) to emit a bank address during Ø2 low. Practically speaking, D0-D7 would drive eight latches (shown as a 74xx573 or 74xx373 in WDC's reference circuit on page 46 in the data sheet) that would be open while the expression:

Code:
(VDA | VPA) & !Ø2

was true, where ! indicates logical NOT. As soon as Ø2 goes high the latches would close and retain the bit pattern that had been present on D0-D7.
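A transparent latch is simple enough to model in a few lines of Python. The sketch below assumes the latch enable is (VDA | VPA) & !Ø2, which matches the clock term used in the WinCUPL listing later in this post; the class and method names are my own.

```python
class BankLatch:
    """Transparent latch that captures A16-A23 from the multiplexed data bus."""

    def __init__(self) -> None:
        self.q = 0x00                       # latched bank address

    def update(self, phi2: bool, vda: bool, vpa: bool, dbus: int) -> int:
        """Present one bus state to the latch and return its output."""
        if (vda or vpa) and not phi2:
            self.q = dbus & 0xFF            # latch open: output follows D0-D7
        return self.q                       # latch closed: output holds


latch = BankLatch()
print(hex(latch.update(False, True, True, 0x12)))   # 0x12 (Ø2 low, bank on bus)
print(hex(latch.update(True, True, True, 0xEA)))    # 0x12 (Ø2 high, data on bus)
print(hex(latch.update(False, False, False, 0x55))) # 0x12 (invalid cycle ignored)
```

The second call is the important one: once Ø2 is high the bus carries data (here an opcode), and the latch must keep presenting the bank it captured earlier.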
The outputs of the latches would become A16-A23, and actual chip selects would be subject to further decoding, using logic similar to that of a 3-to-8 discrete decoder. Using one of Garth's 4 MB DIMMs as an example, A16-A18 from the CPLD/FPGA bank latches would directly drive the corresponding inputs on the DIMM (pins 33, 23 and 18/29, respectively). A19-A21 would be used to select one of the eight SRAMs on the DIMM according to the following table:
Code:
BANK    A21  A20  A19    /CE7  /CE6  /CE5  /CE4  /CE3  /CE2  /CE1  /CE0
----------------------------------------------------------------------
00-07    0    0    0       0     0     0     0     0     0     0     1
08-0F    0    0    1       0     0     0     0     0     0     1     0
10-17    0    1    0       0     0     0     0     0     1     0     0
18-1F    0    1    1       0     0     0     0     1     0     0     0
20-27    1    0    0       0     0     0     1     0     0     0     0
28-2F    1    0    1       0     0     1     0     0     0     0     0
30-37    1    1    0       0     1     0     0     0     0     0     0
38-3F    1    1    1       1     0     0     0     0     0     0     0
----------------------------------------------------------------------
The above table would map the DIMM into the range $000000-$3FFFFF (banks $00-$3F).
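The table boils down to a shift and a mask. Here is a Python sketch of the same decode, assuming a single 4 MB DIMM made up of eight 512KB SRAMs as described above; the function name and return convention are hypothetical.

```python
def decode_bank(bank: int):
    """Map a latched bank byte to (SRAM number, A18-A16 value).

    Returns None for banks outside the single 4 MB DIMM ($00-$3F).
    """
    if not 0x00 <= bank <= 0x3F:
        return None
    sram = (bank >> 3) & 0x07       # A21-A19 pick one of eight SRAMs
    a18_a16 = bank & 0x07           # A18-A16 go straight to the DIMM
    return sram, a18_a16


print(decode_bank(0x00))   # (0, 0)
print(decode_bank(0x0F))   # (1, 7)
print(decode_bank(0x38))   # (7, 0)
print(decode_bank(0x40))   # None
```

Each SRAM covers eight consecutive banks, which is why the BANK column in the table steps in units of eight.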
It should be noted that correct operation of the above scheme is contingent on all memory cycles being qualified by VDA and VPA. The expression

Code:
!VDA & !VPA

occurs during the intermediate cycles of some instructions, especially those that use indexing. During that time, the state of A0-A15 is undefined and D0-D7 may contain values that are actually internal intermediate results as the MPU processes the instruction. Therefore, chip selects should never be asserted unless the expression:

Code:
VDA | VPA

is true.
Discrete logic using a standard 74xx138 decoder can be made to generate the DIMM chip selects. However, the cascading of logic (the A0-A2 select inputs on the 74xx138 would be driven by the Q-outputs of the 74xx573, which would also drive A16-A18 on the DIMM) would set a hard limit on the maximum speed at which the system could run, even if using 74ABT and/or 74AC devices. At 20MHz, propagation delays would be such that selection wouldn't occur until after Ø2 had gone high, leaving the SRAM hardware little time to respond to selection. Needless to say, this sort of thing is best implemented in a CPLD or FPGA.
I spent some time with this in Atmel's WinCUPL and came up with a set of logic equations that will work on their 15xx series of CPLDs (I'm going to use the 1508as). Here's the salient part of the code:
Code:
/* * * * * * * * * * * *
 *   PIN ASSIGNMENTS   *
 * * * * * * * * * * * */

pin = !ABORT;     /* MPU ABORTB (output) */

pin = A0;         /* address line $000001 (input) */
pin = A1;         /* address line $000002 (input) */
pin = A2;         /* address line $000004 (input) */
pin = A3;         /* address line $000008 (input) */
pin = A4;         /* address line $000010 (input) */
pin = A8;         /* address line $000100 (input) */
pin = A9;         /* address line $000200 (input) */
pin = A10;        /* address line $000400 (input) */
pin = A11;        /* address line $000800 (input) */
pin = A12;        /* address line $001000 (input) */
pin = A13;        /* address line $002000 (input) */
pin = A14;        /* address line $004000 (input) */
pin = A15;        /* address line $008000 (input) */

pin = A16;        /* address line $010000 (output) */
pin = A17;        /* address line $020000 (output) */
pin = A18;        /* address line $040000 (output) */

pin = D0;         /* data line $01 (input/output) */
pin = D1;         /* data line $02 (input/output) */
pin = D2;         /* data line $04 (input/output) */
pin = D3;         /* data line $08 (input/output) */
pin = D4;         /* data line $10 (input/output) */
pin = D5;         /* data line $20 (input/output) */
pin = D6;         /* data line $40 (input/output) */
pin = D7;         /* data line $80 (input/output) */

pin = !EPCE;      /* ROM chip select (output) */
pin = EWS;        /* low = add a wait-state (input) */

pin = !DS0;       /* device select (output) */
pin = !DS1;       /* device select (output) */
pin = !DS2;       /* device select (output) */
pin = !DS3;       /* device select (output) */
pin = !DS4;       /* device select (output) */
pin = !DS5;       /* device select (output) */
pin = !DS6;       /* device select (output) */
pin = !DS7;       /* device select (output) */

pin = IRQ0;       /* device interrupt (input) */
pin = IRQ1;       /* device interrupt (input) */
pin = IRQ2;       /* device interrupt (input) */
pin = IRQ3;       /* device interrupt (input) */

pin = !IRQB;      /* MPU IRQB (output) */
pin 83 = PHI2;    /* system clock (input) */
pin = !RD;        /* read data (output) */
pin = RDY;        /* MPU RDYB line (input/output) */

pin = !D0RS0;     /* DIMM A RAM 0 select (output) */
pin = !D0RS1;     /* DIMM A RAM 1 select (output) */
pin = !D0RS2;     /* DIMM A RAM 2 select (output) */
pin = !D0RS3;     /* DIMM A RAM 3 select (output) */
pin = !D0RS4;     /* DIMM A RAM 4 select (output) */
pin = !D0RS5;     /* DIMM A RAM 5 select (output) */
pin = !D0RS6;     /* DIMM A RAM 6 select (output) */
pin = !D0RS7;     /* DIMM A RAM 7 select (output) */

pin 1 = RESET;    /* system reset (input) */
pin = RWB;        /* MPU RWB (input) */
pin = VDA;        /* MPU VDA (input) */
pin = VPA;        /* MPU VPA (input) */
pin = VPB;        /* MPU VPB (input) */
pin = !WD;        /* write data (output) */

/* * * * * * * * * * * * * * * * *
 *   BURIED LOGIC DECLARATIONS   *
 * * * * * * * * * * * * * * * * */

pinnode = [dffa16..23];            /* A16-A23 latches */

/* * * * * * * * * * * * * * * * *
 *   GLOBAL FIELD DECLARATIONS   *
 * * * * * * * * * * * * * * * * */

field a3_a0   = [A0..3];           /* address bits 0-3 */
field a11_a8  = [A8..11];          /* address bits 8-11 */
field a15_a12 = [A12..15];         /* address bits 12-15 */
field addrbus = [A0..A15];         /* address bits 0-15 */
field databus = [D0..7];           /* data bus bits 0-7 */
field extaddr = [dffa23..16];      /* address bits 16-23 */

/* * * * * * * * * * * * * * *
 *   GLOBAL CONTROL LOGIC    *
 * * * * * * * * * * * * * * */

vbus = (VDA # VPA) & RESET;        /* true = address bus valid */

/* * * * * * * * * * * * * * * * * * * *
 *   MEMORY ADDRESS TRANSLATION LOGIC  *
 * * * * * * * * * * * * * * * * * * * */

/* register resets... */

$REPEAT i = [0..7]
dffa{i+16}.ar = !RESET;
dffa{i+16}.ap = 'b'0;
$REPEND

/* bank latching logic... */

$REPEAT i = [0..7]
dffa{i+16}.ck = vbus & !PHI2;
$REPEND

$REPEAT i = [0..7]
dffa{i+16}.d = vbus & !PHI2 & D{i};
$REPEND

/* memory selection logic... */

dimmsel0 = !dffa23 & !dffa22;             /* DIMM 0 select */
dimmsel1 = !dffa23 &  dffa22;             /* DIMM 1 select */
dimmsel2 =  dffa23 & !dffa22;             /* DIMM 2 select */
dimmsel3 =  dffa23 &  dffa22;             /* DIMM 3 select */

sramsel0 = !dffa21 & !dffa20 & !dffa19;   /* SRAM 0 select */
sramsel1 = !dffa21 & !dffa20 &  dffa19;   /* SRAM 1 select */
sramsel2 = !dffa21 &  dffa20 & !dffa19;   /* SRAM 2 select */
sramsel3 = !dffa21 &  dffa20 &  dffa19;   /* SRAM 3 select */
sramsel4 =  dffa21 & !dffa20 & !dffa19;   /* SRAM 4 select */
sramsel5 =  dffa21 & !dffa20 &  dffa19;   /* SRAM 5 select */
sramsel6 =  dffa21 &  dffa20 & !dffa19;   /* SRAM 6 select */
sramsel7 =  dffa21 &  dffa20 &  dffa19;   /* SRAM 7 select */

/* A16-A18 outputs... */

A16 = dffa16 & vbus;                      /* address line $010000 */
A17 = dffa17 & vbus;                      /* address line $020000 */
A18 = dffa18 & vbus;                      /* address line $040000 */

/* SRAM selection outputs... */

D0RS0 = dimmsel0 & sramsel0 & vbus;       /* DIMM 0 SRAM 0 */
D0RS1 = dimmsel0 & sramsel1 & vbus;       /* DIMM 0 SRAM 1 */
D0RS2 = dimmsel0 & sramsel2 & vbus;       /* DIMM 0 SRAM 2 */
D0RS3 = dimmsel0 & sramsel3 & vbus;       /* DIMM 0 SRAM 3 */
D0RS4 = dimmsel0 & sramsel4 & vbus;       /* DIMM 0 SRAM 4 */
D0RS5 = dimmsel0 & sramsel5 & vbus;       /* DIMM 0 SRAM 5 */
D0RS6 = dimmsel0 & sramsel6 & vbus;       /* DIMM 0 SRAM 6 */
D0RS7 = dimmsel0 & sramsel7 & vbus;       /* DIMM 0 SRAM 7 */

The above selection code simulates as expected. Note that I have a hook in place for supporting up to four DIMMs.
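A decode like this lends itself to an exhaustive one-hot check, since there are only 256 possible bank values. The Python sketch below models the DIMM 0 selects, assuming vbus is true and that each sramselN term is the minterm of (A21, A20, A19) whose binary value is N. The function names are mine and this is a software cross-check, not a substitute for the WinCUPL simulator.

```python
def selects(bank: int) -> list[bool]:
    """Return the eight D0RSn select states for a latched bank byte."""
    a19 = (bank >> 3) & 1
    a20 = (bank >> 4) & 1
    a21 = (bank >> 5) & 1
    a22 = (bank >> 6) & 1
    a23 = (bank >> 7) & 1
    dimmsel0 = not a23 and not a22
    # sramsel[n] is true when (A21, A20, A19) spells out n in binary.
    sramsel = [
        (a21, a20, a19) == ((n >> 2) & 1, (n >> 1) & 1, n & 1)
        for n in range(8)
    ]
    return [bool(dimmsel0 and s) for s in sramsel]


def check_one_hot() -> bool:
    """Exactly one select for banks $00-$3F, none for any other bank."""
    for bank in range(256):
        active = selects(bank).count(True)
        expected = 1 if bank < 0x40 else 0
        if active != expected:
            return False
    return True


print(check_one_hot())  # True
```

If two product terms were ever accidentally identical, two SRAMs would be selected for the same range of banks and one SRAM would never be selected at all, which is exactly what this kind of check catches.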
- Allocating and protecting memory
The 65C816 has no understanding of memory boundaries other than 64KB banks, which complicates the allocation and protection of memory. Therefore, unless very sophisticated external logic is employed, the minimum amount of memory that can be allocated and protected is 64KB. Hardware that can emulate the page table methodology used to "sandbox" processes in other machine architectures is not easily developed by the average hobbyist and in fact, would require pretty intelligent silicon and probably some dedicated SRAM to store page table data. While not as efficient in memory utilization as a page table system, allocating memory by banks is not too difficult to implement with reasonable hardware.
In order for system logic to protect a bank from unauthorized access, it is necessary for it to know the following:
- Bank in which the currently running process is executing;
- Bank in which the currently running process is storing data;
- Execution mode of the currently running process.
We've already covered the establishment of user and supervisor modes, and the previous section described how a bank number would be latched. So the building blocks for a memory allocation and protection scheme have been defined. All that is needed is some logic (again in pseudo-code) to detect an improper access:
Code:
ILLEGAL_ACCESS = !SUPERVISOR_MODE
                 & (ACCESS_BANK != PROCESS_EXEC_BANK)
                 & (ACCESS_BANK != PROCESS_DATA_BANK)

where ! is logical NOT.
PROCESS_EXEC_BANK and PROCESS_DATA_BANK are implemented as eight bit registers set up in the CPLD/FPGA, and are updated when a context change occurs. ACCESS_BANK is the bank generated by the MPU on each read or write access during a valid memory cycle, and is latched as described in the previous section. Succinctly stated, a user-mode process has attempted an illegal access if it tries to read or write a bank other than its own data or program bank. In such a case, the hardware would abort the instruction.
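The rule translates directly into a one-line predicate. Here it is as a Python sketch, with argument names borrowed from the pseudo-code and bank numbers that are made up purely for the demonstration.

```python
def illegal_access(supervisor_mode: bool, access_bank: int,
                   process_exec_bank: int, process_data_bank: int) -> bool:
    """True if a user-mode process touches a bank that is not its own."""
    return (not supervisor_mode
            and access_bank != process_exec_bank
            and access_bank != process_data_bank)


# Hypothetical process: code in bank $05, data in bank $06.
print(illegal_access(False, 0x06, 0x05, 0x06))   # False: own data bank
print(illegal_access(False, 0x07, 0x05, 0x06))   # True:  someone else's bank
print(illegal_access(True, 0x07, 0x05, 0x06))    # False: supervisor may go anywhere
print(illegal_access(False, 0x00, 0x05, 0x06))   # True:  stack or direct page access
```

The last case shows the catch with applying the rule as is: every push, pull and direct page access goes to bank $00, so a user-mode process would be flagged the first time it touches its own stack.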
An observant reader will notice that the above rule would cause an error during an access to direct page or the stack. Obviously, something has to be done to prevent what should be a valid access from triggering an abort.
- Remapping bank $00
As previously noted, direct page and stack accesses are always directed to bank $00 and hence create a possible source of inter-process conflict, as well as the possibility of an illegal memory access error while running in user mode. It makes sense to arrange the system logic so that it is possible for a reference to a bank $00 address to be remapped to the same address in the user mode process' bank. Therefore an illegal memory access error during direct page and stack operations won't occur, as the process will be addressing its own bank.
The principle is quite simple when in user mode:
Code:
IF ACCESS_BANK == $00 | ACCESS_BANK == PROCESS_EXEC_BANK | ACCESS_BANK == PROCESS_DATA_BANK
   EFFECTIVE_BANK = ACCESS_BANK
ELSE
   ILLEGAL_ACCESS
ENDIF

A complication arises when an interrupt occurs, in that the MPU forces bank $00 regardless of the current value in PB. There are four possible solutions to this dilemma:
1. Continue to map bank $00 to the in-context program bank and expose a small ROM image of the interrupt service routines (ISR) at $FF00 when an interrupt is detected so the MPU has a valid vector through which it can jump. The code at $FF00 would be sufficient to push the MPU registers to the in-context stack and then long-jump (JML) to the body of the relevant ISR, wherever it might reside.
2. Continue to map bank $00 to the in-context bank, but maintain a copy of the ISR front ends in RAM at $FF00, which copies would have to appear in all banks in which code execution might occur. Behavior, other than mapping in ROM, would be as in solution number 1.
3. Continue to map bank $00 to the in-context bank, but make ISR front end code at $00FF00 appear in place of what is at that address in the in-context bank. That code could push the balance of the registers to the in-context stack and then long-jump to the rest of the ISR.
4. Automatically switch off bank $00 remapping when an interrupt is detected, thus exposing ISR front end code in bank $00.
Pros and cons for each solution are:
1. This solution requires that system logic make a piece of ROM appear each time an interrupt occurs, which gives rise to some potentially difficult timing issues, e.g., ROM not appearing fast enough to present a valid address when the MPU loads the interrupt vector. Wait-stating would be required in some cases to accommodate the relatively slow ROM to an MPU running at a high Ø2 clock rate, with a predictable degradation in interrupt response performance.
2. This solution will cause breaks throughout large sections of memory, preventing two or more contiguous 64KB segments from being treated as a single chunk of RAM for data storage.
3. This solution would require that complex system logic be written with a special-case rule that would make the effective bank be $00 only while an interrupt is in progress and an opcode or operand fetch is in progress, i.e., if the expression:
   IS_INTERRUPT & ((VDA & VPA) | (!VDA & VPA))
   is true.
4. This solution would create the awkward situation where PB, PC and SR are pushed to the stack of the in-context (interrupted) process but subsequent pushes to save the registers push them to a stack in physical bank $00, presumably that of the kernel. This "split stack" arrangement could complicate the restoration of the MPU's state upon completion of interrupt servicing.
Solution number 3 seems to be the most elegant but also the most difficult to implement. In any case, execution of SEI would automatically restore bank $00 remapping.
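To get a feel for how remapping and the simplest of the four solutions (number 4, switching remapping off during an interrupt) would interact, here is a Python sketch. Everything in it is an assumption made for illustration: the names are invented, and because the discussion above doesn't pin down which of the process' banks receives remapped bank $00 traffic, the target is passed in as an explicit home_bank parameter.

```python
class IllegalAccess(Exception):
    """Raised where the hardware would abort the instruction."""


def effective_bank(access_bank: int, exec_bank: int, data_bank: int,
                   home_bank: int, supervisor: bool) -> int:
    """Return the physical bank for an access, or raise IllegalAccess."""
    if supervisor:
        return access_bank              # remapping off: bank $00 really is bank $00
    if access_bank == 0x00:
        return home_bank                # direct page and stack land in the process' bank
    if access_bank in (exec_bank, data_bank):
        return access_bank
    raise IllegalAccess(f"bank ${access_bank:02X} not owned by process")


# Hypothetical process: code in $05, data in $06, stack and direct page in $06.
print(hex(effective_bank(0x00, 0x05, 0x06, 0x06, supervisor=False)))  # 0x6
print(hex(effective_bank(0x00, 0x05, 0x06, 0x06, supervisor=True)))   # 0x0
```

The two printed results show the "split stack" issue in miniature: the same bank $00 stack address resolves to bank $06 while the process runs and to physical bank $00 once supervisor mode has been latched.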
—————————————————————————————————————————————————————————————————————— EDIT: I clarified what I meant by "an instruction that acts upon direct page will always force the bank address to $00."
_________________ x86? We ain't got no x86. We don't NEED no stinking x86!
Last edited by BigDumbDinosaur on Mon Apr 03, 2017 8:44 pm, edited 1 time in total.