MMU units for 6502 and 65816

fachat · Post by **fachat** » Sun Jun 25, 2023 7:03 pm

Using NMIs for regular interrupts is a no-no if you have timing-sensitive I/O operations ongoing.

For example, I switch off NMIs for the C64 9600 baud userport interface every time I need to access the floppy disk via that (stupidly slow, but that's another story) timing-sensitive IEC disk interface of the Commodore C64.

There are some heavy ifs in your statements as I read it. IF the timer is the only NMI source AND IF the MMU disallows NMIs during kernel (and device drivers?) - trying to get that management right is a complex task.

What do you gain? You might catch a hanging process? But that process is only hanging if it executes an SEI to avoid context switches. Otherwise the kernel can jump in and monitor the process (e.g. see if it continuously executes the same addresses on every interrupt).

I'd rather have the MMU break out of the task if it executes an SEI instead, as, for userspace programs, this should be an illegal instruction. That's probably the only useful source/reason for an NMI I can imagine right now (except for extremely timing-critical device drivers like the one mentioned above).

Proxy · Post by **Proxy** » Sun Jun 25, 2023 8:08 pm

hmm, i still think using NMI is much much easier to implement than snooping the bus for illegal instructions.

plus the way i would design device drivers (ie calling them from the OS while in supervisor mode) means you would never get an NMI during an IO operation.
and of course there are a lot of "ifs", since the 65816 has no standard hardware a lot of software decisions (like which interrupt to use for a timer) depend entirely on the design of the hardware.

though that makes me think, what if you were to use both?

let's say you also use the CPLD/FPGA MMU as a priority interrupt encoder which goes to the CPU's IRQ pin. with the timer having the highest priority.
every time the timer interrupt fires the MMU sends that to the CPU and increments an internal counter. this counter gets reset whenever the CPU writes to one of the segment registers, meaning it only resets when the OS switches tasks.

this means if a task has disabled interrupts either by mistake or on purpose to avoid being switched out, the MMU would detect that via the counter and send an NMI after x amount of missed timer interrupts.
at that point the OS has control again and can terminate the task or similar.
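
A rough behavioural sketch of that counter, written in Python rather than HDL just to pin down the logic. The class and method names are made up for illustration, and `missed_limit` stands in for the unspecified "x" above; none of this comes from an actual CPLD design.

```python
class MissedTickWatchdog:
    """Behavioural model of the MMU counter; all names here are hypothetical."""

    def __init__(self, missed_limit):
        # "x" in the post: how many unanswered timer ticks are tolerated.
        self.missed_limit = missed_limit
        self.count = 0

    def timer_tick(self):
        """Timer fires: it is forwarded as an IRQ and counted.

        Returns True when the MMU should assert NMI.
        """
        self.count += 1
        # The tick that just fired is not "missed" yet, so missed = count - 1.
        return (self.count - 1) >= self.missed_limit

    def segment_register_write(self):
        """The OS switched tasks, so the earlier ticks were evidently serviced."""
        self.count = 0
```

One thing the model makes visible: the counter only clears on a segment register write, so under this scheme the OS would have to rewrite a segment register on every serviced tick, even when it decides to resume the same task.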

sure it's more complicated than just using NMI by itself, but still seems easier than snooping the bus and comparing each opcode fetch to a list of instructions that should be ABORT'ed.

or you just trust user programs like on the Amiga. which means system crashes will be possible from a user task, but at the same time what do you lose besides a bit of time?

gfoot · Post by **gfoot** » Thu Dec 07, 2023 12:11 am

fachat wrote:

There are some heavy ifs in your statements as I read it. IF the timer is the only NMI source AND IF the MMU disallows NMIs during kernel (and device drivers?) - trying to get that management right is a complex task.

What do you gain? You might catch a hanging process? But that process is only hanging if it executes an SEI to avoid context switches. Otherwise the kernel can jump in and monitor the process (e.g. see if it continuously executes the same addresses on every interrupt).

I'd rather have the MMU break out of the task if it executes an SEI instead, as, for userspace programs, this should be an illegal instruction. That's probably the only useful source/reason for an NMI I can imagine right now (except for extremely timing-critical device drivers like the one mentioned above).

So I somehow missed this thread when it was resurrected earlier this year - the multitasking system I started designing this week shares a lot in common with these discussions.

I had been planning to use NMI to force processes to yield if they were otherwise not cooperating, regardless of whether they set the I flag, but last time I tried to do this I ran into problems with NMIs potentially occurring while in supervisor mode. Generally on returning from any interrupt I want to also leave supervisor mode, but if NMIs can occur in supervisor mode then that's incorrect. I had planned to just keep the jiffy timer reset while in supervisor mode, but investigating how the modern 65C02 handles overlapping NMIs and IRQs confirms that if an NMI occurs during an IRQ sequence then the NMI will be latched and processed after the IRQ sequence. The NMI transition itself could have occurred several cycles before the BRK/IRQ sequence began, so it is not trivial to deal with this case.

One possible option to allow use of NMI for this is to arrange for the NMI vector to point to an RTI instruction, if the NMI occurred during supervisor mode. This is... hairy. I'm trying to avoid that level of bus snooping/indirection in my project, but it could work for others I guess. Making the NMI vector be odd in general, and get decremented by one if the system was in supervisor mode at the time the NMI vector is fetched, would allow you to put an RTI instruction just before the real NMI handler, and have that execute harmlessly and automatically.
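
To make the odd-vector trick concrete, here is a small Python model of what the CPU would see on the vector fetch. The function names and the example address are hypothetical; the only hard fact used is that RTI is opcode $40.

```python
RTI_OPCODE = 0x40  # 65xx RTI


def vector_seen_by_cpu(programmed_vector, supervisor):
    """Address the CPU receives when it fetches the NMI vector."""
    if (programmed_vector & 1) == 0:
        raise ValueError("the scheme relies on the NMI vector being odd")
    # For an odd address, "decrement by one" is the same as clearing bit 0.
    return (programmed_vector & ~1) if supervisor else programmed_vector


def install_nmi_handler(memory, programmed_vector):
    """Place the guard RTI in the byte immediately before the real handler."""
    memory[programmed_vector - 1] = RTI_OPCODE
```

With a handler installed at, say, $8001, a user-mode NMI enters the real handler at $8001, while a supervisor-mode NMI lands on the RTI at $8000 and returns straight away.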

I've also considered arranging to latch the supervisor flag in hardware when the NMI vector is fetched, and allow the NMI handler to read that state back and let the software decide to RTI straight away if it was the supervisor that was interrupted. Again, this would only ever happen on the very first instruction following an IRQ, but it would add some latency to that. Yet another option is to use a hardware stack (bidirectional shift register) to track the state of the super bit before any interrupt - so on vector fetch, the super bit is shifted into the register, and on RTI the register is shifted back again, with the outgoing bit being used to determine whether the system remains in supervisor mode or not after the RTI. It would have limited depth, but would only need two bits of storage anyway unless NMIs get nested a lot.
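
The shift-register idea can be sketched as a tiny Python model. Two things here are assumptions rather than anything stated above: that every vector fetch enters supervisor mode, and that a zero (user mode) is shifted in at the far end when the register is popped. The names are hypothetical.

```python
class SuperBitStack:
    """Model of a bidirectional shift register tracking the supervisor bit."""

    def __init__(self, depth=2):
        self.bits = [False] * depth  # index 0 holds the most recently saved bit
        self.supervisor = False      # start in user mode for the demonstration

    def vector_fetch(self):
        """Interrupt vector fetched: save the current bit, enter supervisor mode."""
        # Shift in at the front; the oldest bit falls off the end (limited depth).
        self.bits = [self.supervisor] + self.bits[:-1]
        self.supervisor = True  # assumption: all interrupts enter supervisor mode

    def rti(self):
        """RTI executed: the outgoing bit decides the mode after the return."""
        self.supervisor = self.bits[0]
        self.bits = self.bits[1:] + [False]  # assumption: user mode shifts in
```

For example, a user process taking an IRQ followed by an NMI that lands inside the IRQ handler gives two `vector_fetch()` calls; the first `rti()` stays in supervisor mode and only the second drops back to user mode.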

Instead though I am revisiting my reasons for using NMI for this. The main one was so that user processes setting the I flag wouldn't block the system up; another was that the edge-triggered nature of NMI is appealing for this purpose. Focusing on the former though, there are still other ways user processes could block the system, e.g. STP which has been mentioned before, along with quite a few 816-specific instructions you might want to disable. I see the value in having a mechanism to do this, but also don't want to do that for my project, so am considering other options.

The main option that strikes me is that if I just want to terminate the badly-behaving process, then why not trigger a reset at that point? This is similar to Proxy's suggestion, but using RESB instead of NMIB, the advantage being that RESB can interrupt STP. As he suggested, this wouldn't be used as a general-purpose pre-empting mechanism - you'd use a timer driving IRQ for that, and reserve this reset behaviour for a fallback watchdog timer. I think it could work very well - any process that disables interrupts or executes a STP would just get killed, a little inefficiently but it should be rare.

There is potentially some use for user processes to be able to disable interrupts - one example is atomic access to a multi-byte quantity, perhaps something that is updated by an interrupt or another process. The more permissive model I'm suggesting here - allowing SEI so long as it's not left set for too long - would permit this kind of behaviour without harming the overall system much. It also means we can allow PLP without having to do even more bus snooping to ensure that the I bit is clear.

The missing element with RESB is that the program counter and status register are not pushed to the stack. But that's fine, because we're killing the whole process anyway, we don't care what it was doing at the time.

And of course it would be necessary to have a mechanism for differentiating between a power-on reset, and a process-kill reset. There are various options for that - the BBC Micro uses the state of a VIA register (IER) to detect this case, with that VIA having an electrically-separate, power-on-only reset signal from the one supplied to the CPU and other system components. But any latch in the system that can be reset on power-on but not reset by this process-kill mechanism, would do the trick.
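
As a minimal sketch of that latch idea (hypothetical names, Python standing in for both the hardware latch and the first few instructions of the reset handler):

```python
class ResetLatch:
    """A one-bit latch that only the power-on reset can clear."""

    def __init__(self):
        self.warm = False  # state straight after power-on

    def power_on_reset(self):
        self.warm = False  # the power-on-only reset line clears the latch

    def watchdog_reset(self):
        pass  # deliberately not wired to the latch, so its state survives

    def kernel_boot_complete(self):
        self.warm = True  # the kernel sets the latch once it has initialised


def classify_reset(latch):
    """What the reset handler would check before doing anything else."""
    return "process-kill" if latch.warm else "power-on"
```

After a power-on the handler sees "power-on" and boots normally; once the kernel has set the latch, any later watchdog reset is seen as "process-kill".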

So at least within the scope of what I want from an MMU, this seems like it might be a reasonable solution instead of bus snooping. Are there any glaring problems with it - any other instructions that would need to be prevented?

BigDumbDinosaur · Post by **BigDumbDinosaur** » Thu Dec 07, 2023 8:15 pm

gfoot wrote:

So I somehow missed this thread when it was resurrected earlier this year - the multitasking system I started designing this week shares a lot in common with these discussions.

This topic has been running for nearly 20 years.

It seems to be stuck in a metaphoric loop.

In any case, running a multitasking environment on the NMOS 6502 is mostly an exercise in futility, in my opinion. It can be done with the 65C02 (I’ve been there and done that), but efforts to do so will be hampered by the C02’s inability to restart an instruction that had to be aborted due to a page fault, access violation, etc.

Although not designed for use in a preemptive, multitasking environment, any effort expended in setting up such an arrangement would be far more profitable with the 65C816, mainly because it has an ABORT interrupt, as well as features that make it easier for system logic to know what is happening at any given instant. For example, monitoring VDA and VPA will always tell logic where the 816 is in the instruction cycle. You don’t have that with the 65C02, whose SYNC output only tells you when an instruction opcode is being fetched. That won’t help in trying to police for instructions that would “touch” memory outside of the allowed address range for the particular program that is currently executing.

Quote:

I had been planning to use NMI to force processes to yield if they were otherwise not cooperating...

Cooperative “multitasking” is not multitasking at all. Forcing a process to yield is the job of an IRQ triggered by a free-running timer. Old DOS-based versions of Microsoft Window$ used cooperative multitasking and were noted for the ease with which a single application could crash the machine.

Your main concern would be what happens if a user-land process disables IRQs. That’s something you could address with a watchdog timer wired to NMI. Within your IRQ handler (not the NMI handler), you’d have a code snippet to reset the watchdog on each IRQ. If IRQ processing ceases, such as due to an application executing SEI, the watchdog would eventually time out and force an NMI. The NMI handler’s job would be to re-enable IRQs by rewriting the stack copy of SR (status register) that was pushed by the MPU when it acknowledged the NMI. Upon RTIing out of the NMI handler, the i bit in SR will be cleared and IRQ processing will resume. With a 100 Hz jiffy IRQ, for example, you could set the watchdog to time out in 100-or-so milliseconds, minimizing the risk of deadlock. Presumably, your IRQ handler won’t take 100ms to complete.
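
To illustrate the stack manipulation, here is a Python model of the relevant bytes in page one, not real handler code. The function names are made up; the facts relied on are that the MPU pushes PCH, PCL and then SR on interrupt entry, and that the i flag is bit 2 of SR.

```python
I_FLAG = 0x04        # bit 2 of the 65xx status register: IRQ disable
STACK_BASE = 0x0100  # the 6502/65C02 stack lives in page one


def interrupt_entry(mem, sp, pc, status):
    """Model the MPU's interrupt sequence: push PCH, PCL, then SR."""
    for value in (pc >> 8, pc & 0xFF, status):
        mem[STACK_BASE + sp] = value
        sp = (sp - 1) & 0xFF
    return sp


def nmi_handler(mem, sp):
    """Watchdog NMI handler body: clear i in the stacked copy of SR only."""
    # On entry S points one below the last byte pushed, so the stacked SR
    # sits at STACK_BASE + S + 1.
    mem[STACK_BASE + ((sp + 1) & 0xFF)] &= ~I_FLAG & 0xFF


def rti(mem, sp):
    """Model RTI: pull SR, then PCL, then PCH."""
    pulled = []
    for _ in range(3):
        sp = (sp + 1) & 0xFF
        pulled.append(mem[STACK_BASE + sp])
    status, pcl, pch = pulled
    return sp, (pch << 8) | pcl, status
```

A real handler that saves registers before touching the stack would need to add the number of bytes it pushed to that offset.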

NMI in the 65xx family is edge-sensitive, which means a wired-OR NMI circuit can effectively deadlock the machine should one of the devices wired to NMI go low while the MPU is servicing a previous NMI and the NMI code doesn’t recheck every possible NMI source before returning to the foreground. Even there, a potential exists for an NMI to sneak in one cycle after RTI has been fetched and thus get missed. For this reason, I strongly advise against using NMI for routine process scheduling.

I should note that in all the years I’ve worked with this stuff (over 50, at this point), I have never seen non-maskable interrupts used for anything other than responding to a single high-priority event. The minis I used to work with monitored NMI to recognize when a “dumb” UPS had detected power failure so an orderly shutdown could be commenced. That was the only connection to the NMI circuitry.

In my POC units, I have a debounced push button wired to NMI to interrupt a runaway program and give control to the M/L monitor. That is the only use for NMI.

Quote:

...investigating how the modern 65C02 handles overlapping NMIs and IRQs confirms that if an NMI occurs during an IRQ sequence then the NMI will be latched and processed after the IRQ sequence. The NMI transition itself could have occurred several cycles before the BRK/IRQ sequence began, so it is not trivial to deal with this case.

And then there is the case in which an NMI and IRQ simultaneously hit while the MPU is fetching BRK. So much for maintaining control over interrupt priority!

Quote:

One possible option to allow use of NMI for this is to arrange for the NMI vector to point to an RTI instruction, if the NMI occurred during supervisor mode. This is... hairy. I’m trying to avoid that level of bus snooping/indirection in my project, but it could work for others I guess. Making the NMI vector be odd in general, and get decremented by one if the system was in supervisor mode at the time the NMI vector is fetched, would allow you to put an RTI instruction just before the real NMI handler, and have that execute harmlessly and automatically.

Nothing like adding some complication where none is needed.

Quote:

Instead though I am revisiting my reasons for using NMI for this. The main one was so that user processes setting the I flag wouldn’t block the system up; another was that the edge-triggered nature of NMI is appealing for this purpose...I think it could work very well - any process that disables interrupts or executes a STP would just get killed, a little inefficiently but it should be rare.

See my above solution to handling an errant SEI instruction, or an LDA #%00000100 - PHA - PLP sequence setting i. It would be automatic, and if you have a means of identifying the offending process, you could also change the RTI address on the stack to return to somewhere in your kernel, instead of to the offender. Again, this becomes a whole lot easier with the 65C816, since it has easy-to-use stack-relative addressing modes.

Quote:

There is potentially some use for user processes to be able to disable interrupts - one example is atomic access to a multi-byte quantity, perhaps something that is updated by an interrupt or another process.

Why would that matter? If your preemptive, multitasking kernel is correctly written, a context change in such a situation should cause no problems whatsoever when your preempted process runs again.

Quote:

The more permissive model I’m suggesting here...

It is a fairly religious rule in an operating system intended to support multiple tasks that user-land processes are not allowed to tinker with interrupts (or directly touch any hardware). Any safety mechanism you might deploy would be there to prevent a user-land process from disabling IRQs and possibly bringing the system to a screeching halt.

Quote:

The missing element with RESB is that the program counter and status register are not pushed to the stack. But that’s fine, because we’re killing the whole process anyway, we don’t care what it was doing at the time. And of course it would be necessary to have a mechanism for differentiating between a power-on reset, and a process-kill reset.

The logical thing to do if you are going to try to regain control via a hard reset is to soft-vector reset to code that will fix up the stack pointer, make sure IRQs are enabled, and decimal arithmetic is cleared, and then re-enter the kernel at a designated point. The only significant problem I see with this approach will be in figuring out who the offender is so it can be removed from the kernel’s run queue.

Quote:

So at least within the scope of what I want from an MMU, this seems like it might be a reasonable solution instead of bus snooping. Are there any glaring problems with it - any other instructions that would need to be prevented?

Well, what you’ve proposed doesn’t appear to address the problems that could be caused by a user-land process touching hardware or accessing RAM areas that are off-limits. Any thoughts on that? Also, what do you plan to do about that pesky STP instruction? Furthermore, how do you propose to deal with a user-land program having the following instruction sequence?

Code:

          sei
          wai

pjdennis · Post by **pjdennis** » Thu Dec 07, 2023 11:57 pm

gfoot wrote:

The main option that strikes me is that if I just want to terminate the badly-behaving process, then why not trigger a reset at that point?

...

So at least within the scope of what I want from an MMU, this seems like it might be a reasonable solution instead of bus snooping. Are there any glaring problems with it - any other instructions that would need to be prevented?

This does seem like a simple solution that goes directly to the scenario that requires recovery (process has prevented control from returning to the kernel.) I'm guessing the watchdog timer duration would be a bit longer than the time slice duration, since the watchdog timer will be reset from within the kernel.

The main issue I could see is a negative impact on handling of time-sensitive interrupts. A one-off miss of e.g. a serial data byte transfer might not be a major problem, but a malicious process, depending on OS capabilities, might be able to create a denial of service situation by repeatedly spawning a child process that disables interrupts. Of course you might be more concerned about accidental programming issues than deliberate misbehavior, though I guess you could handle this scenario by recursively killing the parent user process of any process caught by the watchdog.

I can't think of any instructions other than sei, plp, stp and wai that would be problematic on the 65C02. I believe I saw on the other thread that you would keep hardware out of the memory map of user processes to avoid any direct interference.

gfoot · Post by **gfoot** » Fri Dec 08, 2023 5:14 am

BigDumbDinosaur wrote:

In any case, running a multitasking environment on the NMOS 6502 is mostly an exercise in futility, in my opinion. It can be done with the 65C02 (I’ve been there and done that), but efforts to do so will be hampered by the C02’s inability to restart an instruction that had to be aborted due to a page fault, access violation, etc.

I'm interested in why that's seen as so important. I can see that it's required for things like virtual memory, copy-on-write, and memory-mapped files - which are great features to support - but I think they are icing on the cake, and I'd be willing to accept that a 65C02 is not cut out for those features, at least not without a much more active MMU (e.g. a coprocessor standing by to fix things up mid-instruction, like Andre's system has).

BigDumbDinosaur wrote:

Although not designed for use in a preemptive, multitasking environment, any effort expended in setting up such an arrangement would be far more profitable with the 65C816, mainly because it has an ABORT interrupt, as well as features that make it easier for system logic to know what is happening at any given instant. For example, monitoring VDA and VPA will always tell logic where the 816 is in the instruction cycle. You don’t have that with the 65C02, whose SYNC output only tells you when an instruction opcode is being fetched. That won’t help in trying to police for instructions that would “touch” memory outside of the allowed address range for the particular program that is currently executing.

Early ARM CPUs had a similar system, and extensive documentation on exactly what the abort handler needed to do to "unpick" the instruction that was aborted. Most instructions didn't need any work, but some were quite thorny IIRC. I guess it was as much as they could afford to support in the CPU, but just enough that it was possible for the OS to pick up the pieces. It is harder with something like the 6502 instruction set where so many instructions are pretty much impossible to undo.

BigDumbDinosaur wrote:

Your main concern would be what happens if a user-land process disables IRQs. That’s something you could address with a watchdog timer wired to NMI. Within your IRQ handler (not the NMI handler), you’d have a code snippet to reset the watchdog on each IRQ. If IRQ processing ceases, such as due to an application executing SEI, the watchdog would eventually time out and force an NMI. The NMI handler’s job would be to re-enable IRQs by rewriting the stack copy of SR (status register) that was pushed by the MPU when it acknowledged the NMI. Upon RTIing out of the NMI handler, the i bit in SR will be cleared and IRQ processing will resume. With a 100 Hz jiffy IRQ, for example, you could set the watchdog to time out in 100-or-so milliseconds, minimizing the risk of deadlock. Presumably, your IRQ handler won’t take 100ms to complete.

Absolutely, I think we're on the same page - I was planning to hold the timer in reset whenever in supervisor mode. However, I hadn't thought of having the NMI handler simply reset the I bit to unblock the regular scheduler - I'd assumed it would also run the scheduler to select a new process itself. Just resetting the bit is much simpler in principle, and I like it, except that it doesn't help with the STP case. If we need to use resets for that, then I think we might as well also use resets for killing processes that have disabled interrupts for too long.

BigDumbDinosaur wrote:

Quote:

There is potentially some use for user processes to be able to disable interrupts - one example is atomic access to a multi-byte quantity, perhaps something that is updated by an interrupt or another process.

Why would that matter? If your preemptive, multitasking kernel is correctly written, a context change in such a situation should cause no problems whatsoever when your preempted process runs again.

The main case I had in mind was shared memory, but it's not something I've thought through a lot. I was considering the overheads of making system calls, and whether there might be some value in having some alternate interfaces available to user processes. For example, a shared page of memory containing read-only data about the system - e.g. elapsed time field, a list of keys currently held down, that sort of thing - updated by the kernel, so that user code can make some decisions based on this data, at least where there are no side-effects needed. In this case it's possible that an interrupt occurs leading to the elapsed time being updated, while the user process is halfway through reading the old value. Another case is where two processes are sharing a page of memory for IPC - given the lack of any atomic multi-byte operations in the 6502, it could be valuable to have a way to briefly prevent other processes running while you fetch some data. Again not something I've thought through very much.
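
The elapsed-time case can be shown with a short Python model of a 16-bit counter read one byte at a time. `read16_retry` is a common general-purpose workaround (re-read the high byte and retry if it changed) and not something proposed in this thread; it is included only to show that the read-only case does not strictly need SEI. All names are hypothetical.

```python
def make_tick(mem, addr):
    """Return a one-shot 'interrupt' that increments the 16-bit value at addr."""
    fired = [False]

    def tick():
        if not fired[0]:
            fired[0] = True
            value = (mem[addr] | (mem[addr + 1] << 8)) + 1
            mem[addr] = value & 0xFF
            mem[addr + 1] = (value >> 8) & 0xFF

    return tick


def read16_naive(mem, addr, interrupt=None):
    """Low byte, then high byte, with an interrupt allowed in between."""
    lo = mem[addr]
    if interrupt:
        interrupt()
    hi = mem[addr + 1]
    return lo | (hi << 8)


def read16_retry(mem, addr, interrupt=None):
    """High, low, high again; retry whenever the high byte changed."""
    while True:
        hi_before = mem[addr + 1]
        if interrupt:
            interrupt()
        lo = mem[addr]
        hi_after = mem[addr + 1]
        if hi_before == hi_after:
            return lo | (hi_before << 8)
```

Starting from $00FF with one tick landing mid-read, the naive version returns $01FF, a value the counter never held, while the retry version returns $0100. The two-process IPC case is different, since both sides may write, and this trick does not cover it.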

BigDumbDinosaur wrote:

The logical thing to do if you are going to try to regain control via a hard reset is to soft-vector reset to code that will fix up the stack pointer, make sure IRQs are enabled, and decimal arithmetic is cleared, and then re-enter the kernel at a designated point. The only significant problem I see with this approach will be in figuring out who the offender is so it can be removed from the kernel’s run queue.

That's a good point, if the vectors are in writable memory we can just update those, I think that could work really well in my current design. I don't think it would be hard to work out which process was running - it will be whatever process the scheduler last resumed, and we can store the process ID in kernel RAM.

BigDumbDinosaur wrote:

Well, what you’ve proposed doesn’t appear to address the problems that could be caused by a user-land process touching hardware or accessing RAM areas that are off-limits. Any thoughts on that?

User processes shouldn't have any access to I/O devices, because they shouldn't be in their memory map at all. My user processes only have paged RAM, across their entire address space. For logical pages that haven't been mapped to any RAM, I'm thinking of defaulting them to map reads to a page initialized with zeros, and writes to another dummy page of physical RAM, so that these operations have specific, safe effects. Beyond this low-effort approach, it might not be hard to have the hardware actually treat these page mappings as invalid, so that the kernel can at least terminate the process and report the exception.
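
A minimal Python sketch of that default mapping, assuming 4 KiB pages and two reserved physical frames; the page size, frame numbers and names are all placeholders, not values from my design.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages
ZEROS_FRAME = 0     # hypothetical physical frame kept filled with zeros
SINK_FRAME = 1      # hypothetical physical frame that swallows stray writes


def translate(page_table, logical_addr, is_write):
    """Map a 16-bit logical address to a physical address."""
    page, offset = divmod(logical_addr, PAGE_SIZE)
    frame = page_table.get(page)
    if frame is None:
        # Unmapped logical page: reads see zeros, writes go to the dummy frame.
        frame = SINK_FRAME if is_write else ZEROS_FRAME
    return frame * PAGE_SIZE + offset
```

With only logical page 0 mapped (to frame 5), a read of $2010 resolves into the zeros frame and a write to $2010 resolves into the sink frame, so neither can touch another process's RAM.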

BigDumbDinosaur wrote:

Also, what do you plan to do about that pesky STP instruction? Furthermore, how do you propose to deal with a user-land program having the following instruction sequence?

Code:

          sei
          wai

My current (new) plan is to use timer-based IRQs to let the scheduler run, and probably also run it after every system call. Then use an additional timer to trigger reset if for whatever reason this IRQ doesn't achieve its purpose - whether that's due to SEI/WAI, STP, etc. For a while I thought about using the RDY pin to detect STP and WAI, but didn't see much benefit to be gained from that.

pjdennis wrote:

The main issue I could see is a negative impact on handling of time-sensitive interrupts. A one-off miss of e.g. a serial data byte transfer might not be a major problem, ...

I'd like to still avoid that, and have considered having a timer that runs whenever IRQ is low but the system is not in supervisor mode, so that if an IRQ is pending for longer than a tolerable period the process gets killed. The regular watchdog timer is going to need to be rather long, and only really works to break out of user processes that are completely blocking the system with STP etc.

pjdennis wrote:

but a malicious process, depending on OS capabilities, might be able to create a denial of service situation by repeatedly spawning a child process that disables interrupts. Of course you might be more concerned about accidental programming issues than deliberate misbehavior, though I guess you could handle this scenario by recursively killing the parent user process of any process caught by the watchdog.

This is still a vulnerability. I'm not sure how much we can hope to do about it though - I think even modern OSes running on much more suitable hardware still have trouble with this.

fachat · Post by **fachat** » Sun Dec 10, 2023 3:27 am

Just a quick thought on the watchdog timer.

Wouldn't it be relatively easy to just use /IRQ to start a watchdog - even a 555 timer could suffice I'd think - to trigger an NMI if the IRQ is not serviced quickly enough?

gfoot · Post by **gfoot** » Sun Dec 10, 2023 11:30 am

Yes, my current design now actually works that way - it counts user-mode cycles with a pending IRQ, and resets on entering supervisor mode, so in theory the limit can be very low and we can kill rogue processes quickly. Hopefully I'll post an updated full-computer schematic on the other thread later today including this detail.
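
In case it helps before the schematic is up, the counting rule can be modelled cycle by cycle in Python. This is a behavioural sketch only: `limit` and the names are placeholders, and the boolean return value stands for whatever the hardware does on timeout.

```python
class PendingIrqWatchdog:
    """Counts user-mode cycles that elapse while an IRQ is pending."""

    def __init__(self, limit):
        self.limit = limit  # tolerated user-mode cycles with IRQ pending
        self.count = 0

    def clock(self, irq_pending, supervisor):
        """Advance one CPU cycle; return True when the kill output fires."""
        if supervisor:
            self.count = 0  # entering supervisor mode clears the counter
            return False
        if irq_pending:
            self.count += 1
        return self.count >= self.limit
```

A process that leaves the I flag set for only a few cycles never reaches the limit, because the counter clears as soon as the IRQ is finally taken and supervisor mode is entered.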
