AndrewP wrote:
If an NMI occurs just after the stack is updated but before the RTI triggers an exit from Kernal mode then random Kernal memory will be stomped by the interrupt's stack pushes.
I see what you mean. It's not a problem for me on the 6502, because the pushes can only corrupt page 1 and there won't be anything interesting in there anyway. As I understand it, though, on the 65816 the stack pointer can be anywhere in bank zero, and it's probably not viable to have the kernel just not use bank zero at all!
So it is then necessary to be able to mask NMIs, and...
Quote:
The problem occurs in the "Disable NMI (in software)" portion - there is no easy way to guarantee that an NMI cannot occur during it. If an NMI occurs during the instruction that disables NMIs then they could be re-enabled and then another NMI could occur after the stack pointer is changed but before the RTI. It's a corner within a corner but random unexplained program crashes are the suck.
As I understand it, the issue is that an NMI could have been latched just prior to the disabling, but not processed yet - this is something I observed in controlled tests recently. I'm not sure that it is specifically related to having multiple sources of NMIs though.
Could you do something like this to work around it?
Code:
1. disable NMIs
2. are NMIs disabled?
3. if not, loop back to 1
I think this would resolve the race condition: if the NMI crosses with the disabling instruction, the NMI will get processed after it, re-enabling NMIs on its return, but then the test in step 2 will fail and you'll loop back to try again.
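To make the retry idea concrete, here is a rough C model of it. This is only a sketch: the nmi_hw struct and all the function names are made up for illustration, and it assumes the behaviour described above (an NMI latched before the disabling write is still taken straight after it, and its exit path re-enables NMIs). It also assumes only one NMI is in flight; a source that re-latches on every attempt would keep the loop spinning.

```c
#include <stdbool.h>

/* Hypothetical model of the NMI gating logic; none of these names come from real hardware. */
typedef struct {
    bool masked;    /* software-controlled NMI mask bit */
    bool latched;   /* an NMI edge was captured but not yet serviced */
    int  serviced;  /* number of NMIs handled so far */
} nmi_hw;

/* Service a pending NMI. As assumed above, the handler's exit path re-enables NMIs. */
static void service_pending(nmi_hw *hw) {
    if (hw->latched) {
        hw->latched = false;
        hw->serviced++;
        hw->masked = false;  /* the return from the NMI undoes the disable */
    }
}

/* Step 1: the disabling write. An NMI latched before it is still taken right after it. */
static void disable_nmi(nmi_hw *hw) {
    hw->masked = true;
    service_pending(hw);
}

/* Steps 1-3: disable, check, and retry until the mask sticks. Returns the attempt count. */
static int disable_nmi_reliably(nmi_hw *hw) {
    int attempts = 0;
    do {
        disable_nmi(hw);
        attempts++;
    } while (!hw->masked);  /* step 2 says "not disabled": loop back to step 1 */
    return attempts;
}
```

In this model, a call with no NMI pending succeeds on the first attempt, while a call with an NMI already latched takes two attempts and ends with NMIs masked and the latched NMI serviced.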
Quote:
(Another good side effect of restricting interrupts that can occur in Kernal mode is that probably none will. There are unlikely to be nested Kernal interrupts and that means I can swap out the 74F269 up/down counter I'm using for an 74HCT193 - an IC that's much easier to source.)
I think there can still be situations where you want interrupts to get processed, otherwise the kernel is restricted to very short operations. Imagine something like clearing a page of memory - you don't really want interrupts disabled while you do it. It may work, but it will probably delay I/O responses more than is acceptable. You could offload that to a user-mode helper process of course, as is quite commonly done in modern operating systems, or find a way to support the nested interrupts reliably.
Regarding IC choice there, I had considered doing something like this using a two-way shift register, functioning as a one-bit-wide stack. At the moment though I think I'm going to be able to manage this in software - though it's early days.
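For what it's worth, this is roughly how I picture the shift register behaving as a one-bit-wide stack, modelled in C. It's a sketch only: the 8-bit depth, the bit_stack type and the function names are assumptions for illustration, not a description of any particular part. One plausible use would be saving a mode bit on interrupt entry and restoring it on return.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 8-bit bidirectional shift register used as a one-bit-wide stack. */
typedef struct {
    uint8_t bits;  /* bit 0 is the top of the stack */
} bit_stack;

/* Push: shift one way and load the new bit at bit 0; the oldest bit falls off the far end. */
static void bit_push(bit_stack *s, bool value) {
    s->bits = (uint8_t)((s->bits << 1) | (value ? 1u : 0u));
}

/* Pop: read bit 0, then shift the other way so the previously saved bit becomes the top. */
static bool bit_pop(bit_stack *s) {
    bool top = (s->bits & 1u) != 0;
    s->bits >>= 1;
    return top;
}
```

With eight bits this handles up to eight nested levels; a ninth push silently drops the oldest entry, and popping an empty register just returns zeros.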