Simple two-mode memory map implementation

gfoot · Post by **gfoot** » Sat Nov 25, 2023 5:23 pm

Gordon's thread about zero page usage reminded me of something I meant to try, so I've sketched out a circuit for it here. It was originally just a quirky idea, but in the end it seems to be simple enough that I think it might actually be practically useful. One day I ought to actually gather all the quirky ideas together and make a coherent system that uses them all at once!

So a few years ago I was investigating preemptive multiprocessing with an MMU, inspired by Andre's past work. A difference in my system was that all of the I/O space was hidden from user programs - so only the supervisor could see it. This meant that the address space for user programs was very simple - they had up to 64K of RAM, and that was it. The OS on the other hand didn't have much use for large amounts of its own RAM, but needed a lot of ROM and some I/O in the address space, and the ability to read and write user RAM.

I think some elements of this scheme can be useful in non-multiprocess systems, without a proper MMU. The advantage of using a two-mode system is that the complex address decoding needed for I/O, vectors, ROM and RAM only applies to the OS code, and as that code is not very demanding of overall address space, we can be quite lazy about how we do the decoding, without incurring any limitations on the amount of RAM visible to our "user" mode code. And in fact the hardware required to implement these two modes is pretty basic:

This diagram shows a 74HC175 quad D flipflop register, a 74HC157 quad 2-to-1 multiplexer, and a 74HC139 dual 2-to-4 decoder. The 74HC175 could be replaced by any flipflop-style register (273, 374, 574, etc).

This is enough to allow user mode code to access its own 64K of RAM, and supervisor code to access 16K of ROM, 16K of RAM, and three I/O devices, as well as being able to access any part of the user-mode RAM through a configurable window.

On startup, and on any interrupt, the bank/mode register gets reset to zeros, putting the system in supervisor mode. In this mode, the memory map is as follows:

$C000-$FFFF    OS ROM
$B000-$BFFF    Bank register interface
$A000-$AFFF    Floppy controller
$9000-$9FFF    ACIA
$8000-$8FFF    VIA
$4000-$7FFF    User RAM access window
$0000-$3FFF    OS RAM

So the OS has ROM at the top and RAM at the bottom, as usual, and in the middle it has the I/O devices and a window to access user RAM. It can choose which part of user RAM to access using the bank register - any write to $B000-$BFFF will set the bank register to the bottom four bits of the data bus. The bottom three bits of the bank register form A15, A14, and A13, for the user RAM. This allows it to access any data required by OS system calls made by the user code.
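As a rough illustration of the supervisor map above, here is a minimal Python sketch of the decode. It assumes the decoding is done as two cascaded 2-to-4 stages (A15..A14 first, then A13..A12 within the I/O quarter), which fits a dual decoder like the 74HC139 but is not confirmed by the schematic; the function and table names are made up for the example.

```python
# Supervisor-mode address decode, modelled as two cascaded 2-to-4 decoders.
# Region names follow the memory map above; all identifiers are illustrative.

TOP_REGIONS = ("OS RAM", "User RAM access window", "I/O", "OS ROM")  # A15..A14
IO_DEVICES = ("VIA", "ACIA", "Floppy controller", "Bank register")   # A13..A12


def decode_supervisor(addr):
    """Return the name of the region a 16-bit supervisor-mode address selects."""
    addr &= 0xFFFF
    top = addr >> 14                      # 0=$0000, 1=$4000, 2=$8000, 3=$C000
    if TOP_REGIONS[top] != "I/O":
        return TOP_REGIONS[top]
    return IO_DEVICES[(addr >> 12) & 0x3]  # one of four 4K slots in $8000-$BFFF


def decode(addr, user_mode):
    """In user mode the whole 64K is user RAM; otherwise use the supervisor map."""
    return "User RAM" if user_mode else decode_supervisor(addr)
```

Because only four address lines take part, each I/O device is mirrored throughout its 4K slot, which matches the ranges listed in the map.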

If the supervisor code writes the bank register with bit D3 set, then the system goes into user mode. At that point the address space changes completely, and the whole address space is just user RAM. The OS RAM is not visible in this mode, and nor is any I/O. There would need to be a convention for execution to continue - e.g. the OS will leave PC at a specific address, which - in user RAM - should contain $40 (RTI), so that this is then executed after the switch to user mode.
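The behaviour of the bank/mode register can be modelled in a few lines of Python. This is only a sketch of what the text describes (four latched bits, D3 selecting user mode, cleared on reset or interrupt); the class and method names are invented for the example, and it assumes that writes to $B000-$BFFF made while in user mode simply land in user RAM and leave the register alone.

```python
class BankModeRegister:
    """Four-bit register: D3 = user mode, D2..D0 = user RAM window bank."""

    def __init__(self):
        self.value = 0                 # power-on state: supervisor mode, bank 0

    def reset(self):
        """Called on startup and on any interrupt: back to supervisor mode."""
        self.value = 0

    @property
    def user_mode(self):
        return bool(self.value & 0x08)

    @property
    def bank(self):
        return self.value & 0x07

    def write(self, addr, data):
        """Model a CPU write cycle; only supervisor writes to $B000-$BFFF count."""
        if not self.user_mode and 0xB000 <= (addr & 0xFFFF) <= 0xBFFF:
            self.value = data & 0x0F   # only the bottom four data bits are latched
```

Once D3 has been written as 1 the register is no longer reachable, so the only way back to supervisor mode in this model is `reset()`, i.e. an interrupt or BRK.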

Last time I did this I had a lot of problems with returning to user mode without having at least some points where a badly-timed NMI could cause errors, but I don't think that would be a problem here, as the return to user mode is atomic enough.

The elephant in the room is the /BANK signal in my schematic, which I think ought to be qualified by PHI2, but that's a fairly minor change in principle - I think adding one more 74HC139 would allow us to do that and also generate the OE/WE signals for the RAM, and it's possible we could instead use U30B for this, and add a separate 74HC138 decoder to provide access to more different I/O devices if that's useful.

fachat · Post by **fachat** » Sat Nov 25, 2023 9:40 pm

Interesting approach. I wonder:

- if on VP supervisor mode is enabled, the interrupt return address and status is stored on the supervisor stack, right?
- when you return from supervisor mode by storing into the left '175, you are still in supervisor mode, and an RTI would read the return address from user space stack?

Will the software be required to copy that information from the supervisor stack to the user stack on any interrupt / OS call (at the beginning makes more sense)?

gfoot · Post by **gfoot** » Sat Nov 25, 2023 9:55 pm

I thought about that, and the OS could do it, but I also don't think it will be necessary as VPB should only go low after the address and flags have already been pushed to the userspace stack. The OS would have pushed the registers to its own stack though, so would need to restore them from there.

Proxy · Post by **Proxy** » Sat Nov 25, 2023 10:14 pm

hmm, i think an alternative system for switching back to User Mode would be to wait until RTI is being fetched and then switch back before any stack operations occur.
for example, enabling User Mode wouldn't directly do anything, instead it would start keeping track of the 65C02's SYNC pin. (or the 65816's VPA and VDA pins)
assuming the OS is written with the hardware in mind (and you want to avoid constantly checking the data bus for the opcode), an RTI should follow directly after the write that switched to User Mode.
so it should be as simple as waiting for the next time SYNC is asserted, wait for that cycle to finish, and then swap the memory layout to the User one.
that way when RTI starts pulling values from the stack it will be back on the User stack and continue execution normally afterwards.

doesn't seem like it would be that difficult to implement, and it should be more convenient than having the PC be at a specific value when switching back to land on an RTI in User Space.
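The deferred switch described above can be sketched as a tiny state machine. This is a Python model under stated assumptions, not a tested design: `request_user_mode()` stands for the supervisor's write to the mode register, `end_of_cycle(sync)` is called once as each CPU cycle finishes with the level SYNC had during that cycle, and all names are made up for the example.

```python
class DeferredModeSwitch:
    """Arm on the mode-register write, switch after the next opcode fetch."""

    def __init__(self):
        self.user_mode = False
        self.armed = False

    def request_user_mode(self):
        """The supervisor's write: does nothing yet except arm the switch."""
        self.armed = True

    def interrupt(self):
        """Any interrupt forces supervisor mode and cancels a pending switch."""
        self.user_mode = False
        self.armed = False

    def end_of_cycle(self, sync):
        """Advance one cycle; return the mode that applies to the next cycle."""
        if self.armed and sync:        # the opcode fetch (RTI) has just finished
            self.armed = False
            self.user_mode = True      # the stack pulls now hit the user stack
        return self.user_mode
```

With the RTI placed directly after the store, the store's write cycle ends with SYNC low (no change), the RTI opcode fetch ends with SYNC high (switch), and the remaining RTI cycles run against the user memory map.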

fachat · Post by **fachat** » Sat Nov 25, 2023 11:02 pm

gfoot wrote:

I thought about that, and the OS could do it, but I also don't think it will be necessary as VPB should only go low after the address and flags have already been pushed to the userspace stack. The OS would have pushed the registers to its own stack though, so would need to restore them from there.

Ah yes, I again fell for that point. Of course return address and status are written before vector pull...

gfoot · Post by **gfoot** » Sun Nov 26, 2023 12:20 am

Proxy wrote:

hmm, i think an alternative system for switching back to User Mode would be to wait until RTI is being fetched and then switch back before any stack operations occur.

Nice idea, that would make things easier.

I've remembered what the problem I had with NMI was last time around - if an interrupt occurs while in super mode, then when we return from that interrupt we need to return to super mode as well, not user mode. It is tricky to see how to do this, especially the case where an NMI occurs just after another interrupt or BRK, before the OS code has had a chance to get in a good position to be able to handle the nested NMI. One option is just to not use NMIs for anything.

Proxy · Post by **Proxy** » Sun Nov 26, 2023 12:31 am

if you pass the NMI/IRQ signals through some logic before going to the CPU, you can implement a "global Interrupt disable" using the Supervisor flag, so as long as the CPU is in Supervisor Mode, no other interrupt could even reach the CPU.

obviously that means nested interrupts are completely impossible, but as long as the NMI signal is nothing more than a jiffy interrupt or something similarly irrelevant when missed, and IRQ stays low until handled, it should work fine.
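In logic terms the gating above is a single OR per interrupt line. A minimal Python sketch, assuming active-low interrupt inputs represented as booleans (True = high = inactive) and a `supervisor` flag that is True in supervisor mode; the function name is invented for the example.

```python
def gate_interrupt(line_n, supervisor):
    """Level presented to the CPU's active-low IRQB/NMIB pin.

    While in supervisor mode the output is held high (inactive), so no
    interrupt reaches the CPU; in user mode the input passes straight through.
    """
    return line_n or supervisor
```

A level-triggered IRQ that is still low when the system returns to user mode is seen at that point, whereas an NMI pulse that starts and ends entirely inside supervisor mode is lost, which is the trade-off mentioned above.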

godammit, now i'm kinda itching to make a system around this idea, especially since all the logic (plus a timer interrupt) should easily fit into an ATF1508.
i also wonder if this kind of setup would be supported by FUZIX or GeckOS.

and3rson · Post by **and3rson** » Sun Nov 26, 2023 1:40 am

This is really nice, I've been thinking about ways to do proper kernel-space / user-space isolation and your approach looks very promising. How are you planning to implement syscalls if (in user mode) entire address space is mapped to RAM?

I had some ideas to use undocumented no-ops to invoke "kernel mode" (a la software interrupts in x86), but I never got to do it.

gfoot · Post by **gfoot** » Sun Nov 26, 2023 1:54 am

The safest way to do syscalls is via BRK, which will trigger the same path through the interrupt vectors and switch to supervisor mode. The OS can read any data necessary from userspace (e.g. the byte or other data after the BRK instruction, or the usermode stack) or from registers. This way we only need to make sure interrupts work, and there's never another way to enter supervisor mode.
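As a sketch of how the OS might find the byte after the BRK, here is a small Python model. It relies on standard 6502 behaviour (BRK pushes PCH, PCL, then P, and the pushed address is the BRK address plus 2), and on the point made earlier in the thread that these pushes land on the user stack. `user_ram` stands in for whatever access path the OS has to user memory (in the design above, the banked window), and the function name is hypothetical.

```python
def brk_signature(user_ram, sp):
    """Return the byte following the BRK opcode that caused this entry.

    user_ram: 64K bytes-like view of user memory.
    sp: the stack pointer value on entry to the handler (after the 3 pushes).
    """
    # Stack layout after BRK: P at $0100+S+1, PCL at +2, PCH at +3 (page 1 wraps).
    lo = user_ram[0x0100 + ((sp + 2) & 0xFF)]
    hi = user_ram[0x0100 + ((sp + 3) & 0xFF)]
    return_addr = (hi << 8) | lo
    # The pushed address is BRK+2, so the signature byte sits one below it.
    return user_ram[(return_addr - 1) & 0xFFFF]
```

The OS could use the returned byte as a syscall number, with further arguments taken from registers or from user memory.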

Proxy, NMI is the only one that really causes problems. Having a hardware option to disable NMIs would be handy; or we could just not use them. In my other project I was only using them to force user processes to yield, for preemptive multitasking. Even for that use case, they could be masked in hardware except in user mode.

Proxy · Post by **Proxy** » Sun Nov 26, 2023 2:28 am

on another thought, i feel like this should also work on the 65816 in native mode.

so for user mode the upper address Byte is simply ignored and always mapped to its designated 64k Bank in physical RAM. That way the user program can't modify any bank registers to reach outside its memory area, like a basic memory protection model.
Then when an interrupt hits it's the same as for the 65C02, VPB causes the system to switch to supervisor mode and the memory map to swap with IO and ROM being accessible.
but it also re-enables the upper address byte of the CPU to work correctly. this means that while in supervisor mode the CPU has direct access to all of physical RAM allowing it to access all processes without needing a memory window, which also frees up more space in bank 0 for the OS itself.

i feel like this wouldn't be that complicated to implement either, especially if you only allow for 1 user process like with your concept, as then the OS could sit in physical bank 0, and the user program in physical bank 1.
And you get all the benefits of using a 65816, like 16-bit registers, movable direct page and a larger stack.
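The 65816 mapping described above comes down to forcing the bank byte in user mode. A minimal Python sketch, assuming the single-process layout mentioned above (OS in physical bank 0, user program in physical bank 1); the constant and function names are made up for the example.

```python
USER_PHYSICAL_BANK = 1   # assumption: OS lives in bank 0, the user program in bank 1


def physical_address(cpu_addr, user_mode, user_bank=USER_PHYSICAL_BANK):
    """Translate a 24-bit CPU address to a physical RAM address."""
    cpu_addr &= 0xFFFFFF
    if user_mode:
        # The CPU's bank byte is ignored: every bank mirrors the same 64K.
        return (user_bank << 16) | (cpu_addr & 0xFFFF)
    # Supervisor mode: the full 24-bit address passes through unchanged.
    return cpu_addr
```

So in user mode a long access to $7E1234 and a bank-0 access to $001234 both end up at physical $011234, while the supervisor can reach $011234 directly.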

sadly my SBC has no pins left on the CPLD, otherwise i could've hooked VPB up to it and tried to implement this logic to see if it would work.
though i guess i could re-use one of the pins that only goes to the expansion port without having to cut any trace on the PCB.... hmm.
i might try this out, and if i do i'll report back.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sun Nov 26, 2023 7:58 am

What y’all are discussing was talked about here, although it meanders quite a bit.

Proxy · Post by **Proxy** » Sun Nov 26, 2023 11:34 am

very interesting that you had a very similar idea almost 12 years ago (jesus christ, 2012 is almost 12 years ago)
though your idea is a little more complex.
for example with your concept, when a user process tries to access outside its own bank it would raise an exception and have the OS handle it, but the idea i had was to hardlock the upper address byte, effectively mirroring a single 64k bank across the entire 16MB address space. so while the process could easily use JSL, JML, or absolute long addressing modes to access another bank, they all map to the same physical 64kB area in RAM, so there is literally no escape for the process.

in terms of logic it seems a lot simpler to set up and also means you don't need to look out for any "illegal" jumps or memory accesses.

gfoot · Post by **gfoot** » Sun Nov 26, 2023 12:06 pm

I think it is simpler, but I would have thought you'd want user processes to be able to access more than 64K on a 65816. I'd probably be more inclined to let the user process have free control over banking within its sandbox, even bank 0, but have the hardware keep that separate from the memory used by the OS which needs to be protected. This is in line with what my 6502 suggestion above would do, where the OS and user program have generally non-overlapping physical memory and the OS has a different method for accessing user memory.

I think the OS has much less need for large amounts of its own memory, so could easily run constrained to a single bank on the 65816. Potentially it gets its own 32K RAM chip, and the user code gets a 512K one, or something like that. Then the hardware maps the user's 512K RAM as banks 0-15, while when in supervisor mode bank 0 is the private RAM and ROM, maybe bank 16 is I/O, and perhaps banks 128-133 allow access to the user RAM.

BDD, did your idea go beyond your linked post there?

gfoot · Post by **gfoot** » Sun Nov 26, 2023 2:02 pm

Another thought here - as these schemes constrain the complex address decoding to a specific long-term mode, they seem ideally suited to running the user mode at a very fast clock speed. It's a bit like what I did in my Fast PDIP system in that respect. So the super mode could run at maybe 8 MHz, user mode at 32 MHz, and as transitions between modes are rare, the clock stretching can be done more efficiently than in my other designs, which generally end up wasting cycles synchronising the clocks.
