Idea for multitasking support on 65Org16 (or 65Org32)
Posted: Tue Apr 17, 2012 10:01 pm
Garth's view of the 65Org32 includes some kind of base register for address relocation, so that several tasks can exist on the machine and their physical addresses can be changed under their feet without the tasks themselves needing to be fiddled with.
(I think the Amiga managed its preemptive multitasking with load-time relocation, and put up with any memory fragmentation. I think the Macintosh allocated memory through double indirection, which allowed for defragmentation but at a runtime performance penalty. In both cases software took the strain. The 65816 has two bank registers to allow data and program to reside in foreign banks, which may allow for a task-in-bank model of programming. To deal with loading the banks it also supplies long addressing modes, and to jump between banks it stacks and unstacks the program bank register on interrupts and RTI.)
I think we need as little as a single register and a single bit in the status register to support multitasking and relocation, and can do without the other complexities of the '816.
The register is a full 32 bits, called TBR for task base register, and the status bit is called T for task or translate. At reset, and after a BRK or an interrupt, T is set to zero and TBR is ignored. When T is one, TBR is added to every address as it leaves the core. That's it. Oh, and we need a single opcode, XTR, which exchanges the task base register with A.
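To make the rule concrete, here is a minimal Python model of the translation step and of XTR. The function names, the 32-bit wrap-around on overflow, and the idea of passing T as a flag are my assumptions for illustration, not part of the proposal.

```python
MASK32 = 0xFFFF_FFFF  # assumed 32-bit address bus (65Org32)

def translate(addr, tbr, t):
    """Return the physical address for a logical address leaving the core."""
    if t:
        # T=1: add the task base register (wrap-around is an assumption)
        return (addr + tbr) & MASK32
    # T=0: TBR is ignored and the address passes through unchanged
    return addr & MASK32

def xtr(a, tbr):
    """Model XTR: exchange A with TBR, returning the new (a, tbr) pair."""
    return tbr, a
```

With TBR = $0001_0000, translate(0x2000, 0x0001_0000, 1) gives $0001_2000, and the same call with T=0 gives $0000_2000.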
The effect is that we have a kind of supervisor mode at reset (T=0) which gives access to physical memory with untranslated addresses. We can load programs to their intended locations and access I/O devices at their actual addresses. (There isn't any protection here, so it isn't truly a supervisor mode.)
From untranslated (T=0) mode we can set up a status word with T=1, exchange a suitable base value into TBR, and use RTI to jump into a user program.
For example, we might have a program assembled with a start address of $0000_2000 which we load at $0001_2000, so we set TBR to $0001_0000, push $0000_2000, push $ffff or similar(*) and then RTI. The next fetch will be from logical address $0000_2000 but the TBR adds $0001_0000 and we fetch from $0001_2000 which is where we placed the program. If the program starts with a jump to $0000_2100, it will fetch from $0001_2100, and so on.
(*) I don't know which bit is the T bit, so I set them all!
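The launch sequence in the example can be walked through with a toy model. This is only a sketch: the Core class, the position of the T bit, the word-wide stack and the dictionary standing in for memory are all hypothetical, chosen just to show the order of operations.

```python
T_BIT = 0x0000_0001  # hypothetical: the post does not say which status bit is T

class Core:
    """Toy model of the proposed core, just enough to follow the launch."""
    def __init__(self):
        self.pc = 0
        self.psw = 0      # T=0 at reset
        self.tbr = 0
        self.a = 0
        self.stack = []   # simplified: one whole word per push
        self.mem = {}     # sparse physical memory

    def xtr(self):
        # exchange A with the task base register
        self.a, self.tbr = self.tbr, self.a

    def push(self, word):
        self.stack.append(word)

    def rti(self):
        # pull the status word first, then the program counter
        self.psw = self.stack.pop()
        self.pc = self.stack.pop()

    def physical(self, addr):
        # TBR is added only when T=1
        if self.psw & T_BIT:
            return (addr + self.tbr) & 0xFFFF_FFFF
        return addr

    def fetch(self):
        return self.mem.get(self.physical(self.pc))

core = Core()
core.mem[0x0001_2000] = "first opcode of task"  # program loaded here
core.a = 0x0001_0000
core.xtr()                 # TBR = $0001_0000
core.push(0x0000_2000)     # logical start address
core.push(0xFFFF)          # status word with every low bit set, T included
core.rti()                 # now running translated at logical $0000_2000
```

After the RTI, core.fetch() reads physical $0001_2000, and core.physical(0x2100) is $0001_2100, matching the jump described above.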
When the program needs an OS service, perhaps to perform some I/O, it uses a BRK, which pushes the PSW and sets T=0 to put us back into untranslated mode. The service routine can then access the I/O device. The address of the BRK handler is of course untranslated, as with other vectors.
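The BRK side can be sketched the same way. The T bit position, the handler address and the list used as a stack are placeholders of mine; the proposal does not say where the pushes land, so the sketch does not try to model that.

```python
T_BIT = 0x0000_0001  # hypothetical bit position for T

def brk(pc, psw, stack, handler):
    """Model BRK: save the return state, clear T, continue at the handler.

    Returns the new (pc, psw). The handler address is used as-is because
    vectors are untranslated.
    """
    stack.append(pc)    # return address
    stack.append(psw)   # saved PSW still has T=1, so RTI restores translation
    return handler, psw & ~T_BIT

stack = []
# A task running translated (T=1) executes BRK at logical $0000_2010.
pc, psw = brk(0x0000_2010, T_BIT, stack, handler=0x0000_F000)
```

Because the saved PSW carries T=1, a plain RTI at the end of the service routine drops straight back into the translated task.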
While tasks are not running, untranslated-mode code could block-move them to defragment memory at a coarse grain, so long as it adjusts the saved base register value of each task it moves.
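A sketch of such a move, assuming the OS keeps a small record per task holding its saved TBR and its size (both field names are invented here), with a Python list standing in for physical memory:

```python
def move_task(mem, task, new_base):
    """Copy a suspended task's memory to new_base and update its saved TBR."""
    old_base, size = task["tbr"], task["size"]
    block = mem[old_base:old_base + size]   # copy out first, so overlapping moves are safe
    mem[new_base:new_base + size] = block
    task["tbr"] = new_base                  # the task's logical addresses are unchanged

mem = [0] * 64
task = {"tbr": 32, "size": 8}               # hypothetical task record
mem[32:40] = [1, 2, 3, 4, 5, 6, 7, 8]
move_task(mem, task, 8)
```

The task still sees the same value at logical address 3; only the base added on the way out of the core has changed.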
Note that each task has its own self-contained memory space which includes its stack and (for 65Org16) zero page: we don't have separate bank or base registers for each purpose.
As a refinement, a second register could impose an upper limit on a task's memory accesses, either silently (which is simple, especially if it is done by applying a mask) or by aborting the instruction (which is more difficult on a core with no abort). We don't need an extra status bit or even an extra opcode: the exchange opcode can just exchange the two translation registers with two normal registers.
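Both variants of the limit can be sketched in a few lines. I am assuming here that the mask or limit applies to the logical address before the base is added, and the names are invented; the post leaves those details open.

```python
MASK32 = 0xFFFF_FFFF  # assumed 32-bit address bus

def translate_masked(addr, tbr, limit_mask):
    """Silent variant: mask the logical address, then add the base."""
    return ((addr & limit_mask) + tbr) & MASK32

class TaskAbort(Exception):
    """Stand-in for an instruction abort, which the core does not have today."""

def translate_checked(addr, tbr, limit):
    """Aborting variant: refuse any logical address above the limit."""
    if addr > limit:
        raise TaskAbort(hex(addr))
    return (addr + tbr) & MASK32
```

With a mask of $0000_FFFF, a stray access to logical $0001_2345 silently wraps to $0000_2345 inside the task's own space, whereas the checked variant raises instead.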
Does this seem to work?
Ed