Code:
gfoot's multitasking computer bootstrap ROM
Checking ZP/stack... OK
Loading second stage...
Stage 1.5
Checking private RAM works... OK
Loading second stage...
mtos kernel starting
32K found
Testing paged RAM... init, test, OK
Clearing paged RAM... OK
debugspawnprocess: got page 01
debugspawnprocess: got PID 01
debugspawnprocess: got page 02
debugspawnprocess: got PID 02
debugspawnprocess: got page 03
debugspawnprocess: got PID 03
debugspawnprocess: got page 04
debugspawnprocess: got PID 04
scheduler: running process 01
scheduler: running process 02
scheduler: running process 03
scheduler: running process 04
scheduler: running process 01
scheduler: running process 02
scheduler: running process 03
Code:
loop:
brk : .byte $00
bra loop
Code:
8D08 : 40 : RTI : 6 : A=00 X=00 Y=00 SP=FF N=0 V=1 D=0 I=0 Z=1 C=0
0200 : 00 00 : BRK #00 : 7 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C6F : 8D 00 F4 : STA F400 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C72 : 68 : PLA : 4 : A=72 X=00 Y=00 SP=FD N=0 V=1 D=0 I=1 Z=0 C=0
8C73 : 48 : PHA : 3 : A=72 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C74 : 9C 0F 86 : STZ 860F : 4 : A=72 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C77 : 29 10 : AND #10 : 2 : A=10 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C79 : D0 11 : BNE 8C8C : 3 : A=10 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C8C : 8E 01 F4 : STX F401 : 4 : A=10 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C8F : 20 7F 8D : JSR 8D7F : 6 : A=10 X=00 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D7F : A6 06 : LDX 06 : 3 : A=10 X=02 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D81 : BD 00 80 : LDA 8000,X : 4 : A=02 X=02 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D84 : 8D 00 91 : STA 9100 : 4 : A=02 X=02 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D87 : BA : TSX : 2 : A=02 X=FA Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D88 : E8 : INX : 2 : A=02 X=FB Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D89 : E8 : INX : 2 : A=02 X=FC Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D8A : E8 : INX : 2 : A=02 X=FD Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D8B : E8 : INX : 2 : A=02 X=FE Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D8C : 38 : SEC : 2 : A=02 X=FE Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=1
8D8D : BD 00 11 : LDA 1100,X : 4 : A=02 X=FE Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=1
8D90 : E9 01 : SBC #01 : 2 : A=01 X=FE Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=1
8D92 : 85 00 : STA 00 : 3 : A=01 X=FE Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=1
8D94 : E8 : INX : 2 : A=01 X=FF Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=1
8D95 : BD 00 11 : LDA 1100,X : 4 : A=02 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=1
8D98 : E9 00 : SBC #00 : 2 : A=02 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=1
8D9A : C9 10 : CMP #10 : 2 : A=02 X=FF Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8D9C : 90 15 : BCC 8DB3 : 3 : A=02 X=FF Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8DB3 : 09 10 : ORA #10 : 2 : A=12 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=0
8DB5 : 85 01 : STA 01 : 3 : A=12 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=0 C=0
8DB7 : B2 00 : LDA (00) : 5 : A=00 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=1 C=0
8DB9 : 0A : ASL A : 2 : A=00 X=FF Y=00 SP=FA N=0 V=0 D=0 I=1 Z=1 C=0
8DBA : AA : TAX : 2 : A=00 X=00 Y=00 SP=FA N=0 V=0 D=0 I=1 Z=1 C=0
8DBB : E0 04 : CPX #04 : 2 : A=00 X=00 Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8DBD : B0 03 : BCS 8DC2 : 2 : A=00 X=00 Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8DBF : 7C 77 8D : JMP (8D77,X) : 6 : A=00 X=00 Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8D7B : 18 : CLC : 2 : A=00 X=00 Y=00 SP=FA N=1 V=0 D=0 I=1 Z=0 C=0
8D7C : 60 : RTS : 6 : A=00 X=00 Y=00 SP=FC N=1 V=0 D=0 I=1 Z=0 C=0
8C92 : AE 01 F4 : LDX F401 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=1 C=0
8C95 : 90 E9 : BCC 8C80 : 3 : A=00 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=1 C=0
8C80 : A5 06 : LDA 06 : 3 : A=02 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=0 C=0
8C82 : 8D 0F 86 : STA 860F : 4 : A=02 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=0 C=0
8C85 : AD 00 F4 : LDA F400 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=1 C=0
8C88 : 2C 01 86 : BIT 8601 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=1 C=0
8C8B : 40 : RTI : 6 : A=00 X=00 Y=00 SP=FF N=0 V=1 D=0 I=0 Z=1 C=0
0202 : 80 FC : BRA 0200 : 3 : A=00 X=00 Y=00 SP=FF N=0 V=1 D=0 I=0 Z=1 C=0
0200 : 00 00 : BRK #00 : 7 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C6F : 8D 00 F4 : STA F400 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C72 : 68 : PLA : 4 : A=72 X=00 Y=00 SP=FD N=0 V=1 D=0 I=1 Z=0 C=0
...
There is a lot of overhead to read the operand to the BRK instruction. I was trying to avoid needing the user to use registers for passing parameters, but I think that's a bad choice - it will cut a lot out if I just use A to select the syscall type, and ignore the BRK operand.
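As a rough sketch of that alternative (not the code in use here), the dispatcher could take the syscall number from the caller's A, which irqhandler has already saved in var_saveda, and skip the stack and pagetable work entirely. It reuses the syscalljumptable, syscalljumptablesize and error_badsyscall names from the syscall routine further down; the label syscall_by_a is made up for the example. Checking the number before doubling it also rejects values of $80 and above, which would otherwise wrap round to a valid index.

```asm
syscall_by_a:
.(
    ; Hypothetical replacement for "syscall": the caller loads A with the
    ; syscall number before the BRK, so the BRK operand is never read.
    lda var_saveda                  ; caller's A, saved on entry to irqhandler
    cmp #syscalljumptablesize/2     ; the table size is in bytes, two per entry
    bcs badsyscall                  ; reject numbers beyond the table
    asl                             ; double it to index the 16-bit entries
    tax                             ; caller's X is already in var_savedx
    jmp (syscalljumptable,x)
badsyscall:
    jmp error_badsyscall
.)
```

User code would then be something like "lda #n : brk : .byte 0". The padding byte is still needed because RTI returns to the address two bytes after the BRK opcode, and A is no longer free to carry a parameter.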
There's also the usual dilemma over how to determine that it was a BRK - here I did it the Acorn way, saving A somewhere and pulling the flags from the stack. It would be quite simple to add hardware to latch this flag and make it easy to read and branch on without needing any registers - we just need to track the last state of the D4 data line during a write operation, then the IRQ handler can begin with a BIT instruction to read that back. It would save a few instructions, or could even be used to make a different vector get called in hardware - however I don't think that's worth it as it's just a BIT and a branch that would be saved.
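For illustration only, this is roughly how the start of irqhandler might look with such a latch. BRKLATCH is an invented register, assumed to return the latched D4 in bit 7; nothing like it exists in the current hardware. The read has to be the very first instruction, because the latch follows the most recent write cycle and the handler's own stores would overwrite it.

```asm
    ; Sketch of the first few instructions of irqhandler, assuming a
    ; hypothetical read-only BRKLATCH register whose bit 7 holds D4 from the
    ; last write cycle. On entry that write was the push of the status byte,
    ; so bit 7 is the B flag.
    bit BRKLATCH        ; N <- latched D4; A, X and Y are untouched
    sta var_saveda      ; STA and STZ leave the flags alone, so N survives
    stz PID             ; switch to process 0
    bmi isbrk           ; B was set: this is a BRK
    ; The rest of the routine would carry on unchanged from "jsr irqhandler2".
```

This stands in for the "pla : pha" and "and #$10" steps, and the branch becomes a BMI instead of a BNE.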
Interrupt latency is something I've been conscious of and I'm interested in seeing how bad it is in practice, and what can be done to improve it. As yet I'm not using hardware interrupts for anything other than the preempting timer, but there's a 65C51 there which can issue receive interrupts, and I have it connected to the VIA's PB6 so that VIA T2 can issue transmit interrupts. I may also use the VIA's shift register, CB1 and CB2 for SD cards, PS/2, etc, at some point - so I will get to try these things out soon.
I wrote the code bearing this latency in mind, but there are no doubt more improvements to make. Here's the source code for the irqhandler routine that appears above:
Code:
irqhandler:
.(
; This could be an IRQ or a BRK. We can check the stack to find out which.
; There's no need to be reentrant here, but while the active process is still selected,
; we mustn't write to zero page or the stack.
sta var_saveda
pla : pha
; Switch to process 0
stz PID
; Is it a BRK?
and #$10
bne isbrk
; Handle the interrupt
jsr irqhandler2
; Was it preempted? If so pick another process
bcs preempted
resume:
; If not, return to the same process
lda zp_prevprocess : sta PID
lda var_saveda
bit ENDSUPER
rti
isbrk:
stx var_savedx
jsr syscall
ldx var_savedx
bcc resume ; resume current process if carry clear, otherwise fall through
preempted:
; Save the process's context and run the scheduler. The process's flags and return
; address are already on its stack. We free up the Y register, and use it to index
; into the process register arrays.
phy ; save Y temporarily
ldy zp_prevprocess
lda var_saveda : sta var_process_regs_a,y ; restore A's value and save it
txa : sta var_process_regs_x,y ; save X's value
pla : sta var_process_regs_y,y ; restore Y's value and save it
tsx : txa : sta var_process_regs_sp,y ; save the stack pointer too
;jmp scheduler_run
.)
irqhandler2 is a fairly standard hardware interrupt handling routine that queries various devices to see what needs attention. It can prioritise urgent devices of course. It returns with the carry set or clear to indicate whether we should switch processes or resume the same one. Resuming the same process is more efficient, so it only sets the carry if the interrupt was due to the preempting timer running out.
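A minimal sketch of that carry convention, which is not the actual irqhandler2: it polls the 65C51 first and the VIA's T1 flag second. VIA_IFR is taken to be the $860D that the traces on this page read and then write $40 back to. ACIA_STATUS and acia_service are placeholders, since the real address and routine aren't shown here.

```asm
VIA_IFR = $860d              ; assumed from the traces; ACIA_STATUS is left undefined

irqhandler2_sketch:
.(
    ; Most urgent device first. The 65C51 sets bit 7 of its status register
    ; when it is the one asserting IRQ. A is free: it was saved in var_saveda.
    lda ACIA_STATUS          ; hypothetical address
    bpl notacia
    jsr acia_service         ; hypothetical; gets the status in A, must keep X and Y
    clc                      ; not the timer, so resume the same process
    rts
notacia:
    ; Bit 6 of the VIA's IFR is the T1 flag, which BIT copies into V.
    bit VIA_IFR
    bvc nottimer
    lda #$40 : sta VIA_IFR   ; writing a 1 to the bit clears the T1 flag
    sec                      ; time slice expired: ask for a process switch
    rts
nottimer:
    clc                      ; nothing recognised, resume the same process
    rts
.)
```

If the timer fires while the 65C51 is being serviced, the VIA keeps IRQ asserted, so the handler is simply entered again straight after the RTI and takes the timer path then.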
Resuming the same process is efficient because we haven't really corrupted its state much - the stack pointer is exactly as it left it, the X and Y registers haven't been touched, and the A register has been saved in a cheap global variable. All we need to do is restore the PID register setting, restore the A register, and return to user mode (the "resume" block above). If we're switching processes though, at the very least we need to store all the process state somewhere it can stay for the long term, then figure out what to run next, and then restore that process's state. The "preempted" block above does the first bit of this, but there is at least as much work still to follow in scheduler_run. So, returning to the same process is better in general.
Two things really affect interrupt latency. The first is any code that runs before the interrupt is handled, which is minimal here: in the above code that's really just the "stz PID". The second is any other code that runs with interrupts disabled, including the recovery after the interrupt (things like running the scheduler) and syscalls. I haven't implemented this, but I think a decent mitigation for the second is to re-enable interrupts while still in supervisor mode wherever possible. For example, almost all syscall code can execute with interrupts enabled - the only thing that will interrupt it is an actual hardware interrupt, so it would only need to disable interrupts while accessing some complex state that's also manipulated by the interrupt handling code. It should be possible to avoid conflicts between these things in general without having to disable interrupts.
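To illustrate the kind of short critical section that would remain, here is a sketch of a syscall body that runs with interrupts enabled and masks them only around a two-byte read. It assumes the interrupt entry code has been reworked to cope with being entered from supervisor mode, which the irqhandler above is not. var_ticks and zp_result are invented names for a counter that an interrupt would update and a kernel zero page result.

```asm
sys_getticks_sketch:
.(
    cli                      ; syscall work is low priority, let interrupts in
    ; ... any amount of work that doesn't touch interrupt-owned state ...
    php                      ; remember the current interrupt mask
    sei                      ; var_ticks (hypothetical) is updated by an interrupt
    lda var_ticks   : sta zp_result
    lda var_ticks+1 : sta zp_result+1
    plp                      ; back to whatever the mask was before
    ; ... more work with interrupts enabled ...
    sei                      ; irqhandler's resume path expects interrupts masked
    clc                      ; carry clear: resume the calling process
    rts
.)
```

Without the SEI, an interrupt landing between the two loads could leave zp_result holding the low byte from before a carry and the high byte from after it.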
Similarly, the epilogue of the interrupt handler might be able to run with interrupts enabled - as soon as the hardware interrogation is complete, perhaps we can enable interrupts again. This would allow another interrupt to be serviced quickly, if one happened to come in while the previous one was still working out how best to return to user mode.
That said, except in the "preempted" case, where we are switching processes, the return to user mode is not too bad. Here's a full capture of the preempting timer interrupt being handled, and returning to a different process:
Code:
0200 : : INTERRUPT !! : 7 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C6F : 8D 00 F4 : STA F400 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C72 : 68 : PLA : 4 : A=62 X=00 Y=00 SP=FD N=0 V=1 D=0 I=1 Z=0 C=0
8C73 : 48 : PHA : 3 : A=62 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C74 : 9C 0F 86 : STZ 860F : 4 : A=62 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8C77 : 29 10 : AND #10 : 2 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C79 : D0 11 : BNE 8C8C : 2 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8C7B : 20 09 8D : JSR 8D09 : 6 : A=00 X=00 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=1 C=0
8D09 : AD 0D 86 : LDA 860D : 4 : A=C0 X=00 Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D0C : 10 08 : BPL 8D16 : 2 : A=C0 X=00 Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D0E : 2D 0E 86 : AND 860E : 4 : A=C0 X=00 Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D11 : 2C 0D 86 : BIT 860D : 4 : A=C0 X=00 Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D14 : 70 02 : BVS 8D18 : 3 : A=C0 X=00 Y=00 SP=FA N=1 V=1 D=0 I=1 Z=0 C=0
8D18 : A9 40 : LDA #40 : 2 : A=40 X=00 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D1A : 8D 0D 86 : STA 860D : 4 : A=40 X=00 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=0
8D1D : 38 : SEC : 2 : A=40 X=00 Y=00 SP=FA N=0 V=1 D=0 I=1 Z=0 C=1
8D1E : 60 : RTS : 6 : A=40 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8C7E : B0 17 : BCS 8C97 : 3 : A=40 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8C97 : 5A : PHY : 3 : A=40 X=00 Y=00 SP=FB N=0 V=1 D=0 I=1 Z=0 C=1
8C98 : A4 06 : LDY 06 : 3 : A=40 X=00 Y=01 SP=FB N=0 V=1 D=0 I=1 Z=0 C=1
8C9A : AD 00 F4 : LDA F400 : 4 : A=00 X=00 Y=01 SP=FB N=0 V=1 D=0 I=1 Z=1 C=1
8C9D : 99 00 F6 : STA F600,Y : 5 : A=00 X=00 Y=01 SP=FB N=0 V=1 D=0 I=1 Z=1 C=1
8CA0 : 8A : TXA : 2 : A=00 X=00 Y=01 SP=FB N=0 V=1 D=0 I=1 Z=1 C=1
8CA1 : 99 00 F7 : STA F700,Y : 5 : A=00 X=00 Y=01 SP=FB N=0 V=1 D=0 I=1 Z=1 C=1
8CA4 : 68 : PLA : 4 : A=00 X=00 Y=01 SP=FC N=0 V=1 D=0 I=1 Z=1 C=1
8CA5 : 99 00 F8 : STA F800,Y : 5 : A=00 X=00 Y=01 SP=FC N=0 V=1 D=0 I=1 Z=1 C=1
8CA8 : BA : TSX : 2 : A=00 X=FC Y=01 SP=FC N=1 V=1 D=0 I=1 Z=0 C=1
8CA9 : 8A : TXA : 2 : A=FC X=FC Y=01 SP=FC N=1 V=1 D=0 I=1 Z=0 C=1
8CAA : 99 00 F9 : STA F900,Y : 5 : A=FC X=FC Y=01 SP=FC N=1 V=1 D=0 I=1 Z=0 C=1
8CAD : A4 06 : LDY 06 : 3 : A=FC X=FC Y=01 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8CAF : A2 00 : LDX #00 : 2 : A=FC X=00 Y=01 SP=FC N=0 V=1 D=0 I=1 Z=1 C=1
8CB1 : C8 : INY : 2 : A=FC X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8CB2 : B9 00 F5 : LDA F500,Y : 4 : A=01 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8CB5 : D0 06 : BNE 8CBD : 3 : A=01 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=0 C=1
8CE6 : 84 06 : STY 06 : 3 : A=02 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8CE8 : 8C 0F 86 : STY 860F : 4 : A=02 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8CEB : BE 00 F9 : LDX F900,Y : 4 : A=02 X=FC Y=02 SP=FC N=1 V=1 D=0 I=1 Z=0 C=0
8CEE : 9A : TXS : 2 : A=02 X=FC Y=02 SP=FC N=1 V=1 D=0 I=1 Z=0 C=0
8CEF : B9 00 F6 : LDA F600,Y : 4 : A=00 X=FC Y=02 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8CF2 : 8D 00 F4 : STA F400 : 4 : A=00 X=FC Y=02 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8CF5 : BE 00 F7 : LDX F700,Y : 4 : A=00 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8CF8 : B9 00 F8 : LDA F800,Y : 4 : A=00 X=00 Y=02 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8CFB : A8 : TAY : 2 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8CFC : AD 02 F4 : LDA F402 : 4 : A=04 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8CFF : 8D 05 86 : STA 8605 : 4 : A=04 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=0 C=0
8D02 : AD 00 F4 : LDA F400 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=1 D=0 I=1 Z=1 C=0
8D05 : 2C 01 86 : BIT 8601 : 4 : A=00 X=00 Y=00 SP=FC N=0 V=0 D=0 I=1 Z=1 C=0
8D08 : 40 : RTI : 6 : A=00 X=00 Y=00 SP=FF N=0 V=1 D=0 I=0 Z=1 C=0
Overall I don't think this is too bad in terms of overheads, but there is still room for improvement, and enabling interrupts as early as possible would mitigate a lot of the effects on interrupt latency.
Something else that's come up is the ease or difficulty of reading user process memory, and especially following pointers there. Here's the code for the syscall routine, which reads the BRK operand. I think the comments explain the context pretty well:
Code:
syscall:
.(
; On entry:
;
; PID = 0; caller's PID is in zp_prevprocess
; A is undefined; caller's A is in var_saveda
; X is caller's, and has been saved to var_savedx
; Y is caller's
; SP is two lower than on entry to irqhandler (our return address was pushed), so five lower than before the BRK
; Ideally here we'd re-enable interrupts as soon as we can, as servicing a system call is not high priority.
; But that's an improvement for later.
; We need to determine the type of system call and act accordingly.
;
; We'll read the user process's LP 0 mapping, and write it into our LP 1 so that we can read the user stack from there.
; Then we'll have a user pointer to the BRK instruction, and can use it directly if it's into the user's LP 0, but will need
; to make another mapping change if it's to a different logical page in the user's address space.
ldx zp_prevprocess
lda $8000,x ; read process's LP 0 mapping
sta $9100 ; set our LP 1 read mapping to the same page
; Here we use "inx" so that it wraps properly and we avoid issues with SP > $FC
tsx
inx : inx : inx : inx ; SP => X and advance it to point at the PCL from the interrupt frame
; Copy the PC to zp_ptr, subtracting 1 as we go
sec : lda $1100,x : sbc #1 : sta zp_ptr
inx : lda $1100,x : sbc #0
; If zp_ptr is in a different LP, we need to map that one now
cmp #$10 : bcc ismapped
; The right page is not mapped, map it now
tax ; save the high byte of zp_ptr for now
and #$f0 ; extract LP number
bpl a15clear : ora #2 ; set bit 1 if bit 7 is set (addr needs to be 1,A14,A13,A12,0,0,A15,RWB,PID7,PID6,...)
a15clear:
ora #$80 ; set bit 7
sta zp_ptr2+1
lda zp_prevprocess
sta zp_ptr2
lda (zp_ptr2) ; Read the process's write mapping for the LP that zp_ptr is in
sta $9100 ; Apply it to our LP 1 (read)
txa : and #$0f ; Restore zp_ptr's high byte, keeping only the offset within the page so the ora below gives LP 1
ismapped:
; The right page is mapped now - set zp_ptr's LP portion to 1
ora #$10 : sta zp_ptr+1
; Read the value after BRK from user memory via LP 1
lda (zp_ptr)
; Double it and jump via the jump table
asl
tax
cpx #syscalljumptablesize ; but not if it's too large
bcs badsyscall
jmp (syscalljumptable,x)
badsyscall:
jmp error_badsyscall
.)
Having adjusted the X register to point four entries higher than SP - so it points at PCL - we can then read the return address. It points at the instruction after the BRK, and we want the operand instead, so we subtract one from that address while copying it to zp_ptr.
Now if that address is also in the user process's LP 0 (i.e., it's less than $1000) then we're fine: that page is already mapped into the kernel's LP 1, so the kernel can read the byte once it has adjusted zp_ptr to point into LP 1 instead of LP 0 (adding $1000). But if the BRK instruction isn't in the user process's LP 0, the kernel's current LP 1 mapping is no use, because it's the wrong physical page. In that case the code above looks up the user process's mapping for whichever LP contains the address PC-1, and remaps the kernel's LP 1 to that physical page instead of the user process's first physical page. After that it adjusts zp_ptr to point into LP 1 as before, and reads the byte via LP 1.
Note that the pagetable addresses are a bit swizzled. As the comment in the code says, a user process refers to an address within a page using the first layout below, while the pagetable entry for that page is addressed using the second:
Code:
A15 A14 A13 A12 x x x x x x x x x x x x
Code:
1 A14 A13 A12 0 0 A15 RWB PID7 PID6 PID5 ... PID0
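Working a few addresses through that layout by hand: the first two are the ones the syscall routine uses, and the third is what its remapping path would build for a hypothetical process 3 whose BRK operand sits at $A123. The constant names are made up for the example, and RWB=1 meaning "read" is inferred from the comments in the syscall source.

```asm
; Layout: 1 A14 A13 A12 0 0 A15 RWB | PID7..PID0
PT_LP0_WRITE = $8000   ; 1 000 00 0 0 : LP 0, write; "lda $8000,x" adds the PID in X
PT_LP1_READ  = $9100   ; 1 001 00 0 1 : LP 1, read, PID 0; the "sta $9100" in syscall
PT_LPA_WRITE = $a200   ; 1 010 00 1 0 : LP $A (A15 set, A14..A12 = 010), write
; For PID 3 and user address $A123 the remapping path computes:
;   $a1 and #$f0 = $a0, bit 7 is set so ora #2 = $a2, then ora #$80 = $a2
;   zp_ptr2 = $a203 = PT_LPA_WRITE + 3
```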