Posted: Tue Dec 19, 2023 11:10 am

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
BigEd wrote:
Just a quick question George - is the schematic and overview in the head post still a good reference? I like your ideas but haven't yet studied the detail or the implementation.

That first one was incomplete, and also lacks some design changes I made later on. The one in this later post is close to what I've actually built, and that post describes the differences to the first design in some detail.

The further changes in what I actually built are fairly simple:
  • Added pull-up resistor to the ACIA's IRQB
  • Added pull-up resistor to /ENDSUPER
  • Added pull-down resistor to ROMDIS
  • Omitted the upper 512K RAM bank
  • Replaced the lower 512K RAM bank with a 32K module

I'll upgrade the RAM back to the original spec at some point, but aside from that I'd consider that hardware design pretty stable. The only serious hardware bug I'm aware of is in the mechanism that allows code in ROM to write to the private RAM (U16): it is not reliable enough to write actual code into that RAM while the ROM is selected. It's been possible to work around that without too much trouble, and I haven't stopped to investigate it more.

The code is all here as well, though without any documentation beyond the comments: https://github.com/gfoot/multitasking65 ... n/src/mtos


Posted: Tue Dec 19, 2023 11:14 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Thanks for the pointer!


Posted: Tue Dec 19, 2023 12:02 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
gfoot wrote:
It dropped into user mode briefly then back to super mode just before the interrupt - this is likely a system call - and then you can see it spent about 55us in the system call, leading to most of the delay in processing the IRQ. I need to look at getting interrupts enabled during system calls, then measure this again.

I made some changes to allow IRQs during system calls, and it improved these figures. In that last post I was observing IRQB held low for about 65us; it's now down to about 30us-40us when supervisor mode had been active. It's about 20us when user mode was active, which won't have changed since the previous post. Here's an updated trace showing, top to bottom, VPB, SUPER, IRQB, and the serial transmit line:
Attachment: 20231219_113119.jpg
There are three pulses in VPB showing where interrupts or system calls occurred. The first is an interrupt taken in user mode - probably the preempt timer expiring. This was serviced in about 20us, but the system then took about 35us to pick a new process (measuring from the rising edge of IRQB to the falling edge of SUPER). Most interrupts don't cause a process switch so avoid some of this cost in their exit code. Interrupts would be masked throughout this one.

Then the system went into user mode and executed a system call via BRK, putting it back in supervisor mode. Shortly after this started, IRQB went low, and - as I now allow interrupts during system calls - this was serviced as soon as interrupts were re-enabled, despite the system still being in supervisor mode - you can see the third pulse on VPB at this point. The IRQ itself cleared about 20us later, consistent with the first one, so overall IRQB was low for about 30us in this case.

I'm sure there's room for improvement in the IRQ response code in general but am glad to see enabling interrupts during syscall is both working and having a positive effect. I think there's potential to enable interrupts during task selection as well. The response times will also reduce if the clock speed goes up, of course - but at 4MHz it looks like the worst case is now about 40us.
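To put those figures in terms of CPU cycles, here is a quick back-of-envelope conversion in Python. It is only illustrative: the durations are the approximate values read off the trace above, the function names are made up for this sketch, and the 8MHz column is a hypothetical faster clock that assumes the handler takes the same number of cycles (no extra wait states).

```python
CLOCK_MHZ = 4  # CPU clock quoted in the post


def us_to_cycles(duration_us, clock_mhz=CLOCK_MHZ):
    """Number of CPU clock cycles that fit in duration_us."""
    return duration_us * clock_mhz


def rescale_us(duration_us, new_mhz, old_mhz=CLOCK_MHZ):
    """Duration of the same cycle count at a different clock speed."""
    return duration_us * old_mhz / new_mhz


for label, us in [("IRQ taken in user mode", 20),
                  ("IRQ taken during a syscall", 30),
                  ("worst case observed", 40)]:
    print(f"{label}: ~{us}us = ~{us_to_cycles(us)} cycles at {CLOCK_MHZ}MHz, "
          f"~{rescale_us(us, 8):.0f}us at a hypothetical 8MHz")
```

So the 40us worst case corresponds to roughly 160 clock cycles at 4MHz.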

These are the only changes needed to enable interrupts during syscalls - pleasantly few! I mark that an interrupt came from supervisor mode by decrementing a memory location to a negative number, which can be done without needing any registers; and it can then also be tested right at the start of the regular interrupt handler routine, without corrupting any registers either. It branches to a dedicated routine to handle hardware interrupts from supervisor mode. The preempt interrupt is also disabled because we don't want that one to fire - we can't switch processes anyway while we're in a syscall, all we can do is lightweight processing of proper hardware interrupts.
Code:
@@ -26,6 +26,8 @@ initloop:
        stz zp_runqueue_head
        stz zp_runqueue_tail

+       stz var_interrupt_from_supervisor
+
        rts
 .)

@@ -59,12 +61,40 @@ resethandler:
        jmp scheduler_run
 .)

+superirqhandler:
+.(
+       ; Special interrupt service routine for interrupting the supervisor.  We don't need to
+       ; worry as much about user process state, just run the regular hardware interrupt handler
+       ; and then return.
+       ;
+       ; It does need to run as PID 0 though, and it's possible another process was selected
+       ; even though we were in supervisor mode, so remember the old PID before setting it to
+       ; zero.
+       sta var_superirq_saveda
+
+       lda PID : sta var_superirq_saved_pid
+       stz PID
+
+       jsr irqhandler2
+
+       lda var_superirq_saved_pid : sta PID
+
+       lda var_superirq_saveda
+       rti
+.)
+
 irqhandler:
 .(
+       ; Support a nested hardware interrupt within a system call - if we interrupted the
+       ; supervisor then execute a bespoke version of the handler
+       bit var_interrupt_from_supervisor
+       bmi superirqhandler
+
        ; This could be an IRQ or a BRK.  We can check the stack to find out which.
        ; There's no need to be reentrant here, but while the active process is still selected,
        ; we mustn't write to zero page or the stack.
@@ -99,8 +129,21 @@ isbrk:
        ; can be considered - but this is simple and efficient.
        and #4 : bne killit

+       ; We allow interrupts during system calls in general, but we need to mark that this has
+       ; happened so that when the interrupt returns it leaves the system in supervisor mode.
+       ; Also, disable the preempt timer interrupt now because we don't want that one to fire.
+       dec var_interrupt_from_supervisor
+       lda #$40 : sta VIA_IER
+       cli
+
        jsr syscall      ; dispatch the system call

+       ; Remask interrupts, re-enable the preempt timer, clear the flag saying any interrupt
+       ; was from supervisor mode
+       sei
+       lda #$c0 : sta VIA_IER
+       inc var_interrupt_from_supervisor
+
        bcc resume       ; resume current process if carry clear

        ; Carry set means the process is blocked, so we need to arrange for the syscall to repeat.
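As a rough illustration of the flag mechanism in the diff above, here is a small Python model. It is a sketch rather than the kernel code: the class and method names are invented for the example, and the IER handling assumes the usual 6522 convention where bit 7 selects set/clear and bit 6 is Timer 1 (taken here to be the preempt timer).

```python
class KernelModel:
    T1_BIT = 0x40  # assumed: 6522 IER bit 6 = Timer 1 (preempt timer)

    def __init__(self):
        self.interrupt_from_supervisor = 0  # stz var_interrupt_from_supervisor
        self.ier = self.T1_BIT              # preempt timer interrupt enabled
        self.log = []

    def write_ier(self, value):
        # 6522 IER write: bit 7 set enables the selected bits, clear disables them
        if value & 0x80:
            self.ier |= value & 0x7F
        else:
            self.ier &= ~value & 0x7F

    def irq(self):
        # bit var_interrupt_from_supervisor : bmi superirqhandler
        if self.interrupt_from_supervisor & 0x80:
            self.log.append("superirqhandler")
        else:
            self.log.append("irqhandler")

    def syscall(self, body):
        # dec: 0 -> $FF, which is negative, so a nested IRQ takes the special path
        self.interrupt_from_supervisor = (self.interrupt_from_supervisor - 1) & 0xFF
        self.write_ier(0x40)  # lda #$40 : sta VIA_IER  (disable preempt timer)
        body()                # cli ... jsr syscall ... sei
        self.write_ier(0xC0)  # lda #$c0 : sta VIA_IER  (re-enable preempt timer)
        # inc: $FF -> 0, back to the normal handler path
        self.interrupt_from_supervisor = (self.interrupt_from_supervisor + 1) & 0xFF


k = KernelModel()
k.irq()           # interrupt while in user mode
k.syscall(k.irq)  # interrupt arriving in the middle of a system call
k.irq()           # interrupt after the system call has finished
print(k.log)
```

The point of using dec/inc on a memory byte is that neither the marking nor the test (bit) needs A, X or Y, so nothing has to be saved before the branch.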


Posted: Sat Dec 23, 2023 3:38 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
I haven't had a lot of time recently, but I wanted to add a video output circuit as I was tired of the slow speed of the serial communication. I could increase the baud rate, but that still wouldn't get me to where I want to be: serial output ends up being too much of a burden on the system, and all the user processes end up spending all their time waiting for the serial line.

I won't go into full detail about the video circuit here; I'll post that separately when I get time. Briefly, I wanted something capable of decent text output without requiring a lot of video RAM, so I went for a VGA-style text display with 80x30 character cells, each 8x16 pixels. To keep it simple at first I also made it black-and-white only, with no character attributes.

As I've said before, all my past video circuits were synchronised with the CPU clock - or rather, the CPU clock was generated from the video circuit. It's a pattern I copied from the BBC Micro. It works well but removes a lot of flexibility on the CPU side of things, and I've been meaning to do it differently "next time" - so here next time is. This one runs the video circuit at 640x480 VGA frequency, while the main computer is still running at 4MHz. The video circuit captures all writes to paged memory, picks out those to physical pages $F0-$F7, buffers them (1-deep), and writes them into its own private video RAM at the next opportunity. The computer's main RAM also accepts these writes, so the CPU can read the data back from there like any other RAM.

Here's a photo of the circuit and test output:
Attachment: 20231223_133819.jpg
Attachment: 20231222_061117.jpg
Attachment: 20231223_133727.jpg

Here the kernel is mapping one of its logical pages to physical page $F0, which the video circuit decodes as video memory, and writing data there as part of its initialisation sequence. I used the chequerboard pattern to check the screen bounds were good and make sure the monitor's calibration worked well, and overlaid a ruler so I could check the overall width was correct.
Code:
videotest:
.(
    ; Map video memory at LP1
    lda #$f0 : sta PT_LP1W : sta PT_LP1R

    stz zp_ptr
    ldx #>LP1 : stx zp_ptr+1

    ldx #30    ; count 30 rows
loop2:
    ldy #0
loop:
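    ; convert the low nibble of the column number to an ASCII hex digit
    tya : and #15
    cmp #10 : bmi skipletter   ; A < 10: carry is clear, skip the letter adjustment
    adc #6                     ; A >= 10: carry is set, so this adds 7 and leaves carry clear
skipletter:
    adc #48                    ; carry is clear on both paths, giving '0'-'9' or 'A'-'F'

    cpx #16 : beq is16   ; when the row counter X (counting down from 30) is 16, display the hex digit
    lda #$b1             ; otherwise display the chequerboard pattern
is16: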

    sta (zp_ptr),y
    sta (zp_ptr),y                            ; extra write to work around hardware bug

    iny : cpy #80 : bne loop                  ; 80 columns

    clc : lda zp_ptr : adc #$80 : sta zp_ptr  ; advance to next row (stride = $80)
    lda zp_ptr+1 : adc #0 : sta zp_ptr+1

    dex : bne loop2                           ; stop after 30 rows

    rts
.)
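The hex conversion in that routine leans on the carry flag left behind by cmp, which is easy to misread. Here is a small Python model of it, together with the row/column addressing implied by the $80 stride. It is only a sketch: the function names are invented for the example, and it assumes decimal mode is off.

```python
def hex_digit(column):
    """Mimic: tya : and #15 : cmp #10 : bmi skip : adc #6 : skip: adc #48"""
    a = column & 15
    carry = a >= 10          # cmp #10 sets carry when A >= 10 (and N when A < 10)
    if carry:                # bmi not taken, so the adc #6 executes
        a = a + 6 + 1        # adc adds the carry too: +7 in total
        carry = False        # result is well below 256, so carry comes out clear
    a = a + 48 + int(carry)  # adc #48 with carry clear on both paths
    return a & 0xFF


def cell_offset(row, col, stride=0x80):
    """Byte offset of a character cell from the start of the mapped video page."""
    return row * stride + col


print("".join(chr(hex_digit(c)) for c in range(16)))
print(hex(cell_offset(29, 79)))  # last cell of an 80x30 screen
```

With a stride of $80, the last of the 30 rows starts at offset $E80, and its final column is at $ECF.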

I will probably add a syscall to allow any user process to also map this page, and they can just fight over it - all I want is to use it for more real-time output from the processes, so I can design them to each draw only in one part of the screen. In future I could use more RAM pages to provide multiple framebuffers, with some form of switching capability like virtual consoles in Linux, but the point here for test purposes was to see them all running together so this is what I'll do first.

There is a bug at the moment which leads to some write operations not being performed, or perhaps not being performed correctly. I'm not sure what's causing it: it is hard to observe happening, and the mitigations I tried at various points in the write pipeline didn't improve matters. Making the code execute every write operation twice in a row did seem to work around the problem at least. I will have to diagnose it in more depth another time.

I am also pretty sure my shift register is broken - it behaves very strangely, in ways that I have also found workarounds for but which shouldn't be necessary. I mean to also try swapping it for another one but it is trapped under wires at the moment!


Posted: Sat Dec 23, 2023 4:05 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Splendid! Any special care to move data from one clock domain to the other?


Posted: Sat Dec 23, 2023 4:38 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
BigEd wrote:
Splendid! Any special care to move data from one clock domain to the other?

No, not in this case; I cut as many corners as I could. The main system's RAM /WE line also goes into the '574 CP inputs, so that they capture the address and data bus states at the end of the write operation. Then a PLD clocked by the VGA pixel clock (25.175MHz) samples /WE plus the high bits of the physical page, watching out for cases where /WE is low and the physical page is $F0-$F7. When it spots the end of such a condition (i.e. it was satisfied on the previous VGA pixel but not the current one), it latches that a write is pending, then uses that latch to conditionally activate the write phase at an appropriate time (when the video system isn't reading from the RAM). It would fail in various ways if that condition was somehow noisy, or activated at an inappropriate time.
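The end-of-condition detection described above can be modelled in a few lines of Python. This is a behavioural sketch only, not the PLD equations: the class name, the `video_reading` input and the exact ordering of the registered updates are assumptions made for the example.

```python
def is_video_write(we_n, page):
    """Condition the PLD watches: /WE low and physical page in $F0-$F7."""
    return we_n == 0 and 0xF0 <= page <= 0xF7


class WriteCapture:
    def __init__(self):
        self.prev = False      # condition as sampled on the previous pixel clock
        self.pending = False   # a captured write is waiting to reach video RAM
        self.writes_done = 0

    def pixel_clock(self, we_n, page, video_reading):
        cond = is_video_write(we_n, page)
        # Act on the registered state first: perform a pending write when the
        # video side is not using the RAM.
        if self.pending and not video_reading:
            self.writes_done += 1
            self.pending = False
        # Condition held on the previous pixel but not this one: the CPU write ended.
        if self.prev and not cond:
            self.pending = True
        self.prev = cond


cap = WriteCapture()
samples = [
    (1, 0xF0, False),  # idle
    (0, 0xF0, True),   # CPU write to video page in progress
    (0, 0xF0, True),
    (1, 0xF0, True),   # /WE rises: end of write, pending gets latched
    (1, 0xF0, True),   # video still reading, write has to wait
    (1, 0xF0, False),  # free slot: write goes to video RAM
]
for we_n, page, video_reading in samples:
    cap.pixel_clock(we_n, page, video_reading)
print(cap.writes_done, cap.pending)
```

Because the buffer is only one deep, a second qualifying write that finishes while `pending` is still set would overwrite the first in this model.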

I guess the opportunities for metastability are where the page bits may be changing at the same time as the VGA clock rises, or where the VGA clock coincides with an edge of /WE (which is gated by PHI2 as usual). The page bits are provided by a RAM lookup during phase 1.

A better design may be to use external D flipflops to track whether a write is pending, so that one can be triggered by the edge of /WE and the other can synchronise that to the VGA clock, then the PLD could read from there, one clock cycle later, to get some protection from metastability.
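A behavioural sketch of that two-flip-flop idea, in Python. It only shows the one-clock delay between the write ending and the PLD seeing it; it cannot model metastability itself. The names and the `acknowledge` step used to clear the first flip-flop are assumptions for the example, not part of the design as described.

```python
class PendingSync:
    def __init__(self):
        self.ff1 = False  # set by the /WE edge, in the CPU clock domain
        self.ff2 = False  # ff1 resampled on the VGA clock

    def end_of_write(self):
        """Rising edge of /WE at the end of a qualifying write."""
        self.ff1 = True

    def vga_clock(self):
        """One VGA clock edge. Returns what the PLD reads on this edge,
        i.e. ff2 as it was registered on the previous edge."""
        seen = self.ff2
        self.ff2 = self.ff1
        return seen

    def acknowledge(self):
        """Hypothetical clear once the write has been copied to video RAM."""
        self.ff1 = False


sync = PendingSync()
trace = [sync.vga_clock()]      # nothing pending yet
sync.end_of_write()
trace.append(sync.vga_clock())  # ff2 captures ff1 here, PLD still reads False
trace.append(sync.vga_clock())  # one clock later the PLD sees the pending write
sync.acknowledge()
trace.append(sync.vga_clock())  # ff2 still holds the old value for one clock
trace.append(sync.vga_clock())
print(trace)
```

The extra clock of latency is what gives the second flip-flop time to settle before the PLD uses its output.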


Posted: Sat Dec 23, 2023 5:21 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Thanks… so what you have at present is a kind of testbed which might illustrate how often any kind of metastability problem might turn up…


Posted: Sun Dec 24, 2023 6:43 am

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 684
Location: Potsdam, DE
I'm at the stage of working out the character set for my half-svga display: would you care to share your character rom?

Thanks,

Neil

