6502.org Forum

All times are UTC




Posted: Sun Dec 26, 2010 5:02 pm

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
GARTHWILSON wrote:
What I mean is that it would normally be non-preëmptive, coöperative multitasking, but that if a task fails and does not give control back, the timer would time out, something that won't ever happen if everything is working as it should, and correct the problem. When all is working as intended, the timer keeps getting reset by the beginning of a task before it ever times out.


That's still preemptive multitasking.

By definition, preemptive multitasking is when you don't need to rely on cooperation between threads. How this is achieved is an implementation detail. Moreover, absolutely nothing about preemptive multitasking encourages CPU time hogging. All preemptive multitasking ensures is that such hogging doesn't bring the system to its knees. You'll notice that software written for AmigaOS, Linux, and nowadays even Windows never, ever uses busy-waiting loops. It always goes through OS function calls to wait for events to occur. This allows the OS to reschedule the tasks as needed. This is the same as invoking PAUSE in Forth.

I refer you again to AmigaOS, an easy system to study, where the overwhelming majority of software spends its time waiting for I/O, for the user, or on other tasks. OS functionality eagerly attempts to reschedule tasks. If you were to reprogram the CIA's timer to never interrupt the kernel, you'd stand a good chance of still being able to use the system as long as all the tasks you run behave themselves.

The simplest possible preemptive system you can make, in fact, is a cooperative task switcher which has a timer interrupt that invokes PAUSE periodically. Changing the ISR to check for liveness isn't that hard of an enhancement over the minimal requirement.
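
To make that concrete, here is a minimal Python sketch (a simulation, not 6502 code) of a cooperative switcher whose "timer interrupt" simply forces a PAUSE once a task has used up its slice. The names `PAUSE`, `polite`, `hog` and `run` are made up for illustration, and one call to `next()` stands in for one unit of CPU time.

```python
from collections import deque

PAUSE = "PAUSE"  # hypothetical marker: the task gives up the CPU voluntarily


def polite(name, log):
    # Does one unit of work, then pauses voluntarily, forever.
    while True:
        log.append(name)
        yield PAUSE


def hog(name, log):
    # Never pauses voluntarily; each yield of None is one unit of work.
    while True:
        log.append(name)
        yield None


def run(tasks, slice_ticks, total_ticks):
    """Round-robin switcher. The simulated timer interrupt forces a switch
    after slice_ticks units of work without a voluntary PAUSE."""
    ready = deque(tasks)
    used = 0
    for _ in range(total_ticks):
        request = next(ready[0])
        used += 1
        if request == PAUSE or used >= slice_ticks:
            ready.rotate(-1)  # hand the CPU to the next task
            used = 0
```

Running `run([polite("A", log), hog("H", log)], slice_ticks=3, total_ticks=9)` leaves `log` as A, H, H, H, A, H, H, H, A: the hog gets cut off after three ticks, while the well-behaved task never comes near its limit.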


Post subject: POC V2
Posted: Mon Dec 27, 2010 5:28 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
GARTHWILSON wrote:
Quote:
How would you keep user space from touching the kernel or I/O hardware?

Suit yourself of course, but in my real-time work, the user program absolutely must have direct and immediate access to the hardware.

I'm thinking a Unix-ish operating system, which would be preemptive and not at all real-time in nature. Allowing user space to touch the hardware in such an environment would put the system at risk of fatality.

Quote:
Non-preemptive multitasking allows a very fast, efficient context switch. It was argued recently (although I can't find it) that if one task has a problem (even if it is not totally crashed), it could bring the whole system to a halt. The idea that comes to mind for that is that NMI (or even a regular IRQ) could be used with a VIA timer, similar to a watchdog timer, and if the timer times out and generates an interrupt, the ISR could take that task out of the rotation and restore the operation of the rest.

What you have just described is preemptive multitasking. The only part that you didn't bring up is the algorithm that would decide when a task switch is to occur.

Incidentally, in POC V1 (and others to follow), the watchdog timer in the Dallas DS1511 real-time clock is programmed to generate a 100 Hz IRQ. In the first version, the IRQ increments a 32 bit software counter and provides time delays for functions that need them. However, the downrange plan is to do process scheduling, which means a jiffy IRQ is necessary.
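
As a rough illustration of the jiffy arithmetic (a Python sketch, not the actual POC firmware), the snippet below models a 32-bit counter bumped by a 100 Hz IRQ and a delay test that still works across wraparound. The function names are hypothetical; only the 100 Hz rate and the 32-bit width come from the post.

```python
JIFFY_HZ = 100             # IRQ rate stated in the post
COUNTER_MASK = 0xFFFFFFFF  # 32-bit software counter


def tick(counter):
    """What the IRQ handler does to the counter: increment, wrapping at 32 bits."""
    return (counter + 1) & COUNTER_MASK


def elapsed(start, now):
    """Jiffies elapsed between two readings, correct across one wraparound."""
    return (now - start) & COUNTER_MASK


def ms_to_jiffies(ms):
    """Round a delay in milliseconds up to whole jiffies."""
    return (ms * JIFFY_HZ + 999) // 1000


def delay_done(start, now, ms):
    """True once at least ms milliseconds' worth of jiffies have passed."""
    return elapsed(start, now) >= ms_to_jiffies(ms)
```

At 100 Hz a 32-bit counter wraps after roughly 497 days, so the masked subtraction in `elapsed` only matters for long uptimes, but it costs nothing to get right.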

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Post subject: Re: POC V2
Posted: Mon Dec 27, 2010 5:37 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
fachat wrote:
Your banked system only has 64k virtual address space (not using the bank byte), right?

Correct. However, in a maximal system (16 MB), that would equate to 256 banks of 48K each for user space (254, actually—a bank is needed for the kernel itself and another for I/O buffers). Assuming one process per bank, one could run a lot of stuff at one time. By way of comparison, my SCO OpenServer box rarely has more than 60-70 entries in the process stack at any given time, and that's with all 12 console screens in use.
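
For anyone who wants to check the bank arithmetic, here it is spelled out in Python. The constant names are made up; the figures (16 MB, 48K of user space per bank, two reserved banks) are the ones given above, and the 64K bank size is the 65816's.

```python
TOTAL_BYTES = 16 * 1024 * 1024   # maximal system from the post
BANK_BYTES = 64 * 1024           # one 65816 bank
USER_BYTES_PER_BANK = 48 * 1024  # user-space portion of each bank
RESERVED_BANKS = 2               # one for the kernel, one for I/O buffers

banks = TOTAL_BYTES // BANK_BYTES              # 256
user_banks = banks - RESERVED_BANKS            # 254
user_ram = user_banks * USER_BYTES_PER_BANK    # bytes available to processes
```

That works out to 254 process slots and a little under 12 MB of user RAM in total.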

Quote:
Before I'd start doing "another" banking scheme, I would also look into some kind of MMU-based system - where MMU can just be a simple lookup table, translating say the upper 4 address bits (A12-15) to eight or more address bits. You can map each 4k block separately with separate pages for each process, have "shared" pages, etc.

That's probably more granularity than I would need.
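
For reference, the lookup-table MMU described in the quote can be sketched in a few lines of Python. This is only a model of the address translation, assuming a 16-entry table with 8-bit entries (so a 20-bit physical address); `translate` and the example tables are hypothetical names, not anything from an existing design.

```python
PAGE_BITS = 12                    # 4K blocks: A0-A11 pass straight through
PAGE_MASK = (1 << PAGE_BITS) - 1


def translate(table, cpu_addr):
    """Map a 16-bit CPU address to a physical address via a 16-entry table."""
    if not 0 <= cpu_addr <= 0xFFFF:
        raise ValueError("CPU address must fit in 16 bits")
    if len(table) != 16:
        raise ValueError("one table entry per 4K block is required")
    page = cpu_addr >> PAGE_BITS  # A12-A15 select the table entry
    frame = table[page]           # table output drives A12 and up
    return (frame << PAGE_BITS) | (cpu_addr & PAGE_MASK)


# Two per-process tables with private frames, plus one shared block at $Fxxx.
proc_a = list(range(0x10, 0x20))
proc_b = list(range(0x20, 0x30))
proc_a[0xF] = proc_b[0xF] = 0xFF
```

With these tables, `translate(proc_a, 0x1234)` gives `0x11234` and `translate(proc_b, 0x1234)` gives `0x21234`, while `$F010` lands on `0xFF010` for both processes.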

Quote:
This looks like you plan a more "monolithic" kernel, with all device drivers in the kernel itself. I did a more modular approach (microkernel), where separate processes talked to various bits of hardware.

Yeah, I'm thinking monolithic kernel for now. It's not as though I'm going to have hardware coming and going all the time. :)

Quote:
I had to introduce semaphores to protect shared hardware resources. For example the PET or C64 timers were used by different processes (and I didn't abstract them into some system device), so the programs had to acquire the semaphore before using them. Not sure how you want to separate the access to "shared" hardware.

I would not allow user space processes to touch any hardware. It would all be through kernel calls. Having the ABORT input on the '816 will be a big help in that regard.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Dec 27, 2010 5:46 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Just a thought: if you have a single kernel task which is the only one able to touch hardware, you have no need to limit the user-mode tasks to 48k of RAM in their banks. If they call the OS using BRK or COP, they can have a simpler memory map: 64k of RAM, either in every bank or in every bank except 0.

Even better, these user mode tasks don't need any memory protection - they can't see anything which they are not allowed to touch. ABORT not needed. (If Bank 0 is shared (so they get 512 bytes, or 2k, or whatever) you can have that memory mirrored throughout, so there are no disallowed bank 0 addresses, just a whole lot of locations which can't be reached by a given user mode task.)


Posted: Mon Dec 27, 2010 6:53 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California
Quote:
What you have just described is preemptive multitasking. The only part that you didn't bring up is the algorithm that would decide when a task switch is to occur.

The task itself (not the OS) decides when to hand control to the next task (at points where a context switch is the easiest); so it never gets preëmpted unless it takes a left turn into the weeds. If it does, the timer times out (something that does not happen as long as things are working right), and the interrupt basically just kills the task. The OS is mostly only involved in setting up new tasks or taking tasks down. As I imagine it, it could, for example, even tell task #6 when it sets it up, or when it sets up or changes #7, "Every time you're done, just jump to task 7's entry point. You don't need to bother me."
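
A small Python simulation of that scheme, in case it helps the discussion. It is only a sketch under my own assumptions: each `next()` call is one timer tick, a task yields `True` when it hands off, and the names `well_behaved`, `runaway` and `run` are invented for the example.

```python
from collections import deque


def well_behaved(work_ticks):
    # Works for work_ticks ticks, then hands off (yields True), repeatedly.
    while True:
        for _ in range(work_ticks - 1):
            yield False
        yield True


def runaway():
    # Took a left turn into the weeds: never hands off.
    while True:
        yield False


def run(tasks, timeout_ticks, total_ticks):
    """tasks is a list of (name, generator). Returns (survivors, killed)."""
    rotation = deque(tasks)
    killed = []
    remaining = timeout_ticks          # the watchdog timer
    for _ in range(total_ticks):
        if not rotation:
            break
        name, task = rotation[0]
        handed_off = next(task)
        remaining -= 1
        if handed_off:
            rotation.rotate(-1)
            remaining = timeout_ticks  # the next task reloads the timer as it starts
        elif remaining == 0:
            rotation.popleft()         # the "ISR" takes the task out of the rotation
            killed.append(name)
            remaining = timeout_ticks
    return [n for n, _ in rotation], killed
```

With a timeout of 5 ticks, tasks that hand off after 2 or 3 ticks never see the timer expire, while the runaway task is dropped from the rotation the first time it gets the CPU.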


Posted: Tue Dec 28, 2010 5:38 pm

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
GARTHWILSON wrote:
The task itself (not the OS) decides when to hand control to the next task (at points where a context switch is the easiest); so it never gets preëmpted unless it takes a left turn into the weeds.


This is equally true of preemptive multitasking too. See my earlier post about never, ever busy-waiting in a multitasking system, even preemptive multitasked systems.

Your watchdog timer approach might take seconds before the timer triggers, but that's too long for good user responsiveness in an open-ended system.

In a closed system, where you and you alone have full control over what tasks are running and when, then you can make cooperative multitasking run rings around preemptive systems. But if you have a system where you don't know what tasks will be run by the customer (like, say, on a desktop computer), then preemptive multitasking (with at least 100Hz interrupt frequency) is the only sane way to go.


Post subject: Re: POC V2
Posted: Tue Dec 28, 2010 6:09 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
fachat wrote:
BigEd wrote:
The nice thing about programmable glue logic is that, with a bit of foresight and a bit of luck, you can build something simple in the first instance and have room for something more complex in the same board at a later point.

Yes that's true. Only that the 65816 does not make it easy to build a simple system.... That's why I started to design my own 6502 extension...

André

I thought this might be an interesting discussion in its own right (not closely tied to BDD's POC plans), so I started a new thread on the merits and difficulties of using 65816 in a system design.

André, I'd be particularly interested to hear your take.

Cheers
Ed


Post subject: POC V2
Posted: Tue Dec 28, 2010 6:33 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
BigEd wrote:
Just a thought: if you have a single kernel task which is the only one able to touch hardware, you have no need to limit the user-mode tasks to 48k of RAM in their banks. If they call the OS using BRK or COP, they can have a simpler memory map: 64k of RAM, either in every bank or in every bank except 0.

Even better, these user mode tasks don't need any memory protection - they can't see anything which they are not allowed to touch. ABORT not needed. (If Bank 0 is shared (so they get 512 bytes, or 2k, or whatever) you can have that memory mirrored throughout, so there are no disallowed bank 0 addresses, just a whole lot of locations which can't be reached by a given user mode task.)

Not exactly. In order for user mode processes to make a kernel call via a software interrupt, the MPU vectors will have to be valid no matter whose 64K segment of RAM is currently visible. Hence some common RAM or ROM (more likely the former) to hold the vectors will be required, or some system code will have to be present at the top of each of the 64K segments. That code would have to be write-protected in some fashion. Otherwise, BRK will "break" the system when the MPU tries to jump through an invalid vector.

Also, the MMU must be visible somewhere at all times so that interrupt front ends can remap the system to the kernel space. So, again, some common addresses have to exist regardless of the present location in RAM. Therefore, allotting a full 64K of unrestricted RAM to each process is not realistic.

My principal objection to the "flat" memory model is the requirement that all processes share the same area for stacks and ZP. How can you run a multitude of processes in a setup like that without using an entire Xilinx 9500 just to police memory accesses? We have to accept the fact that the W65C816S is fundamentally an eight-bit processor with a 16-bit address bus. The 24-bit addressing and 16-bit registers are kludges that exist to maintain 65x02 compatibility, as is the need to refer ZP, stack and vector accesses to the first 64K in RAM. I'm trying to efface these "features" in my design and am willing to forgo a linear memory model in order to be able to "sandbox" processes and prevent system fatality due to wild memory accesses.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Tue Dec 28, 2010 7:44 pm, edited 1 time in total.

Post subject: Re: POC V2
Posted: Tue Dec 28, 2010 6:36 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
BigDumbDinosaur wrote:
BigEd wrote:
Just a thought: if you have a single kernel task which is the only one able to touch hardware, you have no need to limit the user-mode tasks to 48k of RAM in their banks. If they call the OS using BRK or COP, they can have a simpler memory map: 64k of RAM, either in every bank or in every bank except 0.

Even better, these user mode tasks don't need any memory protection - they can't see anything which they are not allowed to touch. ABORT not needed. (If Bank 0 is shared (so they get 512 bytes, or 2k, or whatever) you can have that memory mirrored throughout, so there are no disallowed bank 0 addresses, just a whole lot of locations which can't be reached by a given user mode task.)

Not exactly. In order for user mode processes to make a kernel call via a software interrupt, the MPU vectors will have to be valid no matter whose 64K segment of RAM is currently visible. Hence some common RAM or ROM (more likely the former) to hold the vectors will be required, or some system code will have to be present at the top of each of the 64K segments. That code would have to be write-protected in some fashion. Otherwise, BRK will "break" the system when the MPU tries to jump through an invalid vector.

Also, the MMU must be visible somewhere at all times so that interrupt front ends can remap the system to the kernel space. So, again, some common addresses have to exist regardless of the present location in RAM. Therefore, allotting a full 64K of unrestricted RAM to each process is not realistic.

I was thinking you'd use VP to flip in a different mapping for supervisor mode. The bank 0 in supervisor mode, and the other details of the address map for that mode, don't need to have any particular commonality with the address map for user mode tasks.


Posted: Tue Dec 28, 2010 7:11 pm

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
I had to confirm that the VP# signal was asserted in a logical way. It isn't as logical as I would like, but it's at least predictable and can be worked with.

Because the CPU fetches vectors after the PBR, PC, and P registers have been pushed onto the stack, VP# is asserted after the user task state has been saved. This state is saved on the user-mode stack.

This isn't necessarily an issue if your kernel code shares the stack with the user code (a logical design choice if BRK or COP is used to invoke system services). But, for handling ABORT, this can be a possible cause for live-lock, because if the abort happened while stacking something in user code, you'll put the CPU into an endless loop (as the CPU attempts to stack the process state after receiving ABORT, it'll receive another ABORT) without some kind of "double fault" detection hardware.

This is a good use for the NMI input, frankly. But it is something to be aware of.

(Sorry -- am I showing my QA Engineering side too much again? ;) )


Post subject: Re: POC V2
Posted: Tue Dec 28, 2010 8:46 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
BigEd wrote:
I was thinking you'd use VP to flip in a different mapping for supervisor mode. The bank 0 in supervisor mode, and the other details of the address map for that mode, don't need to have any particular commonality with the address map for user mode tasks.

That was something I had considered. However:

kc5tja wrote:
Because the CPU fetches vectors after the PBR, PC, and P registers have been pushed onto the stack, VP# is asserted after the user task state has been saved. This state is saved on the user-mode stack.

The obvious problem, as highlighted by Samuel, is if a kernel call via BRK is being processed, the data (register values) needed to restore the MPU state upon exit will be on the wrong stack. Ditto for processing other interrupts. The machine would go off into deep space, never to be heard from again—at least until the reset button is pressed. :)

kc5tja wrote:
This isn't necessarily an issue if your kernel code shares the stack with the user code (a logical design choice if BRK or COP is used to invoke system services). But, for handling ABORT, this can be a possible cause for live-lock, because if the abort happened while stacking something in user code, you'll put the CPU into an endless loop (as the CPU attempts to stack the process state after receiving ABORT, it'll receive another ABORT) without some kind of "double fault" detection hardware.

Exactly. The only way to avoid this issue is to have individual stacks for each process, with each stack located in RAM that cannot be accessed by any other process—even the kernel.

It should be kept in mind that although the kernel would, upon context switch, load the stack pointer with the correct address for the currently running process and thus initially ensure the process is working with the correct stack, there's nothing to prevent that process from changing the stack pointer to anything, including the address of a different process' stack (ditto for the DP register). Succinctly stated, there are no privileged instructions in the '816, ergo no wired-in method to prevent a process from trying to access arbitrary locations in RAM. It would take very sophisticated hardware management to prevent something like that from occurring.

kc5tja wrote:
This is a good use for the NMI input, frankly. But it is something to be aware of.

Using NMI wouldn't help with memory protection, as the NMI isn't processed until the completion of the current instruction. By then, the illegal memory access will have taken place.

kc5tja wrote:
(Sorry -- am I showing my QA Engineering side too much again? ;) )

What's the old saying: quality is engineered in, not inspected in?

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Tue Dec 28, 2010 8:56 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I see now that there is a subtlety in returning from what I'm calling supervisor mode: the only mode with hardware in the memory maps, and a mode not subject to any per-task mapping restrictions.

So, we see that the user-mode stack gets the state from the user task, then the vectors are accessed, and in those access cycles the CPLD begins to apply the supervisory mappings. At the end of the interrupt, or the OS call handler, we'd like to do an RTI making use of the (particular) user task's stack, and we'd like also to cancel the CPLD's supervisory mode mapping. Would it be easiest to do that by setting the stack pointer, poking the CPLD with an appropriate command and then executing RTI? All that's needed is careful timing: the RTI should be fetched from the supervisory map, but should read the stacked data from the user task map.

I'm pretty sure we've seen a system elsewhere which has, and solves, this question of careful RTI.


Posted: Tue Dec 28, 2010 10:40 pm

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1043
Location: near Heidelberg, Germany
You could monitor the opcodes, and upon RTI decrease a counter in the CPLD to enter user mode when zero is reached. Each interrupt would have to increase the counter (protected by ML) at the beginning of the interrupt to accommodate stacked IRQs.

André


Posted: Tue Dec 28, 2010 10:49 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
fachat wrote:
You could monitor the opcodes, and upon RTI decrease a counter in the CPLD to enter user mode when zero is reached. Each interrupt would have to increase the counter (protected by ML) at the beginning of the interrupt to accommodate stacked IRQs.

It may be that the CPLD (MMU?) counter could be automatically incremented by monitoring VP, since that signal is asserted when the MPU is ready to load the PC with the vector address. Then your idea of detecting when RTI is executed would take care of decrementing the counter. There is one slight hitch in that intentional "misuse" of RTI to redirect the MPU (along the same line as pushing an address to the stack and then executing RTS to go to that location) would improperly decrement the counter.
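
To pin down the behavior being discussed, here is a toy Python model of such a nesting counter. It is a sketch under assumptions: `ModeCounter` and its methods are hypothetical names, each vector pull is treated as a single event (real logic would have to count once per interrupt sequence, not once per VP cycle), and the clamp at zero is my own addition.

```python
class ModeCounter:
    """Hypothetical CPLD nesting counter: supervisor map while depth > 0."""

    def __init__(self):
        self.depth = 0

    def vector_pull(self):
        # VP asserted: the MPU is fetching an interrupt vector.
        self.depth += 1

    def rti_fetched(self):
        # RTI opcode seen on the bus. Clamp at zero so a stray RTI
        # executed in user mode cannot drive the count negative.
        if self.depth > 0:
            self.depth -= 1

    @property
    def supervisor(self):
        return self.depth > 0


# Normal nesting: kernel call, then an IRQ arrives inside the kernel.
mmu = ModeCounter()
mmu.vector_pull()   # BRK or COP from user code
mmu.vector_pull()   # nested IRQ
mmu.rti_fetched()   # IRQ handler returns, still in the kernel
still_supervisor = mmu.supervisor
mmu.rti_fetched()   # kernel call returns to the user task
back_in_user = not mmu.supervisor
```

The hitch shows up immediately in this model: if kernel code at depth 1 uses RTI as a computed jump, the counter reaches zero and the user map is switched in while the kernel is still running.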

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Tue Dec 28, 2010 11:37 pm

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Since much difficulty seems to be originating from the strong desire to have both user- and supervisor-modes of operation, perhaps a better approach would be to eliminate these modes altogether.

I'd like to re-introduce the notion of a segment. I envision an MMU with segments for direct page (DS), stack (SS), application code (ACS), application data (ADS), and system code (SCS). Note that no system data segment exists, because it's not necessary.

The MMU would assert -ABORT under the following conditions:

If the CPU attempts to fetch any opcode or operand from any address not contained within ACS or SCS, assert -ABORT (attempted execution of data).

If the CPU attempts to read or write to any address not contained in the ADS, SS, or DS, assert -ABORT (unauthorized attempt to access another process' address space).

If the previous opcode was fetched from ACS, the processor is now attempting to fetch from SCS, and that previous opcode was neither BRK nor COP (or some other MMU-recognized gate), assert -ABORT (attempted direct execution of privileged code; nice try there buddy!).

Protection of the MMU's registers occurs through the above protection rules. If the MMU's register set never appears in ADS, SS, or DS, the registers can never be changed by unprivileged code.

Privileged code, then, is identified by where it exists in the address map, not by the establishment of any special modes. If the CPU is granted access to memory defined by SCS, then it has the freedom to do whatever it wants, whenever it wants. This includes, of course, updating the MMU registers.

The advantage of this style of memory management should be clear: it can work independently of whether you have multiple address spaces or not, and it works equally well in either case. You do have the disadvantage of perhaps thrashing the ADS segment settings if you maintain a large number of independent memory regions, but that overhead can be amortized by using multiple ADS segments. (Note: if using a 65816, you can overlap the stack and direct page, so you can use SS for both purposes. This frees up DS for use as a general-purpose data segment like any other.)
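
The three -ABORT rules above can be sketched as a small Python model. This is an illustration under assumptions, not a hardware design: `Segment` and `SegmentMMU` are invented names, each segment is a simple base/limit pair, code running from SCS is exempted from the data-access rule (my reading of "freedom to do whatever it wants"), and hardware interrupts and vector fetches are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    base: int
    limit: int  # exclusive upper bound

    def contains(self, addr):
        return self.base <= addr < self.limit


GATE_OPCODES = {0x00, 0x02}  # BRK and COP opcodes on the 65816


class SegmentMMU:
    def __init__(self, ds, ss, acs, ads, scs):
        self.ds, self.ss, self.acs, self.ads, self.scs = ds, ss, acs, ads, scs
        self.last_opcode = None
        self.last_in_acs = False
        self.privileged = False  # True while executing from SCS

    def fetch(self, addr, byte, is_opcode):
        """Return True if -ABORT should be asserted for this program fetch."""
        in_acs = self.acs.contains(addr)
        in_scs = self.scs.contains(addr)
        if not (in_acs or in_scs):
            return True  # rule 1: attempted execution of data
        if is_opcode:
            if in_scs and self.last_in_acs and self.last_opcode not in GATE_OPCODES:
                return True  # rule 3: direct jump into system code
            self.last_opcode = byte
            self.last_in_acs = in_acs
            self.privileged = in_scs
        return False

    def data_access(self, addr):
        """Return True if -ABORT should be asserted for a data read or write."""
        if self.privileged:
            return False  # assumption: system code may touch anything
        return not (self.ads.contains(addr)
                    or self.ss.contains(addr)
                    or self.ds.contains(addr))
```

In this model an application that executes BRK and then fetches from SCS is let through, a plain jump into SCS is aborted, and the MMU registers stay out of reach as long as they sit outside DS, SS and ADS.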

