BigDumbDinosaur wrote:
I think you may have things a little confused here.
A “bank aligned” program would load to $0000 in the chosen bank, that bank being anything other than bank $00.
the name was probably poorly chosen, but i don't mean that the program HAS to start aligned with a bank boundary, just that it has to be placed at a specific address within any bank, like $8000, $1700, etc.
BigDumbDinosaur wrote:
The choice of bank would have to be decided by the part of your kernel that loads and executes programs. The bank in which a program is executing has nothing to do with where runtime data is stored. It could be in the same bank as the execution bank—which I’ll describe below, but you are not bank-limited with data.
yes, i'd thought it would make sense to use separate banks for code and data (+bss) to allow for larger programs without having the program itself split between multiple banks. the Kernel could for example count the number of usable memory banks and then use the first half for code, and the second half for data.
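a rough Python sketch of that split, just to pin the idea down. the bank numbers and the name `split_banks` are made up for illustration, and a real kernel would first have to probe which banks actually contain RAM:

```python
def split_banks(usable_banks):
    """Split the usable RAM banks into (code_banks, data_banks).

    The first half goes to code, the second half to data (+bss).
    With an odd number of banks the extra one ends up on the data side.
    """
    half = len(usable_banks) // 2
    return usable_banks[:half], usable_banks[half:]


# Assumption for this example: banks $02..$09 are free RAM.
usable = list(range(0x02, 0x0A))
code_banks, data_banks = split_banks(usable)

print("code banks:", [f"${b:02X}" for b in code_banks])
print("data banks:", [f"${b:02X}" for b in data_banks])
```

one side effect of a fixed split like this is that process N can simply get code bank N and data bank N, so the kernel needs no extra bookkeeping to pair them up.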
BigDumbDinosaur wrote:
Incidentally, bank $00 is too valuable to be occupied by user programs. You need that space for direct page(s), stacks, I/O hardware, and enough ROM to get the machine running to the point when it can load a kernel from mass storage (you really don't want to run the kernel from ROM unless absolutely necessary). If at all possible, you want I/O in the same bank as the kernel to avoid the penalty of using long addressing and attendant loss of flexibility in your device drivers. Furthermore, if your system is fast enough, you will have to wait-state I/O (and ROM), which if combined with long addressing, will definitely slow down your system.
well my SBC has IO in Bank 0 and the main ROM in Bank 1, plus i plan on putting the Kernel code in Bank 1 as well...
hmm i might move IO to Bank 1, which would also free up some RAM space in Bank 0, increasing the total amount to 63.75kB.
also i would still have to use long addressing modes when accessing IO, since the Data Bank is set to wherever the calling process is located. and changing it would be slower than just using long addressing, even when moving large chunks of data, since you likely have to move them from or to the process' data chunk anyway. so i might as well leave the Data Bank there and just use long addressing for IO.
So i don't think there is an easy way around the long addressing penalty without making it even slower by messing with the Data Bank Register.
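to put rough numbers on that, here is a small Python tally. it is only a sketch: the cycle counts are the 8-bit-accumulator figures as i read them from the W65C816S opcode table, the switch sequence (PHB, LDA #bank, PHA, PLB, then PLB to restore) is just one possible way to do it, and wait states and indexing penalties are ignored.

```python
# Cycle counts with an 8-bit accumulator (m=1), no wait states.
LDA_ABS = 4    # LDA addr      (uses the Data Bank Register)
LDA_LONG = 5   # LDA long addr (24-bit address, ignores the DBR)
PHB = 3        # save the current DBR
LDA_IMM = 2    # load the IO bank number
PHA = 3        # push it
PLB = 4        # pull it into the DBR (and again later to restore)

# PHB, LDA #bank, PHA, PLB ... accesses ... PLB
SWITCH_OVERHEAD = PHB + LDA_IMM + PHA + PLB + PLB


def cost_long(accesses):
    """Cycles for N IO accesses using long addressing."""
    return accesses * LDA_LONG


def cost_switch(accesses):
    """Cycles for N IO accesses after temporarily switching the DBR."""
    return SWITCH_OVERHEAD + accesses * LDA_ABS


def break_even():
    """Smallest access count at which switching the DBR is cheaper."""
    n = 1
    while cost_switch(n) >= cost_long(n):
        n += 1
    return n


print("switch overhead:", SWITCH_OVERHEAD, "cycles")
print("switching only wins from", break_even(), "accesses onward")
```

and that only counts accesses to the IO bank. in a copy loop the other side of the transfer sits in the process' data bank, so with the DBR pointed at IO that side needs long addressing instead and the saving disappears.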
BigDumbDinosaur wrote:
Assuming your system has a sane memory map and your operating system doesn't develop a bad case of creeping featurism, you should be able to run the kernel in bank $00 RAM as low in memory as you can place it, keeping in mind that the kernel’s direct page doesn’t have to be at the physical zero page—although the kernel's DP should start on a page boundary for best performance, and the stack can be anywhere convenient—there is no page alignment issue with the stack.
In planning this, you also have to account for the direct page and stack requirements of user-land programs that are to be loaded, which programs would be running in extended RAM (RAM starting at $010000), but whose direct pages and stacks would be competing for bank $00 space. The task switcher in your kernel would have to determine how to manage direct pages and stacks to avoid collisions.
now you got me curious, how would you define an Insane Memory Map?
anyways, my idea for DP and the Stack is rather simple. each process gets assigned its own DP somewhere in the first 4kB of RAM (which is enough for 14 processes + 2 for the Kernel). and the remaining ~60kB get split into 4kB chunks for the stacks (enough for 14 processes + ~1 for the Kernel).
4kB of Stack should be plenty even for C Programs if they're smart about passing parameters (ie using pointers for large structs and such)
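to sanity-check those numbers, here is a rough Python sketch of the layout. it assumes 256-byte direct pages, bank 0 RAM running from $0000 to $FEFF (63.75kB), and stacks that grow down from the top of their chunk. all names are made up for illustration.

```python
DP_SIZE = 0x100          # one direct page per slot
DP_REGION_END = 0x1000   # first 4kB of bank 0 holds the direct pages
STACK_SIZE = 0x1000      # 4kB per stack
RAM_END = 0xFF00         # assumption: RAM ends at $FEFF, IO/ROM above


def direct_page(slot):
    """Base address to load into the D register for this slot."""
    base = slot * DP_SIZE
    if slot < 0 or base + DP_SIZE > DP_REGION_END:
        raise ValueError("no direct page for slot %d" % slot)
    return base


def stack_top(slot):
    """Initial stack pointer for this slot (the stack grows downward)."""
    base = DP_REGION_END + slot * STACK_SIZE
    if slot < 0 or base + STACK_SIZE > RAM_END:
        raise ValueError("no full 4kB stack for slot %d" % slot)
    return base + STACK_SIZE - 1


DP_SLOTS = DP_REGION_END // DP_SIZE
FULL_STACKS = (RAM_END - DP_REGION_END) // STACK_SIZE
LEFTOVER = (RAM_END - DP_REGION_END) % STACK_SIZE

print("direct page slots:", DP_SLOTS)
print("full 4kB stacks:  ", FULL_STACKS)
print("leftover bytes:   ", LEFTOVER)
print("slot 3: D=$%04X  S=$%04X" % (direct_page(3), stack_top(3)))
```

under those assumptions the leftover after the full stacks is a bit under 4kB, which is where the "~1" stack for the Kernel comes from.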
BigDumbDinosaur wrote:
Since user-land programs can/should be loadable to any available bank, your kernel API front end has to either be called with JSL or via a software interrupt (aka a kernel “trap”); I use the trap method (COP) for APIs. JSL/RTL requires that every program know the location of the kernel’s API jump table. If you later discover that you need to relocate the kernel, you will have to reassemble every program to recognize the new location of the API jump table. These headaches are avoided when a kernel trap is used to invoke an API service, since the only thing a program needs to know is the API service’s index and parameter requirements.
yes, API calls through COP or BRK seem like the best option, i remember there being a thread about API calls and such:
viewtopic.php?f=2&t=5434
i'll give it a thorough read later.
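as i understand the trap approach so far, it boils down to something like this Python model. the service names, their numbering and the error code are invented for the sketch, and how the index actually reaches the handler (register, signature byte, stack) is left open:

```python
# Hypothetical kernel services, purely for illustration.
def svc_get_version():
    return 1


def svc_add(a, b):
    # Keep the result within 16 bits, like a 16-bit accumulator would.
    return (a + b) & 0xFFFF


# The table lives inside the kernel; programs never see its address.
API_TABLE = [svc_get_version, svc_add]
ERR_BAD_SERVICE = -1


def kernel_trap(index, *params):
    """Model of the trap handler: validate the index, then dispatch."""
    if not 0 <= index < len(API_TABLE):
        return ERR_BAD_SERVICE
    return API_TABLE[index](*params)


# A "program" only needs the service index and its parameters.
print(kernel_trap(1, 0x1234, 0x0001))
```

since the caller only carries an index, the kernel and its table can move without any program being reassembled.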
BigDumbDinosaur wrote:
In laying out how this would work, it’s useful to think in terms of how a C program is structured in RAM: text, data & BSS (uninitialized working storage). In a 65C816, the parts that would be loaded into the same bank would be text (executable machine code) and data, e.g., data tables, and numeric and string constants used by the running program. Loading static data into the same bank as the program text means it can be accessed with 16-bit addressing, which will improve execution speed (absolute long addressing costs an extra cycle per access and is limited to X-indexing). To facilitate this, the sequence PHK - PLB would be executed at program startup. Furthermore, if static data is in the same bank, runtime pointers can be generated on the stack with PER, one of the keys to building a position-independent program (also PEI is useful in that respect).
i'm a bit confused, why would there be a speed benefit from having the data in the same bank as the code? they use separate Bank Registers, so you can place static (or even bss) data in a different bank and still be able to use 16-bit addresses to access it.
also i would say only the kernel should be allowed to change the program/data bank registers, with processes never touching either register (except reading them out maybe using PHB/PHK).
but i do like the idea of being able to use PER.
BigDumbDinosaur wrote:
BSS is a little more complicated. If the amount of runtime data that will be processed by the program isn’t too large, BSS can be defined in the unused RAM that follows the static data area, which again permits 16-bit addressing. Here again, PER can help with relocation matters. Also, indirect addressing through direct page can be done with word-sized pointers. Code will be somewhat smaller and faster than if BSS originates in a different bank, since the latter will necessitate long or indirect long addressing, with 24-bit direct page pointers in the latter case.
If the anticipated runtime data will exceed the space available after the end of static data, you will have to plan for bank-agnostic addressing, which means indirect long. This also means your kernel has to apportion uninitialized RAM to programs as needed.
honestly i would try to keep things simple and avoid having to deal with multiple banks of data. If bss and static data don't fit into 1 bank, the program will simply not be loaded and just throw an error. same with code size.
if both code and data can occupy multiple banks then, like you said, you run into the issue of accessing it. the process cannot know where they are located due to potential memory fragmentation. the other banks could be sitting anywhere in memory and only the kernel would know where.
so when a process wants to access data from outside the current data bank there would need to be an API Call to ask the kernel to either change the Data Bank Register (and in addition return a 16-bit pointer), or to return a full 24-bit pointer to where the data is located.
either way it sounds like a pain so i won't be dealing with that for the time being.
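just to have the two options written down for later, a small Python model. the block table, the `Cpu` class and both call names are hypothetical, only the bank/offset split of a 24-bit address is how the 65816 actually works:

```python
class Cpu:
    """Minimal stand-in for the one register that matters here."""

    def __init__(self, dbr):
        self.dbr = dbr  # Data Bank Register


# Hypothetical kernel-side table: block id -> (bank, 16-bit offset).
BLOCKS = {0: (0x06, 0x2000), 1: (0x09, 0x0000)}


def join_long(bank, offset):
    """Build a 24-bit address from a bank byte and a 16-bit offset."""
    return ((bank & 0xFF) << 16) | (offset & 0xFFFF)


def split_long(addr24):
    """Split a 24-bit address into (bank, offset)."""
    if not 0 <= addr24 <= 0xFFFFFF:
        raise ValueError("not a 24-bit address")
    return addr24 >> 16, addr24 & 0xFFFF


def api_map_block(cpu, block):
    """Option 1: kernel switches the DBR and returns a 16-bit pointer."""
    bank, offset = BLOCKS[block]
    cpu.dbr = bank
    return offset


def api_get_long_pointer(block):
    """Option 2: kernel leaves the DBR alone, returns a 24-bit pointer."""
    bank, offset = BLOCKS[block]
    return join_long(bank, offset)


cpu = Cpu(dbr=0x05)
print("option 1: offset $%04X, DBR now $%02X" % (api_map_block(cpu, 0), cpu.dbr))
print("option 2: pointer $%06X" % api_get_long_pointer(1))
```

option 1 keeps 16-bit addressing but the process loses direct access to its own data bank until the DBR is switched back, option 2 keeps the DBR but needs indirect long addressing through a 3-byte pointer.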
BigDumbDinosaur wrote:
Designing software to run anywhere on a 65C816 system is a bit of a challenge, but not too difficult once you break free of 6502 coding methods and treat the 816 as a different beast in native mode. Incidentally, some of this is discussed here.
I've been getting used to the 65816 pretty well actually. you have to rethink a bunch of stuff but i still think it has the same "6502 feel" to it.
I'll give that link a read, thanks!