6502.org Forum

All times are UTC




Posted: Mon Dec 04, 2023 2:13 am

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
After posting that simple glue logic for a two-mode system last week, my mind keeps being drawn back to more fully-fledged multitasking, and as I couldn't do anything practical this weekend, I fleshed out my ideas a bit more on paper. This is a bit like the one from last week, but with a process ID register, a page table, and thus a completely virtualised memory map for user processes. It is quite a bit more complex than the last one, but still I think fairly simple given what it (hopefully) achieves.

With appropriate kernel code, and a bit of missing glue logic, this design should support:

  • Up to 255 user processes
  • Clean 64K address space for each process
  • 1MB total physical memory shared between processes, mapped in pages of 4K at a time
  • NMI timer-based pre-emptive multitasking
  • Interrupts and I/O handled by the kernel
  • BRK-based system call interface
  • Flexible I/O decoding - the schematic shows an ACIA, a VIA, and an SD card interface, but there's plenty of address space to decode others

A lot of the specifics of operation would depend on the choices in the kernel code, but my plan is for new processes to be spawned with enough memory mapped to hold their binary code, loaded from SD card or serial connection. They'd perhaps load at $0000 and execute at $0100, allowing initialization of page zero and placing a JMP instruction or other init code at the bottom of the stack. Possibly a page of standard helper routines would also be mapped at $F000, but that probably only makes sense if I add support for it to be read-only.

The process can request more memory using system calls, as well as mapping/unmapping pages if it needs to page through more than 64K, and of course any other process management facilities and I/O would be accessed through system calls as well. Inter-process communication could be done through shared memory.

As the pages are 4K in size, they're identified by the first hex digit of the address - the process's first logical page is $0000-$0FFF, its second one is $1000-$1FFF, etc up to $F000-$FFFF. I initially used a 2K page size but changed it to 4K to simplify some of the circuit a bit.

The supervisor mode has a different memory map which is of course forced by the hardware. It's optimised for simple decoding rather than best use of address space. The memory map is as follows:
Code:
    F000-FFFF  ROM
    E000-EFFF  ACIA
    D000-DFFF  VIA
    C000-CFFF  SD card interface
    B000-BFFF  PID (process ID) assignment interface
    A000-AFFF  PT (pagetable) read/write
    0000-7FFF  RAM (paged)
The lower half is mapped in the same way as in user mode, i.e. split into 4K pages which are mapped through the pagetable. There's not much ROM, just for address decoding simplicity, but plenty of RAM to load kernel code into from SD or serial. The kernel has its own process ID, probably either 0 or 255, and it will use some of its logical pages to access user memory when that's required by system calls.

The upper half is mostly fairly standard I/O decoding. The unusual entries there are the process ID interface and the pagetable interface. The process ID is just an 8-bit register which the kernel can write to. It is write-only - reading wouldn't be hard to add but doesn't seem worthwhile - and its main function is to influence the virtual memory paging.
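As a rough illustration of the supervisor memory map above, the decoding can be modelled by looking only at the top hex digit of the address. This is a minimal Python sketch rather than the actual glue logic: the function name and the "unmapped" label for $8000-$9FFF (which the map does not assign) are assumptions made for the example.

```python
# Supervisor-mode decode, keyed on the top hex digit of the CPU address.
SUPERVISOR_MAP = {
    0xF: "ROM",
    0xE: "ACIA",
    0xD: "VIA",
    0xC: "SD",
    0xB: "PID",   # process ID register
    0xA: "PT",    # pagetable read/write window
}


def decode_supervisor(addr):
    """Return the name of the device selected in supervisor mode."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address must be 16 bits")
    if addr < 0x8000:
        return "RAM"  # lower half: paged RAM, mapped through the pagetable
    # $8000-$9FFF is not listed in the map, so it is reported as unmapped here.
    return SUPERVISOR_MAP.get(addr >> 12, "unmapped")
```

For example, `decode_supervisor(0xFFFC)` selects the ROM (where the vectors live), and `decode_supervisor(0xB000)` selects the PID register.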

The pagetable is 4K in size, stored in an extra static RAM module. It is accessible (read and write) by the kernel at $A000-$AFFF. This allows the kernel to configure which pages of physical memory are visible to individual processes (including the kernel itself, in its lower 32K of address space). Within the pagetable, the lower 8 bits of address contain the process ID, and the upper 4 bits contain the logical page number. In user mode, these are plumbed through all the time; in kernel mode, they are also plumbed through when the lower 32K is accessed, but during pagetable read/write operations, the address and data bus of the pagetable (PTA and PTD in the schematic below) are connected one-to-one to the CPU's address and data buses.

Each location in the pagetable stores 8 bits of data, and these are prefixed to the low 12 address lines from the CPU to form a 20-bit address (hence 1MB) which is used to drive the RAM.
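To make the translation concrete, here is a minimal Python sketch of the lookup as described: the pagetable address is formed from the logical page (upper 4 bits) and the process ID (lower 8 bits), and the 8-bit entry is prefixed to the low 12 CPU address bits. The function names and the use of a `bytearray` for the pagetable are assumptions for illustration only.

```python
def pagetable_index(pid, addr):
    """12-bit pagetable address: logical page in bits 11-8, PID in bits 7-0."""
    page = (addr >> 12) & 0xF
    return (page << 8) | (pid & 0xFF)


def translate(pagetable, pid, addr):
    """Map a 16-bit logical address to a 20-bit physical address."""
    entry = pagetable[pagetable_index(pid, addr)]  # 8-bit physical page number
    return (entry << 12) | (addr & 0xFFF)


# Usage: map logical page 2 of process 3 to physical page $5A.
pagetable = bytearray(4096)
pagetable[pagetable_index(3, 0x2000)] = 0x5A
physical = translate(pagetable, 3, 0x2ABC)  # physical address $5AABC
```

With 256 possible entry values and 4K pages, this covers 256 x 4K = 1MB of physical memory, matching the figure above.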

One last subtlety is the transition between modes. Supervisor mode can only be entered by the processor reading the vectors, in response to a BRK system call, an IRQ from hardware, or an NMI (used for pre-emptive multitasking). It is exited just after the opcode fetch of the instruction following any write to the PID register (thanks Proxy for the suggestion of using SYNC for this). This means the kernel can write a new PID then immediately RTI to resume a user process. (Though now that I think about it, this isn't going to work, as it would need an extra PLA or something in the middle to restore the register that was used to hold the process ID being written to the PID register. I might make it wait for two instructions, or use another operation (e.g. a read with BIT) to trigger the mode transition.)

Finally, here's the schematic I have so far. I'm not sure I have the patience to build a breadboard prototype this time as there are just so many buses to wire up, that gets quite tedious - so after giving it some time to settle, I might go straight to PCB layout this time.
Attachment: 6502multitaskingcomputer-schematic.png (Multitasking 6502 computer schematic)
In the schematic we have:
  • Left side - 6502 (U1), ROM (U16)
  • Central columns - pagetable management - pagetable (U11), PID register (U8), pagetable address multiplexing (U9, U10), pagetable data interface (U13)
  • Top right - main RAM (U12, U15) and RAM selection logic (U5B, U4B)
  • Bottom centre - supervisor flag (U2A, U2B)
  • Bottom right - ROM and I/O address decoding (U3, U4A, U5A, U6A)

Interesting signals in the schematic:
  • D[0..7], A[0..15] - CPU buses
  • PTD[0..7], PTA[0..11] - Pagetable buses
  • PTCS, PTOE, PTWE - read/write the pagetable
  • PIDCS, PIDWE - write to the process ID register
  • SUPER - supervisor mode flag
  • ROMCS, ACIACS, VIACS, SDCS - ROM and I/O selection signals

Some points of note:
  • The address mapping going into the PTA bus - there are two mappings, if we're reading/writing the pagetable then we pass A[0..11], otherwise we pass PID[0..7] and A[12..15]. U10 selects between the top four bits, while U8 or U9 provides the bottom 8 bits.
  • The pagetable's OE signal needs to be asserted at all times other than pagetable write operations (U6A)
  • The RAM is selected when not in SUPER mode, and also when accessing the bottom half of the address space (U5B)
  • U2A and U2B (super mode flags) are both set on VPB. U2A is cleared when the PID register is written. U2B copies this state at the end of the next instruction fetch cycle (using SYNC).
  • The inverters can probably be implemented using spare gates from elsewhere.
  • The mechanism to drive NMI isn't shown, it will just be a countdown timer, probably reset while in supervisor mode to prevent NMIs in that mode.
  • RESET, IRQ, and PHI2 need to be driven in the usual ways - I'm not anticipating any clock stretching here as the large RAM modules are only 55ns, so can't go above about 10MHz anyway.
  • U13 is the wrong way around, it needs its A and B pins swapped - oops!
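The behaviour of the two supervisor flip-flops described in the notes above can be sketched as a small state model. This is a hedged Python illustration only: the class and method names are made up, it assumes U2B is the flip-flop that drives SUPER, and it ignores all real timing.

```python
class SuperFlag:
    """Toy model of U2A/U2B: both set on VPB, U2A cleared by a PID write,
    U2B copying U2A at the end of the next opcode fetch (SYNC)."""

    def __init__(self):
        # Assumption: the system starts in supervisor mode (reset pulls a vector).
        self.u2a = True
        self.u2b = True

    def vector_pull(self):
        # BRK, IRQ, NMI or reset: both flags are set immediately.
        self.u2a = True
        self.u2b = True

    def pid_write(self):
        # Kernel writes the PID register: request an exit, but stay in
        # supervisor mode until the next opcode fetch has completed.
        self.u2a = False

    def end_of_opcode_fetch(self):
        # End of a SYNC cycle: U2B copies the state of U2A.
        self.u2b = self.u2a

    @property
    def super_mode(self):
        return self.u2b
```

In this model a PID write followed by one opcode fetch (for example the RTI) is what drops SUPER, which is the sequence the mode-transition description relies on.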

Compared to similar systems: my understanding is that Andre has also implemented a system like this using normal RAM as the pagetable, but I think I've only seen schematics for his older system, which used a more specialized IC for the pagetable (I think it was dual port?). I also read BDD's thoughts on this sort of thing in his POC thread, but I think I need a different approach here: for the 6502 it doesn't make as much sense to try to mesh with features of the CPU itself, as it doesn't internally have things like banking at all.

As always, any questions or corrections are much appreciated, as well as other pointers to similar systems!


Posted: Mon Dec 04, 2023 5:15 am

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
Honestly i'm kinda shocked how little actual logic this design seems to require.
damn you people, for making me want to design something like this myself seeing how all of this should easily fit in a CPLD and some faster Memory ICs! :lol:

anyways, looking at the schematic, the only thing i've really noticed is that U9 is a '245 but set to permanently output to the same side. at that point wouldn't it function the same as a '244 or similar uni-directional tri-state buffer?

also when you make the PCB, i can recommend Freerouting in case you feel dread when thinking about routing this design manually.


Posted: Mon Dec 04, 2023 7:45 am

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Proxy wrote:
Honestly i'm kinda shocked how little actual logic this design seems to require.
damn you people, for making me want to design something like this myself seeing how all of this should easily fit in a CPLD and some faster Memory ICs! :lol:
I find it takes quite a lot of effort and compromises to reduce the logic so much. For a long time I was trying to keep 32K of ROM in the supervisor memory map, which made decoding RAM accesses much harder. I had to remember my previous goal of simple decoding rather than efficient use of address space, and even then I resisted quite a bit.

The use of one '139 to generate RAMWE and PIDWE was maybe a bit too much corner-cutting, and will probably need to change if I implement read only pages.

I spent quite a while using 2K pages, with 7-bit process IDs and the super flag in bit 7, but switching to 4K pages simplified a lot of things. I also couldn't make up my mind whether the kernel's default RAM (at $0000) should be the main paged RAM or shared with the pagetable IC. I think it is simpler and more powerful the way I have done it here in the end.

If the page table grew then it'd need rearranging again. I'm reminded of Michael's past advice regarding use of comparators to overlay I/O more flexibly, and that might make sense here. I tend to instinctively limit myself to components I've used before though and don't think of these things until it's too late.

Quote:
anyways, looking at the schematic, the only thing i've really noticed is that U9 is a '245 but set to permanently output to the same side. at that point wouldn't it function the same as a '244 or similar uni-directional tri-state buffer?
Yes, I think that would be fine, I don't have any of those though so don't think about using them!
Quote:
also when you make the PCB, i can recommend Freerouting in case you feel dread when thinking about routing this design manually.
I actually enjoy manual routing too much to use an autorouter! We'll see how it goes.


Posted: Mon Dec 04, 2023 8:59 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
For the write-to-PID-and-return-to-user-mode puzzle, perhaps you could use the bottom byte of the address bus as the PID? In this case, no need to have the PID value in a register. And indeed, you don't need a write to the PID area - a read would be enough, which means you could even jump into the PID area, and have it read as an RTI.


Posted: Mon Dec 04, 2023 1:32 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
BigEd wrote:
For the write-to-PID-and-return-to-user-mode puzzle, perhaps you could use the bottom byte of the address bus as the PID? In this case, no need to have the PID value in a register. And indeed, you don't need a write to the PID area - a read would be enough, which means you could even jump into the PID area, and have it read as an RTI.

Trapping reads wouldn't require a lot of changes, though for what you're suggesting we should really only trap SYNC reads, I think. It is an interesting method. It would need some form of indirect jump, to pick an address without using registers - or maybe use the stack and RTS to jump into the PID setting address range?


Posted: Mon Dec 04, 2023 1:49 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Hmm, could write a JMP into RAM and use that perhaps? Or indeed just use JMP indirect, with the address in question being in RAM, which saves a byte!

It's true that an RTI also reads the next address, so yes, SYNC would be needed. Pity!


Posted: Mon Dec 04, 2023 2:35 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3354
Location: Ontario, Canada
There's one aspect of the proposed scheme that makes me squirm, and that's the penalty on clock rate that results from the Page Table RAM. The clock needs to be slow enough to allow the PT RAM to read a value before the main RAM can even begin doing its own read or write. Not that that's a deal breaker... the Page Table also comes with an up side, and as hobbyists we have the luxurious freedom to ignore or prioritize such issues utterly at our own whim! :mrgreen:

That said, and for what it's worth, here are some suggestions to minimize the impact on clock rate (beyond the obvious remark that it'll be best to use the fastest available RAM both for main memory and the PT).

The PT RAM takes its address input from two different sources, and, in the arrangement shown, that tri-state "multiplexing" will impose yet another 10 (or so) ns penalty, almost as if you had *three* levels of RAM chained end to end. :shock: That's a non-trivial impact. I would consider replacing your tri-state "mux" arrangement with an actual multiplexer... specifically, a multiplexer that's based on FET switches, as this type has essentially zero prop delay in the signal path. More on that subject in this post. You could choose a single FET mux that's 12 bits wide, or three 4-bit units. (The latter are drop-in replacements, pin-wise, for 74_257.)

Full disclosure: the FET switch mux's do not have zero delay in the control path. So, if the Select input changes at the beginning of a cycle, that cycle will need a few extra ns. But such cycles will be very much in the minority, and we do have simple clock-stretcher circuits available... 8)


Speaking of variable cycle times, another significant speedup would result from hurrying through the "dead" cycles (during which the 65xx ignores the data returned from memory). A WDC 'C02 can probably run such a cycle 3 or 4 times more quickly than a cycle that's paced to match two successive RAM accesses, and dead cycles are common enough that the net, overall speedup would be significant. But we need to ask, how can we identify the dead cycles?

It'd be possible to identify many of the dead cycles simply by detecting one-byte opcodes (whose 2nd cycle is always dead), and most one-byte opcodes (exceptions: BRK RTI RTS) can be identified simply because they reside in columns 8 or A of the 'C02 opcode map (and no multi-byte opcodes are there). So, a scheme that watches the opcode and SYNC would be pretty easy, but dead cycles associated with indexing, RMW etc. wouldn't be detected.
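As a sketch of that opcode-and-SYNC scheme, the following Python model flags the cycle after an opcode fetch as dead when the opcode's low nibble is $8 or $A, which is where the one-byte implied and accumulator opcodes sit in the opcode matrix. The function names and the trace format (one `(sync, data)` pair per cycle) are assumptions for the example, and it deliberately misses BRK, RTI, RTS and the dead cycles caused by indexing or RMW.

```python
def is_one_byte_opcode(opcode):
    """True for opcodes in columns 8 and A of the opcode map (low nibble)."""
    return (opcode & 0x0F) in (0x8, 0xA)


def dead_cycle_flags(trace):
    """trace: list of (sync, data_byte) per cycle.
    Returns a list of booleans, True where the cycle is a detected dead cycle."""
    flags = []
    previous_was_one_byte_fetch = False
    for sync, data in trace:
        # The cycle right after fetching a one-byte opcode is dead.
        flags.append(previous_was_one_byte_fetch)
        previous_was_one_byte_fetch = bool(sync) and is_one_byte_opcode(data)
    return flags


# Usage: NOP ($EA) followed by LDA #$05 ($A9 $05).
trace = [(1, 0xEA), (0, 0xA9), (1, 0xA9), (0, 0x05)]
flags = dead_cycle_flags(trace)
```

Here only the second cycle (the discarded read after the NOP fetch) is flagged; the LDA operand fetch is a real access and is left alone.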

An alternative solution.. and I cringe to mention this, because it changes the flavor of the project... is to replace the 'C02 with an '816 running in Emulation Mode! The downside is, you'll have to either sidestep (or, better yet, make use of!) the Bank Address that's output during PHI2 low. But there are various upsides.

The list of upsides begins with easy, 100% detection of dead cycles (by monitoring VPA and VDA), and VDA and VPA probably have more utility than simply detecting dead cycles. Other hardware features (off the top of my head) include the /ABORT pin (more or less a third interrupt input, albeit one with odd but trivially accommodated timing requirements).

And, software-wise, the '816 has a lot more opcodes than the 'C02, and it's surprising how useful some of these can be, even in Emulation Mode! Examples include TYX, TXY, BRL (Branch Long) and XBA (exchange the TWO ACCUMULATORS). :shock: This is not a complete list.

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Mon Dec 04, 2023 11:01 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
BigEd wrote:
Hmm, could write a JMP into RAM and use that perhaps? Or indeed just use JMP indirect, with the address in question being in RAM, which saves a byte!

It's true that an RTI also reads the next address, so yes, SYNC would be needed. Pity!

The JMP in RAM is tricky because if that RAM is paged then changing the PID will change which RAM is visible there. Unless the JMP is in the user's memory, but I'd like to avoid that. I'm not sure if I mentioned it above but due to the fact that in super mode the top half of the address space is used for other things, the kernel actually only has 8 logical pages (0,1,2,...,7). Pages 8-F will never be used as $8000-$FFFF are mapped to other things. This means that there are eight bytes in the pagetable that can be used for general storage regardless of PID. They are a bit scattered - it's the first byte in each page from $08 to $0F. But they may be handy for something.
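For illustration, the locations of those eight spare pagetable bytes can be computed with a short Python sketch. It assumes the pagetable addressing described earlier in the thread (logical page in the upper 4 bits, PID in the lower 8), a kernel PID of 0, and the $A000 pagetable window; the function name is hypothetical.

```python
PT_WINDOW_BASE = 0xA000  # supervisor-mode pagetable window ($A000-$AFFF)


def spare_pagetable_offsets(kernel_pid=0):
    """Pagetable offsets for the kernel's logical pages 8-F, which are never
    looked up because $8000-$FFFF is decoded to ROM and I/O in supervisor mode."""
    return [(page << 8) | kernel_pid for page in range(0x8, 0x10)]


offsets = spare_pagetable_offsets()
cpu_addresses = [PT_WINDOW_BASE + offset for offset in offsets]
```

With a kernel PID of 0 this gives offsets $800, $900, ... $F00, i.e. CPU addresses $A800, $A900, ... $AF00.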

I think I have a simple and compact way to implement the trigger-on-read-PID behaviour though so I will probably stick with using something like a BIT instruction rather than executing code as the trigger.


Posted: Mon Dec 04, 2023 11:50 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Dr Jefyll wrote:
There's one aspect of the proposed scheme that makes me squirm, and that's the penalty on clock rate that results from the Page Table RAM. The clock needs to be slow enough to allow the PT RAM to read a value before the main RAM can even begin doing its own read or write. Not that that's a deal breaker... the Page Table also comes with an up side, and as hobbyists we have the luxurious freedom to ignore or prioritize such issues utterly at our own whim! :mrgreen:
Yes it is a shame given this being the opposite direction to what my other recent designs were all about - which was running from RAM as quickly as possible. At least in PDIP form, the large static RAM modules aren't available in fast speeds, so it will never go super fast - however, I think 8MHz should be easily achievable with a symmetric clock. I think the constraint will actually be the main system RAM rather than the pagetable, in that scenario. My back-of-napkin maths is that I need at least 55ns for a read/write to complete, and of course everything needs to be ready before that point. The pagetable RAM only requires 12ns-15ns to supply a new physical page address - and only actually sees an address change when a page boundary is crossed. The CPU will supply a valid address by 15ns-20ns at the latest after PHI2 goes low. So the time requirement overall is maybe about 90ns, plus some overheads, which means around 10MHz could work - however if we assume PHI2 is used to gate the RAM writes, and we want a symmetric clock, it's really the 55ns figure that limits everything, hence maybe 8MHz or 9MHz being the limit. The pagetable lookup occurs in the less constrained low phase, so is actually free overall!
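The napkin maths can be written out as a short Python sketch. The figures are the ones quoted above, taking the upper end of each range; the reading that a symmetric clock needs each half-cycle to be at least as long as the RAM access is an interpretation, and the names are made up for the example.

```python
def max_clock_mhz(period_ns):
    """Highest clock frequency (MHz) for a given minimum cycle time (ns)."""
    return 1000.0 / period_ns


def cycle_estimates(ram_access_ns=55, pt_lookup_ns=15, addr_valid_ns=20):
    """Return (chained_ns, symmetric_ns) minimum cycle times."""
    # Address valid, then pagetable lookup, then main RAM access, back to back.
    chained_ns = addr_valid_ns + pt_lookup_ns + ram_access_ns
    # Symmetric clock with PHI2 gating the RAM: the high phase alone must
    # cover the RAM access, so the whole cycle is twice that.
    symmetric_ns = 2 * ram_access_ns
    return chained_ns, symmetric_ns


chained_ns, symmetric_ns = cycle_estimates()
chained_mhz = max_clock_mhz(chained_ns)
symmetric_mhz = max_clock_mhz(symmetric_ns)
```

This gives a 90ns chained path (a little over 11MHz before overheads) and a 110ns symmetric cycle (about 9MHz), in line with the 8MHz to 9MHz estimate.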

If we were to get creative, we could change the duty cycle, maybe shortening the low phase, especially if the PTA input hasn't changed since the previous cycle. In my Fast PDIP design, I believe the limitation for the low phase was the time taken for the CPU to present a valid address plus very coarse decoding (is A15 high or low?). Probing the breadboard prototype I could see address lines changing up to 20ns later than the falling edge of PHI2, which is consistent with the observed clock speed limit there of about 25MHz. The PCB version can go faster than that, but I'm not sure I specifically measured this latency there.

For a system like this multiprocess one to go that fast, I think you'd need to do away with the pages and just give each process the full 64K of its own memory to use. Then so long as no context switches take place, the system can know based only on A15 which of the two 32K modules to activate, and it should be able to run at the same speeds my other systems, and plasmo's, achieve. You could even support a large number of small processes, to some extent, but not as dynamically - you'd need to guarantee that all of each process's memory lives in very easily-determined locations. Sharing pages between processes would be almost impossible, and it would need a lot more RAM ICs, at least the way I see it in my head.

Quote:
The PT RAM takes its address input from two different sources, and, in the arrangement shown, that tri-state "multiplexing" will impose yet another 10 (or so) ns penalty, almost as if you had *three* levels of RAM chained end to end. :shock: That's a non-trivial impact. I would consider replacing your tri-state "mux" arrangement with an actual multiplexer... specifically, a multiplexer that's based on FET switches, as this type has essentially zero prop delay in the signal path. More on that subject in this post. You could choose a single FET mux that's 12 bits wide, or three 4-bit units. (The latter are drop-in replacements, pin-wise, for 74_257.)

Full disclosure: the FET switch mux's do not have zero delay in the control path. So, if the Select input changes at the beginning of a cycle, that cycle will need a few extra ns. But such cycles will be very much in the minority, and we do have simple clock-stretcher circuits available... 8)
Yes that's true, I should add that 10ns to the napkin maths. :) I'll look up the multiplexers. I still think it won't have much impact on the clock speed if I'm keeping to the slow main RAM, which I'm inclined to do because it's hard to find anything faster in PDIP that's readily available. There are probably a lot of faster SMD options though.

Quote:
Speaking of variable cycle times, another significant speedup would result from hurrying through the "dead" cycles (during which the 65xx ignores the data returned from memory). A WDC 'C02 can probably run such a cycle 3 or 4 times more quickly than a cycle that's paced to match two successive RAM accesses, and dead cycles are common enough that the net, overall speedup would be significant.
This is something I was interested in when working more on my Fast PDIP project, but I didn't get around to trying it. I did an analysis of 6502 bus cycle frequencies that seemed to show that internal cycles were not common enough for there to be much gain in optimising them.

Quote:
An alternative solution.. and I cringe to mention this, because it changes the flavor of the project... is to replace the 'C02 with an '816 running in Emulation Mode! The downside is, you'll have to either sidestep (or, better yet, make use of!) the Bank Address that's output during PHI2 low. But there are various upsides.
It is interesting in any case, and the kind of thing BDD was doing for his POC computer, but I don't yet have any experience with the 816 and wouldn't really be confident even guessing about what a good way to arrange things would be. I think you have to have written a fair amount of code in anger on a system before you really understand what it's like to program for, and what hardware features matter most. So for me at least, it is a long way off my radar! I don't dispute though that it has a lot of very appealing-looking enhancements.


Posted: Tue Dec 05, 2023 2:23 am

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
I thought about some amendments based on the discussion and bugs in the old circuit. I think I can fix the following fairly easily:

  • Make exiting super mode depend on reading PID rather than writing it
  • Make the top bit of the pagetable control writability, so that shared pages can be read-only
  • Add missing NMI-driving circuit

Using the top bit of the pagetable for writability means there are only seven bits left, enough for 128 physical pages; and as each process requires at least one page, that means the effective number of processes is reduced compared to the previous design. The only way around that would be to have a wider pagetable. I seem to remember that Andre's MMU, for example, had 12 bits of storage, which was handy for this (he had more permission bits, things like non-execute pages). It also means we can only address 512K of RAM now, unless other things in the system are changed (larger pages).

The old logic to drive the two RAMCS signals, the RAMWE signal, and the PIDWE signal, was as follows:
Attachment: multitask-old-139s.png
I can replace that entire IC with a 138 and do it like this instead - sacrificing the two banks, and using the PTD7 signal (high bit of pagetable) to disable writes to pages that are mapped read-only:
Attachment: multitask-new-138.png
Note that RAMCS is active high.

In this scheme, if PTD7 is low then the page is read-only - in that case the '138 is disabled and nothing is selected for read or write. This is only intended to be applied to the paged RAM, but will also affect the PID register as a side effect - so I will need to ensure that PTD7 is set correctly on the logical page that the PID register falls in. That's irritating but not hard to initialise on system startup.
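A minimal Python sketch of how a pagetable entry would be interpreted under this revised scheme: bit 7 (PTD7) high means writable, low means read-only, and the remaining seven bits select one of 128 physical 4K pages. The function names are assumptions, and the sketch models only the entry format, not the '138 wiring.

```python
PAGE_SIZE = 4096  # 4K pages


def decode_entry(entry):
    """Split an 8-bit pagetable entry into (physical_page, writable)."""
    writable = bool(entry & 0x80)   # PTD7 high: writes allowed
    physical_page = entry & 0x7F    # seven bits left for the page number
    return physical_page, writable


def addressable_bytes():
    """Physical memory reachable with a 7-bit page number."""
    return 128 * PAGE_SIZE


# Usage: entry $85 is physical page 5, writable; entry $05 is the same page, read-only.
rw_entry = decode_entry(0x85)
ro_entry = decode_entry(0x05)
```

The second helper confirms the 512K figure: 128 pages of 4K each.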

I've had to add a RAMOE signal there that wasn't in the original schematic - this is so that, when writes are inhibited, the RAM does not fall back to driving the data bus, conflicting with the CPU that's trying to execute its write operation.

The new PIDOE signal can now be used to trigger the exit from supervisor mode, with a delay until the next SYNC, just like I did before using PIDWE. The exit code will now look something like this:
Code:
    sta PIDREG  ; set user's process ID
    pla         ; restore user's A value
    bit PIDREG  ; trigger exit from supervisor mode after next instruction fetch
    rti         ; switch to user mode, pop flags and program counter from stack
The corresponding entry code at the start of the interrupt handler can be just:
Code:
    pha         ; push A to user's stack
    stz PIDREG  ; switch paging to kernel's process
with no effect on the SUPER flag, which was already set by the vector pull.

I've also thought a bit about the watchdog-style timer for enforcing pre-emptive multitasking - I think it can be this simple:
Attachment: multitask-preempt-nmi.png

This will count down from a chosen value and trigger NMI. Entry into supervisor mode will hold the counter at its preset value, preventing NMIs while in supervisor mode. The clock source could be PHI2, but that might interrupt processes too often. It could be SYNC instead, or I could just use an independent clock source like a horizontal or vertical sync pulse, or a slow crystal oscillator.
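The intended behaviour of the timer can be sketched as a simple Python model: the counter is held at its preset while SUPER is asserted, counts down otherwise, and signals NMI on reaching zero. This is only a behavioural sketch with made-up names and an assumed 8-bit preset; it is not a cycle-accurate model of the 40103.

```python
class PreemptTimer:
    """Behavioural model of the pre-emption countdown described above."""

    def __init__(self, preset):
        if not 0 < preset <= 255:
            raise ValueError("preset must fit in 8 bits and be non-zero")
        self.preset = preset
        self.count = preset

    def tick(self, super_mode):
        """Advance one clock of the chosen source; return True when NMI fires."""
        if super_mode:
            # Supervisor mode holds the counter at its preset: no NMIs here.
            self.count = self.preset
            return False
        if self.count > 0:
            self.count -= 1
        return self.count == 0


# Usage: with a preset of 3, the third user-mode tick raises NMI.
timer = PreemptTimer(3)
fired = [timer.tick(super_mode=False) for _ in range(3)]
```

Once the NMI vector pull sets SUPER, the next `tick(super_mode=True)` reloads the counter, so the time slice restarts when the kernel returns to user mode.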

I was concerned about what would happen if the count expired during an interrupt sequence, before the SUPER flag is set. There's an interesting thread about that here from over a decade ago, with a visual6502 test case that I've adapted here. This in particular executes a BRK instruction, and arranges NMI to briefly go low at a certain point so we can check the effect. NMI is only recognized if it goes low in the second half of the cycle (PHI2 high) - according to the 6502 datasheet I believe it is being sampled at the falling edge of PHI2. So for our purposes here we need to set odd values for "nmi0=" in the URL and the next even value up for "nmi1=" to turn it off again.

The point I've set it in the link above triggers NMI at the latest point where it will redirect the BRK into being an NMI (changing the fetched vector). Any later and the NMI is ignored completely, unless it occurs at the end of the second vector fetch, in which case it is recognized and triggers on the next instruction instead. I don't want this to happen in my case, but that's OK because as soon as the vector fetch begins, the SUPER flag gets set, and the 40103 downcounter will be reset, so the NMI won't occur that late.

There is a problem here though because if the NMI takes over the BRK instruction, the BRK is not executed at all - I believe this is a well-known NMOS bug. This is because the BRK incremented the program counter during its first two cycles. I believe this is fixed in the 65C02, but it highlights that this investigation in visual6502 might not be the same with the newer CPU, so it probably needs some practical testing with hardware in controlled circumstances. I'd love to hear if anybody has already done this.


Posted: Tue Dec 05, 2023 4:12 am

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
Ooh I just realised that I could feed RWB into an address line of the page table to allow different mappings for reads and writes, rather than giving up a bit from the physical address range. It will require some shuffling around, but seems a good idea.


Posted: Tue Dec 05, 2023 4:34 am

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
in terms of faster RAM, you could check out the W24512AK-10. a 64kB SRAM chip in DIP Package and an access time of 10ns. pretty much the fastest DIP SRAM i could find. (there is also the W241024AK, same timings but twice the capacity at 128kB).
I bought mine from aliexpress because they were only 2.50 EUR a piece and luckily they all work. but of course, YMMV.

then for the large SRAM you could go with the IS61C5128AL, 512kB with a 10ns access time. downside is that it's SMT, using the SOJ package. but honestly it's not that bad to solder if you have one of those cheap manual desoldering pumps.


Posted: Tue Dec 05, 2023 4:51 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8178
Location: Midwestern USA
Proxy wrote:
then for the large SRAM you could go with the IS61C5128AL, 512kB with a 10ns access time. downside is that it's SMT, using the SOJ package. but honestly it's not that bad to solder if you have one of those cheap manual desoldering pumps.

Before vision in my left eye went south, I had no trouble manually soldering SOIC and SOJ packages.  Garth has a technique that is similar to what I was using, which produces professional-looking joints.  Here’s a closeup of Garth’s handiwork on an SOJ36 package.

Attachment: pcb_sram_close.gif (512KB SRAM close-up)

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Tue Dec 05, 2023 8:22 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
gfoot wrote:
The JMP in RAM is tricky because if that RAM is paged then changing the PID will change which RAM is visible there.

I see you've moved on with ingenious ideas, but just to backtrack for a second, I don't quite see a problem here: the temporary JMP could be in supervisor space, as it doesn't have any effect until executed, and then it's done with. Changing the RAM mapping on the next sync doesn't hurt (I think!)


Posted: Tue Dec 05, 2023 12:38 pm

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
BigEd wrote:
gfoot wrote:
The JMP in RAM is tricky because if that RAM is paged then changing the PID will change which RAM is visible there.

I see you've moved on with ingenious ideas, but just to backtrack for a second, I don't quite see a problem here: the temporary JMP could be in supervisor space, as it doesn't have any effect until executed, and then it's done with. Changing the RAM mapping on the next sync doesn't hurt (I think!)

Ah yes I think it would work if I made it apply at the start of the next sync - I initially thought it wouldn't though because currently it applies it at the end of the sync, which would be after reading the JMP opcode but before reading the operand.

