All times are UTC
PostPosted: Mon Feb 06, 2006 5:27 pm 

Joined: Wed Jan 22, 2003 6:54 am
Posts: 56
Location: Estados Unidos
..."Direct Zero Page" (DZP) register? It can!!

Hey! The '816 does it so why can't our little buddy the '02?

I love the zero page. I hate the fact that it is always in the same place. With some ingenuity (not really), this fantasy can become reality.

Imagine, if you will, a latch. A latch that holds the upper 8 bits to be used with zero page accesses. A latch that has its output enable activated by the high byte of the address being equal to zero. There must needs be a bus transceiver (or whatever) to gate off the original high 8 CPU address lines whenever the latch is driving.

The software just writes to this latch when it wants to change the DZP location. Anytime logical page zero is accessed, the physical page number in the latch is used instead. This can make things like 256-byte circular buffers fun. Of course, it may all be entirely superfluous.
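A rough sketch of the address translation such a latch would perform, written as a Python model rather than real hardware. The function name `translate` and the parameter `dzp_latch` are made up for illustration; nothing here comes from an existing design.

```python
def translate(addr: int, dzp_latch: int) -> int:
    """Map a 16-bit CPU address to the physical address the memory sees."""
    high = (addr >> 8) & 0xFF
    low = addr & 0xFF
    if high == 0x00:
        # Logical page zero: the latch drives A8-A15 instead of the CPU.
        high = dzp_latch & 0xFF
    return (high << 8) | low


# Point the DZP at page $C0, for instance a 256-byte circular buffer.
print(hex(translate(0x0010, 0xC0)))  # zero-page access is redirected
print(hex(translate(0x1234, 0xC0)))  # every other page passes through
```

In this model, physical page zero can only be reached while the latch itself holds $00, since every logical page-zero access is redirected.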

For more fun, add another latch and drive the output enables based on the high address byte being zero and also the read/~write CPU output. One latch is used for DZP reads, the other for DZP writes. In my thinkery, this dual-latch setup has more dubious benefits. I thought it would mainly be used to help speed up memcpys, but the cycle cost of other addressing modes is the same as zeropage indexed, for example.
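To make the dual-latch idea concrete, here is a small Python model of a page copy done entirely through logical page zero. The names `translate_rw`, `copy_page`, `read_latch` and `write_latch` are assumptions for the sake of the sketch, and the loop only stands in for an `LDA $00,X` / `STA $00,X` pair.

```python
def translate_rw(addr: int, is_read: bool, read_latch: int, write_latch: int) -> int:
    """Pick the read or write latch for page-zero accesses, based on R/~W."""
    high = (addr >> 8) & 0xFF
    if high == 0x00:
        high = (read_latch if is_read else write_latch) & 0xFF
    return (high << 8) | (addr & 0xFF)


def copy_page(memory: bytearray, src_page: int, dst_page: int) -> None:
    """Model of an LDA $00,X / STA $00,X loop over all 256 offsets."""
    for x in range(256):
        value = memory[translate_rw(x, True, src_page, dst_page)]
        memory[translate_rw(x, False, src_page, dst_page)] = value


memory = bytearray(65536)
memory[0x2000:0x2100] = bytes(range(256))
copy_page(memory, 0x20, 0x30)
print(memory[0x3000:0x3100] == memory[0x2000:0x2100])
```

One consequence of this scheme is that a read-modify-write instruction on page zero would read from one physical page and write to the other.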

I'm surely not the only person to have thought of this, but here it is for comments and discussion. What do you think? Could this be useful and for what? Is it unnecessary?

_________________
Thanks for playing.
-- Lord Steve


PostPosted: Mon Feb 06, 2006 9:09 pm

Joined: Thu Jul 01, 2004 12:24 am
Posts: 4
Location: Melbourne, Australia
Cute idea. You could do the same to move the stack around, or make it smaller. Useful for multitasking.

Something else I thought of while considering this sort of thing: is there any way to determine when the CPU is doing an instruction fetch (rather than just normal data)?

If there is, you could record the instruction. If it's a stack access (PHA and friends), you could actually use a separate chunk of memory, thus giving a page back to the program.

Take this further, and you could have zero page in a separate bit of memory too (i.e. LDA $00 is different to LDA $0000). Not sure if there's value in that.

We're starting to get into MMU territory here. Thanks for your post, it's started a whole new topic in my notebook of things to try and implement one day :P


PostPosted: Mon Feb 06, 2006 10:45 pm

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
rob wrote:
Cute idea. You could do the same to move the stack around, or make it smaller. Useful for multitasking.


I would like to point out that, IIRC, the Commodore 128's MMU chip did support moving zero and stack pages around in memory, though I don't think it was used often.

Quote:
Something else I thought of while considering this sort of thing: is there any way to determine when the CPU is doing an instruction fetch (rather than just normal data)?


The _SYNC signal is asserted when fetching an opcode byte. It is not asserted when fetching operand bytes however.

The 65816 has better disclosure of its intentions, as it replaces the _SYNC signal with VPA and VDA signals:

Code:
VDA VPA Operation
-----------------------
 0   0  Internal operation; no valid bus cycle.
 0   1  Fetch from program space: operand bytes
 1   0  Fetch or store from/to data space
 1   1  Fetch from program space: opcode byte
-----------------------


BTW: VDA stands for Valid Data Address, and VPA stands for Valid Program Address.
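The table above can be turned into a tiny decoder. This is only a Python model of the truth table; the function name `bus_cycle_type` and the returned strings are illustrative choices.

```python
def bus_cycle_type(vda: int, vpa: int) -> str:
    """Classify a 65816 bus cycle from its VDA and VPA outputs."""
    table = {
        (0, 0): "internal operation",
        (0, 1): "operand fetch",
        (1, 0): "data access",
        (1, 1): "opcode fetch",
    }
    return table[(vda & 1, vpa & 1)]


for vda in (0, 1):
    for vpa in (0, 1):
        print(vda, vpa, bus_cycle_type(vda, vpa))
```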


Post subject: Stack moving
PostPosted: Tue Feb 07, 2006 10:24 pm

Joined: Wed Jan 22, 2003 6:54 am
Posts: 56
Location: Estados Unidos
Being able to move the stack around is an even more awesomer idea. Pushing a register only takes 3 cycles, including decrementing SP. That is, of course, cheaper than storing then incrementing a pointer.
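For a sense of scale, the cycle counts can be added up. The figures below are the standard 6502 timings for these instructions; the helper `cost` and the choice of which sequences to compare are just for illustration.

```python
# Documented 6502 cycle counts for the instructions being compared.
CYCLES = {
    "PHA": 3,
    "STA zp,X": 4,
    "INX": 2,
    "STA (zp),Y": 6,
    "INY": 2,
}


def cost(*instructions: str) -> int:
    """Total cycles for a sequence of instructions."""
    return sum(CYCLES[name] for name in instructions)


push = cost("PHA")                    # store and adjust SP in one instruction
indexed = cost("STA zp,X", "INX")     # store through X, then bump X
indirect = cost("STA (zp),Y", "INY")  # store through a pointer, then bump Y
print(push, indexed, indirect)
```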

Come on, you guys... more ideas!

_________________
Thanks for playing.
-- Lord Steve


PostPosted: Tue Feb 07, 2006 10:37 pm

Joined: Thu Jul 01, 2004 12:24 am
Posts: 4
Location: Melbourne, Australia
kc5tja wrote:
The _SYNC signal is asserted when fetching an opcode byte. It is not asserted when fetching operand bytes however.


Interesting. So you could swap in/out different bits of memory depending on the opcode, e.g. with a flip-flop that is set when the opcode is PLA and friends, and cleared on everything else, then use its output to drive CS lines on different memories, giving a "private" stack area. Not sure what practical use there is for this though - most of the time a bigger memory chip would yield the same end result (more RAM available).
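A rough behavioural model of that flip-flop, in Python. It assumes only PHP, PLP, PHA and PLA count as "friends" and that the select is also qualified by the address being in page $01; the class and method names are invented for the sketch, and JSR, RTS, BRK, RTI and interrupts are ignored here.

```python
# Opcodes that set the flip-flop: PHP, PLP, PHA, PLA.
STACK_OPCODES = frozenset({0x08, 0x28, 0x48, 0x68})


class StackSelect:
    def __init__(self) -> None:
        self.flag = False

    def cycle(self, sync: bool, addr: int, data: int) -> bool:
        """Return True if this bus cycle should go to the private stack RAM."""
        if sync:
            # Opcode fetch: remember whether the new instruction is a push/pull.
            self.flag = data in STACK_OPCODES
            return False
        # Later cycles: only page $01 accesses are steered to the private RAM.
        return self.flag and (addr >> 8) == 0x01


ff = StackSelect()
print(ff.cycle(True, 0x8000, 0x48))   # PHA opcode fetch
print(ff.cycle(False, 0x8001, 0xEA))  # discarded read of the next byte
print(ff.cycle(False, 0x01FF, 0x00))  # the actual stack write
```

Without the page $01 qualification, the discarded read in the second cycle of PHA would also be steered to the private memory.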

And I guess you could do paging by watching for a memory-access opcode, then watching for the next two bytes (zeropage & stack as special cases). Then, do the page table lookup and fire an NMI to get the OS to load something useful. Then I think you'd need the ISR to tweak the stack such that after the RTI the op that did the memory access is executed again.

Or you could possibly halt the main processor and have the coprocessor (since we're getting into that kind of complexity) do the work of pulling in the data from disk or whatever.

All of this seems horribly complicated for something that is fairly useless when it's so easy to fill the entire address space with actual RAM (unlike a "real" virtual memory scheme, where you rarely have many GB of actual RAM).

Maybe I'll play with it sometime, for the educational value.

Quote:
The 65816 has better disclosure of its intentions, as it replaces the _SYNC signal with VPA and VDA signals:


That would make things quite a bit simpler. The 65816 has a larger address space too, right? A real MMU would make far more sense in this case.


PostPosted: Wed Feb 08, 2006 12:19 am

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
rob wrote:
Interesting. So you could swap in/out different bits of memory depending on the opcode, e.g. with a flip-flop that is set when the opcode is PLA and friends, and cleared on everything else, then use its output to drive CS lines on different memories, giving a "private" stack area.


Yep. Note that multi-byte instructions (the 2-byte and 3-byte ones) only have _SYNC asserted on the first byte, the opcode; it's negated on the operand bytes that follow. Therefore, your MMU circuitry would need to properly track this.

Quote:
Or you could possibly halt the main processor and have the coprocessor (since we're getting into that kind of complexity) do the work of pulling in the data from disk or whatever.


The early Sun workstations (which used only 68000s) did this, because the 68000 lacked any facility to properly restart an instruction. It wasn't until the 68010 that this capability came about.

Likewise, the 6502 lacks restartability; however, the 65816 exposes an _ABORT pin which, if asserted, aborts the current instruction and generates an interrupt with the return address set to the aborted instruction -- thus, it allows proper instruction restarting.

Quote:
That would make things quite a bit simpler. The 65816 has a larger address space too, right? A real MMU would make far more sense in this case.


Yes, the 65816 supports 16MB address space, provides better support for multiple address spaces, and provides for instruction restartability via the _ABORT pin.


PostPosted: Wed Feb 08, 2006 1:18 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
Separating data access for "PHA and friends" from that of other instructions would forfeit the 6502 stack-relative addressing methods presented by Blargg and Bruce. The hardware would have to be built up special anyway since the extra latches and so on needed to do the job don't exist on normal 6502 computers; so you might as well go with the 65816 which makes these things so much easier. The '816 doesn't have to be any harder to implement hardwarewise than the '02. I think the high address byte latching scares some people away; but remember you don't have to latch, decode, or use A16-A23 if you don't need more than 64K of address space. The '816 will still give you a ton of benefits over the '02; and if you're often dealing with 16-bit numbers, the '816 is much easier to program.

For multiple stacks for multitasking on the 6502, your software can save and restore a stack pointer for each task. Since there's only a 256-byte space to work with, you'd divide that up into equal segments, giving each task its own segment. No extra hardware is needed. From my experience, you could go with three or four tasks easily enough in Forth, and maybe six with care. A lot of assembly-language applications might be able to go quite a bit higher. Some people have thought the 6502's 256-byte stack space is extremely limited; but remember some microcontrollers only give you 6 (or even fewer!) stack levels, meaning that if you're that many subroutine levels deep and an interrupt hits, you overflow the stack. A lot of them don't let you put anything else on it either.
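As a small illustration of dividing the stack page into equal segments, the sketch below computes the initial stack pointer for each task. The function name `initial_stack_pointers` is hypothetical; the only assumption is that the 6502 stack grows downward from the top of each segment.

```python
def initial_stack_pointers(num_tasks: int) -> list[int]:
    """Initial SP value for each task when page $01 is split into equal parts."""
    if not 1 <= num_tasks <= 256:
        raise ValueError("num_tasks must be between 1 and 256")
    size = 256 // num_tasks
    # The stack grows downward, so each task starts at the top of its segment.
    return [0xFF - i * size for i in range(num_tasks)]


for task, sp in enumerate(initial_stack_pointers(4)):
    print(f"task {task}: SP = ${sp:02X}")
```

A task switch then amounts to saving the outgoing task's SP (TSX, then store X) and loading the incoming task's saved SP (load X, then TXS).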


PostPosted: Wed Feb 08, 2006 5:49 am

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
GARTHWILSON wrote:
but remember some microcontrollers only give you 6 (or even fewer!) stack levels, meaning that if you're that many subroutine levels deep and an interrupt hits, you overflow the stack. A lot of them don't let you put anything else on it either.


The ATmega series uses data RAM for the stack.

However, even so, you don't need multiple stacks to multitask! Remember that multitasking is a user-perception. (Footnote: Contrast this with multithreading, which is not a user perception -- it's an implementation technique to achieve multitasking.) Another approach to achieving multitasking is through event-driven programming, which often out-performs even cooperative multitasking because it does away completely with its task switching overhead.

See http://en.wikipedia.org/wiki/Event-driven_programming for a simple introduction to the concept if you're unfamiliar with it. For a more detailed explanation (only 59 pages) see http://eventdrivenpgm.sourceforge.net/ . To see event driven programming in action, see the uIP TCP/IP stack at http://www.sics.se/~adam/uip/ .
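For readers who have not seen the pattern, here is a minimal event loop in Python. It is a generic sketch, not the API of uIP or of any library linked above; `EventLoop`, `on`, `post` and `run` are invented names.

```python
from collections import deque


class EventLoop:
    """One queue, one stack: handlers run to completion, one after another."""

    def __init__(self) -> None:
        self.queue = deque()
        self.handlers = {}

    def on(self, name, handler) -> None:
        self.handlers[name] = handler

    def post(self, name, payload=None) -> None:
        self.queue.append((name, payload))

    def run(self) -> None:
        # No task switching: each event is dispatched and its handler returns.
        while self.queue:
            name, payload = self.queue.popleft()
            self.handlers[name](payload)


log = []
loop = EventLoop()
loop.on("key", lambda ch: log.append(f"key {ch}"))
loop.on("tick", lambda _: log.append("tick"))
loop.post("key", "A")
loop.post("tick")
loop.run()
print(log)
```

Because every handler returns before the next event is dispatched, there is no per-task stack to save or restore.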


PostPosted: Thu Feb 09, 2006 9:19 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
A related topic got started at Hardware --> "Homebuilt" 6502 cpu's at http://www.6502.org/forum/viewtopic.php?t=44 , daydreaming about new instructions to add to a 6502, and about actually making one's own processor.

Ruud has been working with a few others at making a 32-bit 6502. Recently their progress has been slow due to other demands of life, and the forum for that has been dormant; but I guess the project is still on. I'll apologize again to Ruud for getting his last name wrong in the above-mentioned page. I guess he had signed off with "Groetjes, Ruud" in a post and I took that to be a last-name-first notation. His last name is Baltissen. 'Groetjes' is Dutch for 'Regards'.


PostPosted: Sat Dec 30, 2006 8:52 pm

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1041
Location: near Heidelberg, Germany
Hi,

I have implemented a real MMU with my CS/A computer http://www.6502.org/users/andre/csa that can remap each 4k separately into a 1M address space. This way the zeropage as well as the stack are moved together "from under the CPU" and can be replaced for every task.
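The arithmetic behind a 4k-page remap can be sketched as follows. This is a generic Python model, not the actual CS/A register layout: it assumes sixteen 8-bit mapping registers, one per 4k slot of the 64k CPU space, each selecting one of 256 physical 4k pages in a 1M space. The name `mmu_translate` is hypothetical.

```python
PAGE_SHIFT = 12                        # 4k pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1    # low 12 bits pass through unchanged


def mmu_translate(addr: int, page_table: list[int]) -> int:
    """Translate a 16-bit CPU address into a 20-bit physical address."""
    slot = (addr >> PAGE_SHIFT) & 0xF            # which of the 16 CPU pages
    physical_page = page_table[slot] & 0xFF      # one of 256 physical pages
    return (physical_page << PAGE_SHIFT) | (addr & OFFSET_MASK)


# Identity map, then give this "task" its own zero page and stack.
table = list(range(16))
table[0] = 0x40
print(hex(mmu_translate(0x0010, table)))  # zero page
print(hex(mmu_translate(0x01FF, table)))  # stack, moved along with it
print(hex(mmu_translate(0x8000, table)))  # untouched slot
```

Since zero page and the stack both live in the lowest 4k slot, changing that single entry swaps them together.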

Quote:
For multiple stacks for multitasking on the 6502, your software can save and restore a stack pointer for each task. Since there's only a 256-byte space to work with, you'd divide that up into equal segments, giving each task its own segment. No extra hardware is needed.

The GeckOS operating system http://www.6502.org/users/andre/osa can use this MMU, but can also split the stack for multiple tasks in an MMU-less system (like the C64).

Quote:
Quote:
Or you could possibly halt the main processor and have the coprocessor (since we're getting into that kind of complexity) do the work of pulling in the data from disk or whatever.
The early Sun workstations (which used only 68000s) did this, because the 68000 lacked any facility to properly restart an instruction. It wasn't until the 68010 that this capability came about.


And, coming this January, there is a CS/A board (with a 6502) that halts the main 6502 CPU when a bus error occurs and takes over the bus to resolve the bus error.

Quote:
Yep. Note that multi-byte instructions (the 2-byte and 3-byte ones) only have _SYNC asserted on the first byte, the opcode; it's negated on the operand bytes that follow. Therefore, your MMU circuitry would need to properly track this.

The upcoming board uses this feature to "trace" the main 6502, in that it halts the CPU on each opcode fetch and records the opcode address.

André

