
All times are UTC




PostPosted: Tue May 07, 2024 5:51 pm 
Joined: Sat Oct 28, 2023 7:57 pm
Posts: 20
Location: Missouri
Hey all, I was thinking about the programming model for the 6502, and had an (almost certainly overly-naive) idea for an approach to making a 16-bit variant with minimal changes to the actual function of the chip.

Basically, take the 65C02 and make the following changes:
  • Address Bus and Program Counter expand from 16->24 bits
  • The data bus, ALU, and all registers expand from 8->16 bits
  • Opcodes remain unchanged, with bits 9-16 being set to "0", with the following exceptions:
    • in ZP Address mode bits 9-16 form the ZP address
    • in Absolute Address mode, bits 9-16 form the LSB of the address, with the rest loaded by the next 16-bit word
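A minimal sketch of how a fetch/decode step might split such a word, in Python for illustration only. The `MODES` table and the `decode` helper are hypothetical names; the two opcodes are the standard 65C02 encodings for LDA zero page ($A5) and LDA absolute ($AD), and the bit numbering follows the post (bits 1-8 hold the opcode, bits 9-16 the address byte).

```python
# Hypothetical decoder for the proposed 16-bit instruction words.
# Only two addressing modes are modelled; everything else is "other".
MODES = {
    0xA5: "zp",   # LDA zero page (standard 65C02 opcode)
    0xAD: "abs",  # LDA absolute  (standard 65C02 opcode)
}

def decode(words, pc=0):
    """Return (opcode, address or None, next_pc) for the word at pc."""
    word = words[pc]
    opcode = word & 0xFF          # bits 1-8
    high = (word >> 8) & 0xFF     # bits 9-16
    mode = MODES.get(opcode, "other")
    if mode == "zp":
        # The whole zero-page address rides along with the opcode.
        return opcode, high, pc + 1
    if mode == "abs":
        # Low byte comes with the opcode, upper 16 bits from the next word.
        return opcode, (words[pc + 1] << 8) | high, pc + 2
    # Other opcodes keep bits 9-16 at zero in this proposal.
    return opcode, None, pc + 1

program = [0x12A5, 0x34AD, 0x0012]
print(decode(program, 0))  # LDA $12
print(decode(program, 1))  # LDA $001234
```

Under these assumptions a zero-page instruction occupies one memory word and an absolute instruction two, instead of two and three bytes on the 8-bit part.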

Perks:
  • I would think it would be relatively simple to implement on top of the design for a standard 6502
  • 64KB stack space, instead of 256 bytes.
  • Similar mental programming model to 6502
  • 16MB address space without circuitry hassle of the 65816
  • Many instructions should run with one fewer clock cycle, I believe, due to compressed memory lookup

I'm sure there would probably be issues with this approach I'm not aware of or thinking of, but I'd love to hear them to improve my understanding of the 6502 (and maybe make this concept something worth implementing someday when my FPGA skills are improved some)


PostPosted: Tue May 07, 2024 6:45 pm 
Joined: Mon Aug 30, 2021 11:52 am
Posts: 261
Location: South Africa
WCMiller wrote:
Perks:
  • 16MB address space without circuitry hassle of the 65816
I'm not sure how far you want to take this or how seriously you're thinking about implementing this in an FPGA but... here goes!

As you're using a 16bit data bus I'd say use the '816 as a base rather than the '02 but do away with the bank / data multiplexing. And then do away with the need for switching between 16bit and 8bit index / memory widths. 16bit opcodes have more than enough options to encode the width in the opcode. And the '816 has so many useful instructions that even the 65C02 doesn't. Specifically I want the movable Direct Page (previously Zero Page) that it provides. It is so, so very useful that I really struggle to go back to the '02.

More so, with the 16bit opcode availability why not let any register provide the Direct Page offset? And as I'm just throwing out wild ideas how about letting any register be used as stack pointers (for programmatic pushes and pops)?

And then why only allow arithmetic on the accumulator? And whilst we're at it I'd really like two's complement arithmetic (with a few caveats).

But I think you see where this is going. Because I have no real restrictions other than "Wheeee! this would be cool!" it's very easy for me to suggest things that are well outside of what you are thinking of or would want to do.

So to bring it back to earth. A fully 16bit address bus and ALU does already sound very cool. Possibly my only suggestion would be to extend the internal registers to 32bits and treat the entire 24bits of address space as a single non-segmented flat memory. That honestly just makes programming much easier.


PostPosted: Tue May 07, 2024 7:01 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Just for ref, we've visited ideas like this many times in the past, and while there's always an interesting new idea, it's worth looking over what's happened in the past, in my view:
Index of threads for improved 6502 and derived architectures

That's not a definitive index, of course - it's quite old now, for one thing.


PostPosted: Tue May 07, 2024 7:06 pm 
Joined: Tue Sep 03, 2002 12:58 pm
Posts: 298
This sounds like a similar starting point to my 65020 design, but heading in a different direction with different goals. Mine was a reaction to the 65816, trying to imagine how the 6502 could have made the leap out of the 8 bit era without turning so ugly.

With registers extending to 16 bits, you either accept that it's not going to be compatible, or you need some way of requesting 8 bit operations. The 65816's solution was mode bits, mine was bits in the top half of the opcode.

With zero 'page' being 64K, there's no need for a direct page register. Code does indeed end up a little bit smaller (if you're counting memory locations rather than bits) and faster.

I went a lot further than you're planning, with more (and wider) registers, and operations between registers without going to memory. The result no longer feels like a 6502, but it gives me the same kind of joy that the 6502 did.

There will be a lot of details that you'll need to work out, but I think you'll find it worth the effort.


PostPosted: Tue May 07, 2024 7:28 pm 
Joined: Sat Oct 28, 2023 7:57 pm
Posts: 20
Location: Missouri
@AndrewP Right now, this is more in the "hey, this might be fun" stage than any actual plans (I'll need to do things like learn programming FPGAs, for one!), but eventually I think it'd be fun to make this an actual thing (or at least an emulated thing). I actually had some thoughts for an 816 variant of this idea (with a 32 bit ALU/Registers/Address Bus), as well as a balls-to-the-wall 6502-derivative with all sorts of things added. With this one, however, I'm trying to keep the changes as minimal as possible and as similar to the 6502 as I can, conceptually. I do like the idea of adding a register that provides for a movable direct page; I'll have to keep that in mind.

@BigEd I'd actually seen that thread a while back but forgot about it. It'll definitely make for good reading!

@John West I figured one easy way I could make things compatible is to set it to ignore the upper 8 bits of each word (with some complications for things like the carry bits). As I said, the opcodes would be identical to a regular 6502 (except if I add any, which would also be 8-bits long). The zero page would still be 256 bytes, so a direct page register might be nice.
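The carry complication can be made concrete with a small sketch. It is in Python purely for illustration, models binary-mode addition only (no decimal mode), and `adc` is a hypothetical helper, not part of any existing emulator.

```python
def adc(a, operand, carry_in, width):
    """Add with carry at the given width; return (result, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (operand & mask) + carry_in
    return total & mask, int(total > mask)

# Same inputs, different widths: the carry has to come from bit 8
# in compatibility mode and from bit 16 in native mode.
print(adc(0x00FF, 0x0001, 0, 16))  # (256, 0): no carry, result is $0100
print(adc(0x00FF, 0x0001, 0, 8))   # (0, 1): carry set, result is $00
```

So simply ignoring the upper 8 bits of the result is not enough on its own; the carry (and likewise the N, V and Z flags) would have to be taken from the 8-bit boundary when running old code.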

Just to be clear, as I fear it wasn't, the least significant 8 bits of an address are loaded along with the opcode, so, for example
Code:
.org $800000
stx $123456 ; store-x in absolute address mode

would be assembled to (I believe)
Code:
$800000 $8E $56
$800001 $34 $12

and
Code:
.org $800000
stx $12; store-x in zero-page address mode

would be assembled to
Code:
$800000 $86 $12


That's why it only has a 24-bit address bus; 16 from an argument and 8 that hitch along after the opcode.
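The packing described above can be sketched as a pair of encoder functions. This is Python for illustration only; `encode_absolute`, `encode_zero_page` and `listing` are hypothetical helper names, and the listing prints the low byte of each word first to match the hand-assembled examples.

```python
def encode_absolute(opcode, address):
    """Pack a 24-bit absolute address: low byte rides with the opcode."""
    low = address & 0xFF
    rest = (address >> 8) & 0xFFFF
    return [(low << 8) | opcode, rest]

def encode_zero_page(opcode, zp_address):
    """Pack an 8-bit zero-page address into the same word as the opcode."""
    return [((zp_address & 0xFF) << 8) | opcode]

def listing(origin, words):
    """Format words as 'address low-byte high-byte', one word per line."""
    return [f"${origin + i:06X} ${w & 0xFF:02X} ${w >> 8:02X}"
            for i, w in enumerate(words)]

# STX absolute is $8E and STX zero page is $86 on the 6502.
print("\n".join(listing(0x800000, encode_absolute(0x8E, 0x123456))))
print("\n".join(listing(0x800000, encode_zero_page(0x86, 0x12))))
```

Run as written, this reproduces the two listings in the post, which is a quick check that the 8 + 16 split really does cover a 24-bit address.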


PostPosted: Wed May 08, 2024 9:00 am 
Joined: Mon Aug 30, 2021 11:52 am
Posts: 261
Location: South Africa
WCMiller wrote:
That's why it only has a 24-bit address bus; 16 from an argument and 8 that hitch along after the opcode.
Sorry, I missed that. Kinda. I think I saw Zero Page address included in (16bit) op-code and thought: "Trying to add the Direct Page register to the zero page address before the cycle ends so it can be used as an actual address on the address bus is way too complicated for me". And stopped thinking. Which was wrong because I was still assuming a movable Direct Page and never read the bit where absolute addressing would also have 8bits of the address included in the op-code.

That makes for much more efficient memory access than I had realised. i.e. a direct Zero Page 16bit ADC could be done in two cycles compared to the 4 or 5 it takes the '816. Nice!
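A rough bus-access count behind those numbers, sketched in Python. The '816 figures follow the documented timing for ADC direct page (3 cycles, plus 1 for a 16-bit accumulator, plus 1 when the low byte of D is non-zero). The figure for the proposed chip is an assumption: one word fetch, one data read, and the add overlapping the next fetch as it does on the 6502. Both function names are hypothetical.

```python
def cycles_65816_adc_dp(acc_bits=16, d_low_byte_zero=True):
    """Cycle count for ADC dp on the 65816 over its 8-bit data bus."""
    cycles = 1                  # opcode fetch
    cycles += 1                 # direct page offset fetch
    cycles += acc_bits // 8     # one data read per byte of the operand
    if not d_low_byte_zero:
        cycles += 1             # extra cycle to add a non-aligned D
    return cycles

def cycles_proposed_adc_zp():
    """Assumed count for the proposed chip with a 16-bit data bus."""
    fetch = 1                   # opcode and zero-page address in one word
    read = 1                    # 16-bit operand in one bus access
    return fetch + read

print(cycles_65816_adc_dp())                       # 4
print(cycles_65816_adc_dp(d_low_byte_zero=False))  # 5
print(cycles_proposed_adc_zp())                    # 2
```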

Like John, I'm still wondering how to deal with 8bit data / operations without resorting to instructions to set the memory width state. But with that said (on the '816) I rarely change to 8bit memory unless I'm dealing with data that has to be processed 8bits at a time. Think pixels or text. And I basically never switch the Index registers to 8bit so possibly that mode could just be entirely ignored.

I do want to harp on about why a movable Direct Page is so useful for a bit more.

I think it's fairly typical to view Zero Page as processor registers that just happen to live outside the processor. And that means, like any processor, when doing a function call the state of those registers may need to be saved. On the 6502 with its fairly small call depths I think that's generally done by giving each function its own section of Zero Page and assuming no recursion and that two functions won't stomp on each other's state.

But as programs get bigger, think a full 24bits of memory, it's going to become harder and harder to ensure each function plays nicely only in its own tiny piece of Zero Page.

An obvious solution is to save the Zero Page 'registers' onto the Stack; and then restore them when the function completes. But that's a lot of pushing and popping.

A far nicer solution is to have a movable Direct Page and slide it down memory as functions are called. This implicitly saves the state of the calling function because its Direct Page address is no longer accessible because the Direct Page has been moved on. And implicitly restores the state of the calling function when its Direct Page is put back where it was when the called function returns. As a bit of an '816 aside: because the '816 has limited Stack addressing I'll push function arguments onto the stack and then set Direct Page to point to the Stack and address those parameters from the Direct Page instead. Very useful and it allows the stack to continue to be used as a stack.
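The sliding Direct Page idea can be illustrated with a toy model. This is a Python sketch only: the `Machine` class, the starting value of D and the 16-byte frame size are all made up for the example, and direct page addresses are assumed to wrap within a 64K bank as on the '816.

```python
class Machine:
    """Toy memory plus a movable Direct Page register D."""
    def __init__(self, d=0x0200):
        self.mem = bytearray(0x10000)
        self.d = d

    def dp_addr(self, offset):
        # Effective address of a direct page operand: D + offset.
        return (self.d + offset) & 0xFFFF

    def store(self, offset, value):
        self.mem[self.dp_addr(offset)] = value & 0xFF

    def load(self, offset):
        return self.mem[self.dp_addr(offset)]

FRAME_SIZE = 0x10  # hypothetical per-function frame

def callee(m):
    saved_d = m.d
    m.d = (m.d - FRAME_SIZE) & 0xFFFF  # slide the page down on entry
    m.store(0x00, 0x99)                # callee's own "register" 0
    m.d = saved_d                      # put the page back on return

m = Machine()
m.store(0x00, 0x42)   # caller's "register" 0
callee(m)
print(hex(m.load(0x00)))  # 0x42: the caller's value was never touched
```

Both functions use direct page offset 0, yet neither saves nor restores any data; only D moves.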


PostPosted: Wed May 08, 2024 9:40 am 
Joined: Mon Jan 19, 2004 12:49 pm
Posts: 683
Location: Potsdam, DE
Does this let you use a stack frame? So you would define variables on your (much larger) stack so that input variables remain at a fixed offset above the current stack pointer, and local variables appear below the stack pointer?

With a 6502 it's messy: transfer the stack pointer to X, and then offset from 102,x 104,x etc and then some arithmetic to sort out any variables you've eaten when you return - it really needs the calling routine to tidy up the stack and it's slow... I suppose it could work the same way but with a 16 bit X register?

8086 has a base pointer which you set on entry to your subroutine, and can also return and eat n stack entries in one instruction... that's very useful.

Neil


PostPosted: Wed May 08, 2024 9:45 am 
Joined: Tue Sep 03, 2002 12:58 pm
Posts: 298
I missed that too. So zero page will only be 256 bytes, and that does make the ability to move it useful. But packing the address into the opcode will make a big difference to the speed.

It's an interesting idea, and I think one worth pursuing. My recommendation is to start with a software simulator, particularly if you've never worked with FPGAs before. It's a lot easier to experiment with changes to the instruction set there.


PostPosted: Wed May 08, 2024 11:23 am 
Joined: Mon Aug 30, 2021 11:52 am
Posts: 261
Location: South Africa
barnacle wrote:
Does this let you use a stack frame? So you would define variables on your (much larger) stack so that input variables remain at a fixed offset above the current stack pointer, and local variables appear below the stack pointer?
Yup, that's exactly using the Direct Page as a stack frame. Which (if my memory works - but it's been a while) is very similar to how the 16bit x86 BP register was used.

I guess it's convention but I would set up everything to appear above the stack pointer in memory (i.e. inside the stack). So my Direct Page calculation would be:

on entering a function from a JSR / JSL
transfer the Stack Pointer address to A
subtract the amount of memory I need for local variables
transfer A back to the Stack Pointer

and then

push the Direct Page (so I know what it was previously)
transfer A to the Direct Page

The assembly would look something like this:
Code:
FunctionThatDoesStuff:
;preamble
   TSC
   SEC
   SBC   #_Local_Variable_Space_needed
   TCS
   PHD
   TCD


Having all local variables and arguments inside the stack means the stack can still be used, specifically for registers that can only be accessed via the stack: PHB, PLB, PHP, PLP and PHK.
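The preamble above can be traced step by step with a small register model. This is a Python sketch, not an emulator: `preamble` is a hypothetical helper, the starting values are arbitrary, and it assumes native mode with a 16-bit accumulator and the usual '816 convention that S points at the next free byte.

```python
def preamble(s, d, local_space):
    """Trace TSC / SEC / SBC #n / TCS / PHD / TCD; return (s, d, mem)."""
    mem = {}
    a = s                              # TSC: copy stack pointer to A
    a = (a - local_space) & 0xFFFF     # SEC, SBC #n: reserve local space
    s = a                              # TCS: stack now sits below the locals
    mem[s] = d >> 8                    # PHD pushes the high byte of D...
    s = (s - 1) & 0xFFFF
    mem[s] = d & 0xFF                  # ...then the low byte
    s = (s - 1) & 0xFFFF
    d = a                              # TCD: Direct Page = base of the frame
    return s, d, mem

s, d, mem = preamble(s=0x01F0, d=0x1234, local_space=8)
print(hex(s), hex(d))  # 0x1e6 0x1e8
# Direct page offsets 1..8 are the locals ($01E9-$01F0); offset 0 holds
# the high byte of the saved D, and the return address starts at offset 9.
```

Because S ends up below everything the function owns, further pushes, calls and interrupts cannot overwrite the locals.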

For completeness cleaning up and returning from the function is a bit more complicated because the return address for RTS / RTL needs to be the last thing popped off the stack (but unfortunately it sits after any arguments that were pushed)
Assuming a long return:

load A with return address (low and high bytes)
store A over the (first+1) bytes of argument
load A with return address (high and bank bytes)
store A over the first bytes of argument

then

pull the Direct Page
transfer the Stack Pointer address to A
add the amount of memory needed for local variables less 3 bytes
transfer A back to the Stack Pointer
return long

Again the assembly would look like
Code:
;postamble
   LDA   <_Local_Variable_Space_needed+2            ;RTL hi, RTL lo
   STA   <_Arguments-1
   LDA   <_Local_Variable_Space_needed+1            ;RTL b, rtl hi
   STA   <_Arguments-2
   PLD
   TSC
   CLC
   ADC   #_Arguments-3
   TCS
   RTL


PostPosted: Wed May 08, 2024 1:00 pm 
Joined: Mon Jan 19, 2004 12:49 pm
Posts: 683
Location: Potsdam, DE
Hmm, I guess your calling routine must leave space on the stack for a return value before pushing the input parameters. Then it makes the call, pushing the return address. Any local variables are held on the stack below the return value, and on return, the return value is placed in the reserved place; the return address can either be moved to the immediate entry below the return value, or the calling routine can adjust the stack to jump up over the input parameters.

Keeping everything local below the return means the same approach can be used for any calls from the routine, and means that interrupts also work properly - that just looks like a normal stack.

Perhaps the call routine could automatically load a base pointer register? Or a TSPBP instruction.

Neil


PostPosted: Thu May 09, 2024 12:30 am 
Joined: Sat Oct 28, 2023 7:57 pm
Posts: 20
Location: Missouri
@AndrewP, @John West, thanks for the... I guess validation of the "interesting-ness" of the opcode+memory mixing idea. It was the only part of this that I thought had the potential to be "clever" and worth investigating, and I'm encouraged that you two agreed! I'm really intrigued by the way the conversation is developing, mainly because I'm having a bit of trouble following all of its nuances, and one thing I wanted to do was push my understanding to help find areas of ignorance. I don't know how productive my contributions will be, though. It's interesting reading all the same, trying to suss out what everything means and why it's being discussed.


PostPosted: Thu May 09, 2024 3:05 am 
Joined: Wed Jan 03, 2007 3:53 pm
Posts: 55
Location: Sunny So Cal
Quote:
Yup, that's exactly using the Direct Page as a stack frame.


You might want to look at the TMS 9900's Workspace Pointer for a similar notion, which can be anywhere in RAM.
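For comparison, a sketch of how the Workspace Pointer maps registers onto RAM. On the TMS 9900 the sixteen registers R0-R15 are consecutive 16-bit words starting at the address in WP, so register n lives at WP + 2n. The Python below is illustrative only: `reg_addr` is a hypothetical helper and the two workspace addresses are arbitrary.

```python
def reg_addr(wp, n):
    """Memory address of workspace register Rn for a given WP."""
    if not 0 <= n <= 15:
        raise ValueError("the TMS 9900 has registers R0-R15")
    return (wp + 2 * n) & 0xFFFF

memory = {}                      # word-addressed toy memory

caller_wp = 0x8300               # arbitrary example workspace
callee_wp = 0x8320               # a second, non-overlapping workspace

memory[reg_addr(caller_wp, 1)] = 0x1111   # caller's R1
memory[reg_addr(callee_wp, 1)] = 0x2222   # callee's R1, different RAM

print(hex(reg_addr(caller_wp, 15)))       # 0x831e
print(hex(memory[reg_addr(caller_wp, 1)]))  # 0x1111, untouched by the callee
```

Changing WP swaps the whole register file in one step, which is the same effect the movable Direct Page gives the zero-page "registers".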

_________________
Machine room: http://www.floodgap.com/etc/machines.html

