PostPosted: Sun May 28, 2023 5:15 am 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
sark02 wrote:
My prioritized wish list:...2. (ZP),X instead of (ZP,X).

As well as what BDD mentioned, (ZP,X) is also used for smallish stacks in the zero page which can be very handy when writing interpreters (and is heavily used by several FORTH interpreters, IIRC).
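For readers comparing the two modes, here is a minimal Python sketch of how the effective address is formed in each case. It is a toy model, not an emulator; the function names and the example addresses are made up for illustration.

```python
# Toy model of two 6502 indirect addressing modes (names are illustrative).

def word_at(mem, zp_addr):
    """Read a little-endian 16-bit pointer from zero page, wrapping within the page."""
    lo = mem[zp_addr & 0xFF]
    hi = mem[(zp_addr + 1) & 0xFF]
    return lo | (hi << 8)

def ea_indexed_indirect(mem, zp, x):
    """(ZP,X): add X to the zero-page operand first, then fetch the pointer."""
    return word_at(mem, (zp + x) & 0xFF)

def ea_indirect_indexed(mem, zp, y):
    """(ZP),Y: fetch the pointer first, then add Y to it."""
    return (word_at(mem, zp) + y) & 0xFFFF

mem = bytearray(65536)
mem[0x20], mem[0x21] = 0x00, 0x30   # pointer $3000 stored at $20
mem[0x24], mem[0x25] = 0x00, 0x40   # pointer $4000 stored at $24

addr_a = ea_indexed_indirect(mem, 0x20, 4)   # X selects the pointer at $24
addr_b = ea_indirect_indexed(mem, 0x20, 4)   # Y offsets the pointer at $20
```

With the same operand and index, (ZP,X) picks a different pointer out of a table or stack of pointers, while (ZP),Y walks forward from a single pointer.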

sark02 wrote:
3. Unconditional BRA.

Did you have in mind a branch instruction to remove from the %xxx10000 column of instructions? If not, this is going to add not insignificant complexity to the instruction decoding. At which point perhaps adding both BRA and BSR (the 6800 has both) is very little more expense.
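To illustrate why that column is full, here is a small Python sketch of how the eight conditional branches decode from the %xxx10000 pattern: the top two bits select a flag and the next bit gives the value to test. The helper names are assumptions for illustration; the opcode values are the documented 6502 ones.

```python
# Decoding the %xxx10000 branch column (helper names are illustrative).
FLAGS = ("N", "V", "C", "Z")          # selected by opcode bits 7-6
MNEMONICS = {0x10: "BPL", 0x30: "BMI", 0x50: "BVC", 0x70: "BVS",
             0x90: "BCC", 0xB0: "BCS", 0xD0: "BNE", 0xF0: "BEQ"}

def is_branch(opcode):
    """True for any opcode whose low five bits are %10000."""
    return (opcode & 0x1F) == 0x10

def decode_branch(opcode):
    """Return (flag name, value the flag must have for the branch to be taken)."""
    flag = FLAGS[opcode >> 6]
    wanted = (opcode >> 5) & 1
    return flag, wanted

def branch_taken(opcode, status):
    """status is a dict such as {"N": 0, "V": 0, "C": 1, "Z": 0}."""
    flag, wanted = decode_branch(opcode)
    return status[flag] == wanted

column = [op for op in range(256) if is_branch(op)]
```

All eight slots in `column` are already used, so an unconditional branch has to live elsewhere and needs its own decode path; the 65C02 puts BRA at $80, which does not match the pattern.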

wayfarer wrote:
I again advocate for a 3rd "Z" register that can be even further reduced from Y, as Y is to X, as X is to A. This would not be a full Z register, just something to help the robotic automation of Detroit auto factories.

I don't understand what that means, in particular the "reduced from" part. I don't see how Y is "reduced from X"; they seem the same thing to me. They inherently select different addressing modes, but that just saves the instruction bit that would otherwise be needed to let either register be used with either mode. Nor do I understand how either is "reduced from A"; A has a completely different purpose and a different set of things you can do with it. (Not more, just different; there are things you can do with X and Y that you can't do with A.)

wayfarer wrote:
Id also try like crazy to get some of the base, low level memory (zero page, stack) onto the die wherever possible.

That's been done with the 6800 (the 6802 added 128 bytes of on-board RAM), but it doesn't make any real difference except to system designers that can live with 128 bytes of RAM and thus can save the cost of having any external RAM. In particular, it doesn't change the speed of the processor at all.

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Sun May 28, 2023 5:43 am

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
cjs wrote:
sark02 wrote:
3. Unconditional BRA.

Did you have in mind a branch instruction to remove from the %xxx10000 column of instructions?

No, I meant a simple BRA just like what the 65C02 has.


PostPosted: Sun May 28, 2023 6:02 am

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
wayfarer wrote:
I am interested in the hub-bub about using zero page as 'registers'.
[...]
Id also try like crazy to get some of the base, low level memory (zero page, stack) onto the die wherever possible.

i did try that one. i wrote a 65C02 emulator in C, gave it an internal Zeropage and Stack (both can do 16-bit reads in a single cycle, the stack can also do 16-bit writes in a single cycle), and also gave it an adjustable instruction queue similar to the 8086/8088, and finally an adjustable data bus width.
viewtopic.php?f=8&t=7499

the CPU is internally using a harvard-like architecture with two sets of data and address buses, one for instruction fetching (which feeds the queue) and one for instruction execution (which feeds the ALU/Registers). and since there are 3 discrete types of memory (zeropage, stack, and external (through the CPU's pins)) the CPU can both fetch instructions and do data memory accesses at the same time, as long as they don't try to access the same type of memory.

so for example, if you have a program or function that only uses the zeropage or stack, and all of the instructions are located in external memory, then the CPU can fetch new instructions into the queue pretty much every cycle.

and i think that having a small instruction queue, even without internal RAM, would already help a bit with performance, as suddenly all dead cycles can now be used to pre-fetch instructions. you could even add more "dead" cycles to some complicated instructions to add more chances to pre-fetch more instructions and overall reduce the amount of work that would otherwise be done in a single cycle, which could increase the maximum operating frequency.
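As a rough illustration of the "dead cycle" idea, the Python sketch below counts how many cycles of a few standard 6502 instructions leave the external bus free for a prefetch queue. It is a back-of-the-envelope model under stated assumptions, not the behaviour of the emulator linked above: one bus slot per cycle, every instruction byte fetched externally, and queue depth ignored. The `Instr` tuple and function names are hypothetical.

```python
from collections import namedtuple

# name, length in bytes, cycle count, number of data accesses, where the data lives
Instr = namedtuple("Instr", "name length cycles data_accesses region")

PROGRAM = [
    Instr("LDA zp", 2, 3, 1, "zeropage"),
    Instr("INX",    1, 2, 0, None),
    Instr("PHA",    1, 3, 1, "stack"),
    Instr("INC zp", 2, 5, 2, "zeropage"),
]

def free_bus_cycles(instr, internal=()):
    """Cycles in which the external bus is needed neither for this instruction's
    own bytes nor for its data; data in an 'internal' region costs no bus slot."""
    external_data = 0 if instr.region in internal else instr.data_accesses
    return instr.cycles - instr.length - external_data

def total_free(program, internal=()):
    return sum(free_bus_cycles(i, internal) for i in program)

plain = total_free(PROGRAM)                                        # queue only
with_internal_ram = total_free(PROGRAM, internal=("zeropage", "stack"))
```

In this toy count the queue alone finds 3 spare slots in 13 cycles, and moving zero page and stack on-chip raises that to 7, which matches the intuition in the post that internal RAM and a queue help each other.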


PostPosted: Sun May 28, 2023 6:08 am

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
BigDumbDinosaur wrote:
Guess you haven’t done much device driver programming. :D

I haven't. I spent about 2 years in the early 80s as a teenager writing games on the Atari 800XL. My games had their technical elements, making competent use of the hardware, but the programming itself was unsophisticated and mostly array-based.

Nowadays I dabble from time to time, but not even to the extent I did back then.

But the question was what I would change, and through my own limited lens, (ZP,X) gets the chop.


PostPosted: Sun May 28, 2023 7:08 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8517
Location: Southern California
sark02 wrote:
But the question was what I would change, and through my own limited lens, (ZP,X) gets the chop.

As Curt said (and I see I also wrote on page 1 of this topic), the Forth programming language uses (ZP,X) constantly; and I understand Forth is not the only one.  The portion of ZP used for the data stack should be thought of as a stack, a kind of array, rather than a table, and X is the stack pointer, incremented and decremented by INX and DEX.  It might seem terrible to give up X for this; but when the programming is data-stack-oriented, there is a reduced need to use X for other things.  I think that when people don't see the usefulness of (ZP,X), it's because they're thinking of a table.  This is separate from the return stack (which uses the 6502's page-1 hardware stack), and it makes many operations much more practical than trying to pass parameters between routines on the hardware stack in page 1.  Even though it's called the "data stack," byte pairs on this stack can be anything, whether for example a constant that you put there, an address of a variable or routine or string, a pointer into an array, math inputs and outputs, flags, etc.

It works particularly well in ZP because of the extra addressing modes, especially that most of the accumulator-oriented instructions have a (ZP,X) addressing mode but not an (abs,X) addressing mode.  It's almost like the 6502 was designed for Forth (although I'm sure Chuck Peddle and Bill Mensch were not thinking of Forth when they designed the 6502).  You can for example calculate an address on the data stack and then read the contents of that address with LDA (ZP,X).  There is no corresponding LDA (abs,X), so obviously it works better to have the data stack in ZP.  As a plus, using this method, X seldom needs to be saved and used for anything other than the stack pointer; so you don't need to lament the loss of your X.  And yes, the stack takes precious ZP space; but it also reduces the need for so many variables to be in ZP.  A ZP data stack provides unexpected solutions, while avoiding, or even more than compensating for, the penalties it might initially appear to incur.

I go into these and lots more in the treatise on 6502 stacks, whose chapters are indexed at http://wilsonminesco.com/stacks/ .  It mentions Forth quite a few times, but it's really about doing this stuff in assembly language.  You don't have to use Forth to help yourself to these tools.

I would comment however, Curt, that Forth programs are nearly always compiled, not interpreted.  Of the several threading methods, the only one I would call "interpreted" is token-threaded code (TTC), which is the most memory-efficient but is rare because it has the poorest performance due to having to look up addresses of routines at run time.  Indirect threaded code (ITC) is probably the most common on the 65xx, followed by direct-threaded code (DTC).  What is compiled in these is mostly a list of addresses.  In the Forth section of this forum there are a couple of subroutine-threaded code (STC) 65xx Forth models being discussed.  STC is all compiled as machine-language instructions instead of addresses (let alone tokens).  It will have a lot of JSRs; but for short things where you want to avoid the JSR-RTS overhead, the machine code can be inlined.  In all these cases, the data stack will be in ZP and use (ZP,X).
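A very reduced Python sketch of the difference between a compiled list of addresses and a compiled list of tokens follows. Python function references stand in for 16-bit routine addresses, the three primitive words are invented for the example, and the model does not try to distinguish ITC from DTC.

```python
# Toy comparison of address threading and token threading.
stack = []

def lit5():
    stack.append(5)

def dup():
    stack.append(stack[-1])

def mul():
    stack.append(stack.pop() * stack.pop())

# Address threading: the compiled word is a list of routine "addresses".
SQUARE_THREAD = [lit5, dup, mul]

def run_threaded(thread):
    for routine in thread:
        routine()                    # jump straight to the routine

# Token threading: the compiled word is a list of one-byte tokens, and each
# token must be looked up in a table at run time.
TOKEN_TABLE = [lit5, dup, mul]
SQUARE_TOKENS = bytes([0, 1, 2])

def run_tokens(tokens):
    for token in tokens:
        TOKEN_TABLE[token]()         # extra lookup on every step

run_threaded(SQUARE_THREAD)
threaded_result = stack.pop()
run_tokens(SQUARE_TOKENS)
token_result = stack.pop()
```

The token version is smaller per compiled step (one byte instead of a two-byte address on a 65xx) but pays for a table lookup each time, which is the trade-off described above.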

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Sun May 28, 2023 2:27 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
A quick 1+ for what Garth said.

You won't lament the loss of X; you'll be rewarded by its utility in a role that's similar in some ways to that of a frame pointer! Passing parameters back and forth between routines is far easier and more natural :) compared with doing so via the hardware stack in page 1. :|

I'm tempted to reiterate further, but Garth's post already states the case very nicely. Worth reading (not just skimming)!

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


PostPosted: Sun May 28, 2023 4:01 pm

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
GARTHWILSON wrote:
sark02 wrote:
But the question was what I would change, and through my own limited lens, (ZP,X) gets the chop.

As Curt said (and I see I also wrote on page 1 of this topic), the Forth programming language uses (ZP,X) constantly; and I understand Forth is not the only one.

Yes, understood. As I mentioned in my post, I see MSBASIC using it, and I did read your page 1 post, as well as this one, as well as many other Forth-related posts in the past that mention it. None of those have ever been directly relevant to the code I've written.

I don't doubt that it's useful, but I'm making a personal choice to prefer (ZP),X, as that is what I wanted back when I wrote a lot of code. I respect that it wouldn't be your choice. But it's mine.

I know you're trying to convince me that my choice is in error, but until the day I use it personally, and go, "Ahh... gotcha."... my choice stands.

Hope that's ok. You go ahead and continue to enjoy (ZP,X) and I'll continue to wish it were (ZP),X.


PostPosted: Mon May 29, 2023 3:31 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8517
Location: Southern California
The problem is when people say, "I don't use instruction XYZ, so it should be taken out so nobody can use it."  It keeps happening.  (History is history though, and we're not going to change what Peddle and Mensch did; but it's interesting when enthusiasts explore and discuss doing a very 6502-like processor in programmable logic.)  There are instructions or addressing modes I have used little or not at all in my applications.  (ZP),Y is one I've used very little.  BRK is one I've never used since that first 6502 class in 1982 where we used AIM-65 computers in the lab class, and we were taught to use BRK to get back to the monitor at the end of our assigned routines.  I'm not sure I've ever used CLV, or JMP (abs,X) (in the 65c02), and there may be instruction/address-mode combinations I haven't used.  I won't suggest they should never have been there though.  Ed has a list of threads for improved 6502 and derived architectures here.  One that's not there (yet) is Seeking Feedback on Custom CPU Design Based on 6502.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Mon May 29, 2023 4:39 am

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
Garth, don't over-think it. The rules of the game were to suggest changes balanced by trade-offs. I didn't ask for the very useful PHY, PLY, STZ, and other useful additions from the 65C02 as the only thing I wanted to drop from the 1975 design was decimal mode. At no point did I say (ZP,X) isn't useful. Clearly it is. But it's something I would trade.

I'm a novice level programmer here, and those with vastly more 6502 experience agree with you that mine is a terrible idea. I'm comfortable with it.


PostPosted: Mon May 29, 2023 1:01 pm

Joined: Sun Mar 19, 2023 2:04 pm
Posts: 137
Location: about an hour outside of Springfield
I actually found a chip that does most of what I was thinking, it was released in 1988, so still 'in the era' I guess.

the 65CE02, from CSG, located here on the archives.
http://www.6502.org/documents/datasheets/mos/mos_65ce02_mpu.pdf
it had a Z register and a relocatable stack. Further, it had some improvements to its pre-fetch cycle, with the same nods towards Scalar pipelining I had mentioned elsewhere.

65CE02 seems to have a few things the 65816 had, in an 8-bit package. Super cool. A couple of people have or are working on 'cores' for the 65CE02, I will be studying them for any custom chip/core developments!!


PostPosted: Mon May 29, 2023 1:52 pm

Joined: Tue Oct 25, 2016 8:56 pm
Posts: 362
wayfarer wrote:
I actually found a chip that does most of what I was thinking, it was released in 1988, so still 'in the era' I guess.

the 65CE02, from CSG, located here on the archives.
http://www.6502.org/documents/datasheets/mos/mos_65ce02_mpu.pdf
it had a Z register and a relocatable stack. Further, it had some improvements to its pre-fetch cycle, with the same nods towards Scalar pipelining I had mentioned elsewhere.

65CE02 seems to have a few things the 65816 had, in an 8-bit package. Super cool. A couple of people have or are working on 'cores' for the 65CE02, I will be studying them for any custom chip/core developments!!



Indeed, I have a great fondness for the 65CE02. I think it addresses a lot of the 6502 and 65C02's shortcomings whilst fundamentally remaining true to the original, unlike the more drastic measures of the 65816 (which are mainly in service of the larger address space). And before Garth and BDD jump on me, yes I am fully aware you don't have to use the 65816's additional features if you don't want to: The point is I view the 65CE02 as being fundamentally a 6502 with the flaws fixed, whereas the 65816 is a different beast altogether. I do think it's a pity that the 65CE02 saw so little real-world usage, and that the chips are no longer available. A WDC version of the 65CE02 would be great.

_________________
Want to design a PCB for your project? I strongly recommend KiCad. It's free, it's multiplatform, and it's easy to learn!
Also, I maintain KiCad libraries of Retro Computing and Arduino components you might find useful.


PostPosted: Mon May 29, 2023 2:44 pm

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
Alarm Siren wrote:
Indeed, I have a great fondness for the 65CE02. I think it addresses a lot of the 6502 and 65C02's shortcomings whilst fundamentally remaining true to the original, unlike the more drastic measures of the 65816 (which are mainly in service of the larger address space).

I'm not so sure. Consider, for example, the lack of a BRA instruction in the original 6502. That wasn't a flaw, that was a feature: to keep the implementation as cheap as possible they determined that there would be only eight relative branch instructions, and BRA and BSR were the ones that they chose to drop.

I see the MOS 6500 series CPUs as a design intended to be somewhat compatible with the 6800 series as a secondary objective (you'll note that the bus protocol is the same, and the 6801 even had the same ability to tri-state the bus) and being much cheaper to implement as a primary objective. Pretty much every change from the 6800 appears to be to make it cheaper or faster. (The switch from big-endian to little-endian is an example of giving it a speed advantage.)

The primary design criterion of the 65C02 and later CPUs was compatibility with the (documented) design of the 6502. But if the original 6502 designers had had the transistor and space budget that the 65C02 designers had, would they have designed the same CPU? I think probably not.

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Mon May 29, 2023 4:49 pm

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
cjs wrote:
Consider, for example, the lack of a BRA instruction in the original 6502. That wasn't a flaw, that was a feature

I LOLd. No, just no. That's what a marketeer would say.

Quote:
: to keep the implementation as cheap as possible they determined that there would be only eight relative branch instructions, and BRA and BSR were the ones that they chose to drop.

It was a trade-off to hit a budget.

When comparing CPU specs, I doubt an engineer of the time would say, "I'm looking for a processor that doesn't have a BRA instruction. That's a feature I'm interested in." No... they'd say, "I'm looking for a processor that will let me hit my dollar target", and dropping features is what let them hit that target.


PostPosted: Mon May 29, 2023 5:02 pm

Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
wayfarer wrote:
I actually found a chip that does most of what I was thinking, it was released in 1988, so still 'in the era' I guess.
the 65CE02, from CSG, located here on the archives.

Oh that's a really nice find. The Z register hits exactly what I was looking for with my (terrible) (ZP),X idea, and BRA. And it has some really nice features that wayfarer mentions.

Coming in 1988, 13 years after the 6502, and in the era of the 68000-based machines, it doesn't look like it had much of a life.
From Wikipedia: The 65CE02 was used in the Commodore A2232 serial port card for the Amiga computer. That was it? What a waste.


PostPosted: Mon May 29, 2023 5:24 pm

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
wayfarer wrote:
A couple of people have or are working on 'cores' for the 65CE02, I will be studying them for any custom chip/core developments!!

nice find! i made one of those 65CE02/816 hybrid cores. though i haven't worked on it for a while.

Alarm Siren wrote:
Indeed, I have a great fondness for the 65CE02. I think it addresses a lot of the 6502 and 65C02's shortcomings whilst fundamentally remaining true to the original, unlike the more drastic measures of the 65816 (which are mainly in service of the larger address space). And before Garth and BDD jump on me, yes I am fully aware you don't have to use the 65816's additional features if you don't want to: The point is I view the 65CE02 as being fundamentally a 6502 with the flaws fixed, whereas the 65816 is a different beast altogether. I do think it's a pity that the 65CE02 saw so little real-world usage, and that the chips are no longer available. A WDC version of the 65CE02 would be great.

yeah, sometimes i feel like the modern day WDC65C02S should've been the WDC65CE02S instead. BSR/BRL, relocatable stack and zeropage, stack indirect addressing modes, and other useful features... it would've been amazing!
especially for Operating Systems like FUZIX or GeckOS where fully relocatable code and data structures are pretty useful.
but sadly that is not the timeline we live in... :(

though there are a few things i would change about the 65CE02:
for example i would remove INW, DEW, ASW, and ROW (Word Increment/Decrement/Shift Left/Rotate Left) and replace them with ICC and DCC (Increment/Decrement with Carry) with both Basepage and Absolute addressing modes. ICC and DCC are functionally identical to "LDA ADC #0 STA" and "LDA SBC #0 STA" but don't modify the accumulator. so you can chain them to increment/decrement multi-byte words in memory (also works with BCD).
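A minimal Python sketch of how such instructions might chain is shown below. ICC and DCC are the hypothetical instructions proposed in the post, so the semantics here are an assumption modelled on "ADC #0" and "SBC #0" in binary mode only; decimal mode and the N/Z flags are left out.

```python
# Hypothetical ICC/DCC semantics, modelled on ADC #0 / SBC #0 (binary mode only).

def icc(mem, addr, carry):
    """mem[addr] += carry; returns the carry out."""
    total = mem[addr] + carry
    mem[addr] = total & 0xFF
    return total >> 8

def dcc(mem, addr, carry):
    """mem[addr] -= (1 - carry), as SBC #0 would; returns the carry out (0 = borrow)."""
    total = mem[addr] - (1 - carry)
    mem[addr] = total & 0xFF
    return 0 if total < 0 else 1

def increment_word(mem, addr, size=2):
    carry = 1                         # SEC
    for i in range(size):             # ICC addr / ICC addr+1 / ...
        carry = icc(mem, addr + i, carry)
    return carry

def decrement_word(mem, addr, size=2):
    carry = 0                         # CLC, i.e. start with a borrow
    for i in range(size):             # DCC addr / DCC addr+1 / ...
        carry = dcc(mem, addr + i, carry)
    return carry

counter = bytearray([0xFF, 0x00, 0x00])   # 24-bit little-endian value $0000FF
increment_word(counter, 0, size=3)        # becomes $000100
after_increment = bytes(counter)
decrement_word(counter, 0, size=3)        # back to $0000FF
after_decrement = bytes(counter)
```

Because each step only passes the carry along, the same chain works for any width, and the accumulator is never touched.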

i would also change the way RTN # works. as described in the datasheet/other sites, RTN pulls # bytes off the stack (and discards them), and then reads the return address. with the idea being that you can easily get rid of a stack frame made during a function's execution.
but i would argue that having it the other way around is more useful. first pull the return address THEN pull # bytes off the stack (and discard them). with the idea being that you can remove function arguments that were pushed before the function was called using a single instruction.

i say it's more useful like that because while you can emulate either functionality manually using a macro, pulling bytes first and then returning is a lot easier to emulate than first getting the return address and then pulling bytes. so by having the instruction implement the more difficult/complex one, it means that if the other way is required it can be replicated much easier and with less code.
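The two orderings can be compared with a small Python model of the page-1 stack. This is a sketch under simplifying assumptions: the class and function names are invented, the return address is stored exactly as pushed (the usual RTS "+1" adjustment is ignored), and the "discard first" variant follows the behaviour as described in the post rather than a check against the datasheet.

```python
# Toy page-1 stack with two possible orderings for "RTN #n".

class Stack:
    def __init__(self):
        self.mem = bytearray(256)    # page 1
        self.sp = 0xFF

    def push(self, byte):
        self.mem[self.sp] = byte & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF
        return self.mem[self.sp]

    def push_return(self, addr):     # what JSR does: high byte, then low byte
        self.push(addr >> 8)
        self.push(addr)

    def pull_return(self):
        lo = self.pull()
        hi = self.pull()
        return lo | (hi << 8)

def rtn_discard_first(stack, n):
    """Drop n bytes, then pull the return address (cleans up callee locals)."""
    for _ in range(n):
        stack.pull()
    return stack.pull_return()

def rtn_return_first(stack, n):
    """Pull the return address, then drop n bytes (cleans up caller arguments)."""
    addr = stack.pull_return()
    for _ in range(n):
        stack.pull()
    return addr

# Callee pushed two bytes of locals after being called.
a = Stack()
a.push_return(0x1234)
a.push(0xAA)
a.push(0xBB)
ret_a = rtn_discard_first(a, 2)

# Caller pushed two bytes of arguments before calling.
b = Stack()
b.push(0x11)
b.push(0x22)
b.push_return(0x1234)
ret_b = rtn_return_first(b, 2)
```

Both cases end with the stack pointer back where it started; the difference is only which side of the return address the discarded bytes sit on.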

cjs wrote:
to keep the implementation as cheap as possible they determined that there would be only eight relative branch instructions, and BRA and BSR were the ones that they chose to drop.

honestly if i had the choice between JSR and BSR, i would probably drop JSR and make BSR use a 16-bit relative offset. it's functionally the same (in most scenarios) as JSR but allows for position independent code. same with JMP and BRA or rather BRL (branch always long), i would keep BRL and remove JMP. but i'd keep the indirect JMPs absolute, so you can still make use of function tables within a ROM regardless of where the code is located in RAM.

but of course i know that with the goal of staying as low cost as possible, having JMPs and JSRs always do a 16-bit addition with the relative offset and the PC would add quite a bit of complexity and cost. but a man can dream :D
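To show why a relative call is position independent, and where the 16-bit addition comes in, here is a short Python sketch. It assumes a 3-byte instruction whose offset is measured from the address of the following instruction; that base point is an assumption for illustration and real designs differ on it. All names are hypothetical.

```python
# 16-bit relative call target arithmetic (offset base is an assumption).

def encode_rel16(instr_addr, target, length=3):
    """Offset from the next instruction to the target, as 16-bit two's complement."""
    return (target - (instr_addr + length)) & 0xFFFF

def resolve_rel16(instr_addr, offset, length=3):
    """The 16-bit add the CPU would have to perform: next PC plus offset, wrapping."""
    return (instr_addr + length + offset) & 0xFFFF

def relocate(addr, delta):
    return (addr + delta) & 0xFFFF

call_site, routine = 0x8000, 0x8123
offset = encode_rel16(call_site, routine)

# Move the whole block of code by $2000: the encoded offset needs no fixup.
moved_target = resolve_rel16(relocate(call_site, 0x2000), offset)

# A backward call encodes as a large unsigned value that wraps around.
back_offset = encode_rel16(0x8100, 0x8000)
back_target = resolve_rel16(0x8100, back_offset)
```

An absolute JSR would need its operand patched after the move, whereas the relative form carries over unchanged; the cost is the full 16-bit add on every call.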

