Welcome, ggtgp.
ggtgp wrote:
Over on the net news group comp.arch I started a now extensive 65832 thread and wondered if the hardware guys here would be interested in contributing. (I am a software guy.)
https://groups.google.com/forum/#!forum/comp.arch
That same head post was posted somewhere else too, right? I can't remember where, but I'm sure I've read it.
Since I'm not registered to post there, I'll respond here to things there in the order I find them. You can direct them to this post if you like. (It's perhaps too relaxing to sit here late at night and do this, half-asleep.)
Most things here are ones that 6502.org old-timers have heard from me before.
The 65Org32, however, has all registers 32 bits wide, including the direct-page register, so direct-page addressing covers the entire 4GW memory span, with the register serving as an offset. The exception might be the status register, as making use of 32 bits of status might take more imagination than I have.
I have no real desire for floating-point hardware myself, having found through experience that nearly everything can be done in fixed-point or scaled-integer more efficiently. The "how" is discussed at the beginning of my web page on "Large look-up tables for hyperfast, accurate 16-Bit scaled-integer math, including trig & log functions." I realize there are a few legitimate applications for floating-point, and I won't hold anything against those who need it, but my own experience says more and more that it is seldom necessary.
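For readers who haven't worked with scaled integers, here is a minimal sketch in C of one common convention, an unsigned 16-bit fraction where 0x10000 would represent 1.0. The function name frac16_mul is made up for illustration, and this is not the table-based method from the web page; it only shows the basic idea of keeping the scale factor in your head instead of in an exponent field.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two unsigned 0.16 fixed-point fractions.
   0x0000 represents 0.0, 0x8000 represents 0.5, 0xFFFF is just under 1.0.
   The 32-bit intermediate holds the full product; shifting right by 16
   brings the result back to the same 0.16 scale (truncating, not rounding). */
uint16_t frac16_mul(uint16_t a, uint16_t b)
{
    uint32_t product = (uint32_t)a * (uint32_t)b;
    return (uint16_t)(product >> 16);
}
```

The same routine scales an ordinary integer by a fraction: frac16_mul(1000, 0x8000) takes half of 1000. Nothing here needs more than an integer multiply and a shift.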
The matter of the two stacks mentioned at your link is not a problem at all for the 6502. The return stack is the normal hardware stack in page 1, and the data stack goes in page 0, with X as the pointer, taking advantage of the added ZP addressing modes. It's almost like the 6502 was made for it, except that with 16-bit cells, the performance is not as good with an 8-bit processor as it is with a 16-bit one. The 65816 mostly qualifies, although the data bus is still 8-bit. My '816 Forth runs two to three times as fast as my '02 Forth at a given clock speed. The '816's stack-relative addressing is nice for many operations, not just in Forth. I can see a use for two hardware stacks, but for a different reason. Actually, that would get Forth's DTC NEXT down to 6 cycles.
I'm glad to see Mike is on there, discussing his 65m32. He did write however:
Quote:
When the 6502 needed to operate on 16-bit data and addresses, it had to use about twice as many instructions as the 8-bit versions of the same.
I'd say it's considerably worse than twice. See my example at viewtopic.php?f=9&t=1505&p=9705#p9705 . [Edit: I see he mentioned me!]
Anton wrote:
Quote:
Before me I have the code for fig-Forth's NEXT (the interpreter dispatch loop, performed once for each virtual machine instruction). On the 6502 it's 12 instructions and 39 cycles and on the 6809 it's 2 instruction and 14 cycles. Sure, the 6502 has a lower CPI for this sequence (3.25 vs. 7), but overall it's slower by a factor >2.7, so yes, it was significantly slower clock for clock (and clock rates were similar, so it was also slower in seconds).
The 6809 did have a nice way to do NEXT. The 65816 is also much better than the 6502 in that regard. Note however that the 6502 (and the '816 too) has achieved clock rates that are astronomical compared to the 6809's.
Regarding the number of registers: As I'm reading there, 8 is getting talked about a lot. The 65816 already has 9: C (splittable into A and B), X, Y, S, P, DP, PB, DB, and PC, although they're not general-purpose. More general-purpose ones help with compilers, but don't seem to be particularly useful for assembly. As BigEd here observed, "With 6502, I suspect more than one beginner has wondered why they can't do arithmetic or logic operations on X or Y, or struggled to remember which addressing modes use which of the two. And then the intermediate 6502 programmer will be loading and saving X and Y while the expert always seems to have the right values already in place." I do use Forth a lot in my work, and the 6502/816 do it well; but I have also brought the level of my assembly language way, way up, using program-structure macros to incorporate things like IF...ELSE_...END_IF, BEGIN...WHILE...REPEAT, FOR...NEXT, CASE, etc., in assembly, which dramatically improves programmer productivity, while retaining the performance advantage of assembly.
As for using registers to pass arguments, if I ever get my 6502 treatise on stacks finished, it will have a large body of code for passing inputs and outputs on a ZP data stack, without particularly using Forth (although if you want to write a STC Forth with it, much of the work will already be done for you). The data-stack method gets rid of the conflicts with what registers are being used by what and accidentally overwriting something that is still needed.
George wrote:
Quote:
Using DP pseudo-address registers on the 65816 was a problem in itself because there was no 24-bit indirect addressing mode: you had to manually set the data bank register to the high byte of the address [which could be done only through the stack] before doing a 16-bit indirect operation
There were, but the pointer was in DP, like LDA[DP] (op code A7) and LDA[DP],Y (op code B7). Admittedly, there was not every combination and permutation of indirects and indexings though. If you're not doing it often enough for it to hurt performance significantly, make a macro to hide the recurring internal details.
and:
Quote:
The 6502 had only the 256 bytes in bank 1. That was too limiting for most code, so many programs were forced to maintain a software managed stack using a pseudo-register in the zero page.
From the program tips page of my 6502 primer: A common criticism of the 6502 is that the stack space is so limiting. A few higher-level languages (notoriously Pascal) do put very large pieces of data and entire functions and procedures on the stack instead of just their addresses. For most programming though, the 6502's stack is much roomier than you'll ever need. When you know you're accessing the stacks constantly but don't know what the maximum depth is you're using, the tendency is to go overboard and keep upping your estimation, "just to be sure." I did this for years myself, and finally decided to do some tests to find out. I filled the 6502 stack area with a constant value (maybe it was 00-- I don't remember), ran a heavy-ish application with all the interrupts going too, did compiling, assembling, and interpreting while running other things in the background on interrupts, and after a while looked to see how much of the stack area had been written on. It wasn't really much-- less than 20% of each of page 1 (return stack) and page 0 (data stack). This was in Forth, which makes heavy use of the stacks. The IRQ interrupt handlers were in Forth too, although the software RTC (run off a timer on NMI) was in assembly language.
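The fill-and-inspect test described above is easy to reproduce. Here is a rough sketch in C that models one 256-byte stack page as an array; the names STACK_SIZE, FILL_BYTE, stack_fill and stack_bytes_used are invented for this illustration, and the fill value is an assumption (the original test may or may not have used 00). It assumes a stack that grows downward from the top of the page, as the 6502 hardware stack does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256   /* one 6502 page */
#define FILL_BYTE  0x00  /* assumed sentinel; any constant works */

/* Paint the whole stack page with the sentinel before running the workload. */
void stack_fill(uint8_t *stack)
{
    assert(stack != NULL);
    memset(stack, FILL_BYTE, STACK_SIZE);
}

/* After the workload has run, scan upward from the lowest address.
   The first byte that no longer holds the sentinel marks the deepest
   point the stack reached; everything from there to the top counts as used. */
size_t stack_bytes_used(const uint8_t *stack)
{
    assert(stack != NULL);
    size_t i = 0;
    while (i < STACK_SIZE && stack[i] == FILL_BYTE) {
        i++;
    }
    return STACK_SIZE - i;
}
```

One caveat with this kind of test: if the deepest byte written happens to equal the fill value, the result comes out a byte or so low, so it's a good estimate rather than an exact measurement.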
Quote:
Does 65816 code pay lots of TAX and TAY?
If you mean, "Does 65816 code tend to use the TAX and TAY instructions a lot?" I guess I would say no; but the only real project I've done on the '816 is my '816 Forth. There does need to be a way to do the transfer though.
Brett's advice to get rid of the "garbage op codes" should probably have some explanation as to why, for example that they take too many resources in programmable logic or whatever. I find the ones he has referred to to be very useful. Perhaps he has a different way of doing it in mind, which is quite possible. I do a lot of work with PIC16's though, and even with their incredibly limited instruction sets and pipelining, they still allow incrementing or decrementing memory in a single 4-clock instruction, including for indirects (through INDF).
From John Stavard:
Quote:
Nostalgia?
It isn't a 6502 if it doesn't run every piece of 6502 software ever written. Upwards compatibility is essential if the intent is for the chip to provide powerful capabilities, but also work in an old Apple IIgs or even an old Commodore-64, just adding extra abilities to those machines.
I think most of us have no interest in that. You can't drop it into a 6502 socket if there are 32 data lines and at least 24 (non-multiplexed) address lines; and even if you could, the old hardware won't run at dozens (or hundreds) of MHz. For myself, if a 32-bit '816 becomes available, I'll make a new computer and run all new code on it. As Mike says, the "look and feel" are important, and right there lies the experience investment I want to protect. I can still run the old code on old computers.
and:
Quote:
So basically I agree with you that the Direct Page is the future of computing!
However, I went and looked up the 65816. I see one big problem.
It uses ***ALL 256 OPCODES***.
And so there seems to be no strictly compatible way to add an instruction to switch to the new 32-bit mode. And, also, none of the new 32-bit features will be accessible from 16-bit mode, so much unlike the case where 8-bit Emulation Mode allows access to many of the new features of 16-bit mode!!
No doubt, though, the people thinking of such a project have already figured out a clever way around this problem, and so they don't need me to suggest one.
The 65Org32 has no 16-bit mode or 8-bit mode. A byte is 32 bits. You can still interface it to 8-bit peripherals, just as I also interfaced a 4-bit real-time clock to a 6502 in my earliest commercial design.
From Donbo:
Quote:
Frankly, despite fond memories a have of the 6502, I don't see the point of having a 32-bit version of it, regardless of how compatible it would be with the original 6502. The 6502 was great for its time but times have changed.
The main thing for me is handling larger data sizes. Even in the matter of a 32-bit loop counter for a looping structure, the 6502 takes something like 30 instructions (plus NEXT), whereas a 32-bit one would be as simple as DEX, BNE. I have code for the 32-bit DO LOOP and associated Forth words at http://wilsonminesco.com/Forth/32DOLOOP.FTH , and a code example showing the big difference in code length between 6502 and 65816 for handling 16-bit quantities at viewtopic.php?f=9&t=1505&p=9705#p9705.
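To give a feel for where the extra instructions come from, here is a small C model of decrementing a 32-bit counter one byte at a time, the way an 8-bit processor has to. It is only a sketch: dec32_bytewise is a made-up name, the little-endian byte order is an assumption, and it is not a translation of the Forth code at the link above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decrement a 32-bit counter stored as four little-endian bytes,
   propagating the borrow by hand. Returns true while the counter is
   still nonzero after the decrement, i.e. while the loop should continue. */
bool dec32_bytewise(uint8_t c[4])
{
    for (int i = 0; i < 4; i++) {
        /* A byte that was nonzero absorbs the decrement and ends the chain.
           A byte that was zero wraps to 0xFF and borrows from the next one. */
        if (c[i]-- != 0) {
            break;
        }
    }
    /* The zero test also has to look at all four bytes. */
    return (c[0] | c[1] | c[2] | c[3]) != 0;
}
```

On a processor with 32-bit registers, all of the above collapses into a single decrement that sets the zero flag, followed by a branch.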
Quote:
Of course, that's available now, since there won't _be_ any future versions of the 6502, except for amateur efforts.
Probably true, no commercial ones; but the 8-bit 6502 is being produced in absolutely huge quantities today-- hundreds of millions of units per year. You probably have quite a few you didn't know about, under the hood of your car, behind the dashboard, in personal electronics, appliances, etc.
Quote:
Is there a reason to stop at 32 bits and not go al the way to 64 bits?
A big interest of mine is Forth, where the 16 bits I've been using for years is usually enough but sometimes not quite. 32 will definitely be enough. I'm using it for controlling stuff on the workbench, with little human I/O. If you do a fast Fourier transform (FFT) for example with 16-bit cells, you're limited to about 6 bits of input data precision on a 2,048-point complex FFT before you start having overflow problems. (An FFT lets you for example analyze noise and vibration, whether it's acoustical, machinery, etc.) The sizes of D/A and A/D converters and other things come into the picture, and 32 is definitely more than enough for the registers, for common uses. 4GW (16GB) of memory is even enough to hold a feature-length movie, something we won't be doing with this class of processor. The double-precision intermediate results however could be 64-bit, as when you take a 32-bit number and multiply it by another 32-bit number (getting a 64-bit number in a register) and divide the result by yet another 32-bit number to get a 32-bit result and possibly a 32-bit remainder.
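The multiply-then-divide with a double-width intermediate can be sketched in C as below. The names muldiv_result and mul_div_mod are hypothetical, chosen for this example; the operation resembles what Forth's */MOD does, though C's division truncates toward zero, which may differ from a given Forth's rounding for negative operands.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: 32-bit quotient and 32-bit remainder. */
typedef struct {
    int32_t quot;
    int32_t rem;
} muldiv_result;

/* Compute (a * b) / c with a 64-bit intermediate product, so a * b
   cannot overflow. The caller is responsible for choosing operands
   whose final quotient fits back into 32 bits. */
muldiv_result mul_div_mod(int32_t a, int32_t b, int32_t c)
{
    assert(c != 0);
    int64_t product = (int64_t)a * (int64_t)b;  /* full 64-bit product */
    muldiv_result r;
    r.quot = (int32_t)(product / c);
    r.rem  = (int32_t)(product % c);
    return r;
}
```

For example, 100000 times 100000 overflows 32 bits, but mul_div_mod(100000, 100000, 7) still returns the right quotient and remainder because the product never gets truncated along the way.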
Off to bed (again).