I've been watching this topic with interest, kind of watching for where it might go. It does mostly sound like an HLL; but even there, universality is absent in many HLLs. Even early BASICs had pretty significant differences between them, and as for Forth, like they say, "If you've seen one Forth, well, you've seen one Forth." There are good reasons for that, in spite of the lack of portability. In fact, ANS Forth in '94 was an effort to come up with a newer and more-standard standard than the mishmash of earlier standards, and some parts of ANS present extra overhead in order to make it more portable across a wide range of processors, which is why I have not adopted it (or any later ones).
You've mentioned two accumulators a few times now. Do you have a specific use envisioned for them? I'm not sure I've ever wished for two accumulators, but I wouldn't mind having another register that duplicates the functions of X. The '816 allows 16-bit index registers (and allows a 16-bit accumulator too, independently), so at least indexing past 255 is possible there, and it adds the stack-relative addressing modes. Its op-code table is full, though; so adding more registers and the op codes to use them would require a wider instruction word, or more operand bytes, or two-byte op codes and more complex (and probably also slower, as BigEd said) internal instruction decoding.
I'm not familiar with a wide inventory of processors, but it is my understanding that a major push for lots of registers was partly to make it easier to write compilers. However, just having more registers, even wider ones, does not guarantee performance, if I may point to the example of the RCA 1802 which had 16 16-bit registers and yet performed very poorly compared to the 6502, or the 32016 about which Sophie Wilson, chief architect of the ARM processor, said, "an 8MHz 32016 was completely trounced in performance terms by a 4MHz 6502." (The 32016 was National's 32-bit processor, having 15 registers, including 8 general-purpose 32-bit registers, and a 16-bit external data bus.) The 65816 even outperformed the 68000 and 8086 in the Sieve of Eratosthenes benchmark.
Even 40 years ago, a Z80 had to run at 3 or 4MHz to keep up with a 1MHz 6502; and Jack Crenshaw, an embedded-systems engineer who wrote regularly in Embedded Systems Programming magazine, said in the 9/98 issue that he still couldn't figure out why, in BASIC benchmark after benchmark, the 6502 could outperform the Z80, which had more and bigger registers, a seemingly more powerful instruction set, and ran at higher clock rates. (The 6502's zero page and improved indexed and indirect addressing modes no doubt helped.)
Here on the forum, sark02 said, "My next computer was an Atari 800XL. [...] Coming from the Z80 I was initially dumbfounded by the criminal lack of registers, but BY GOD was it FAST! I spent hundreds of hours and wrote 1000s of lines of 6502 code over the following couple of years, diving deeper into the 800XL hardware and capabilities, and really came to enjoy the 6502." (See also this post.)
So my point again is that having more registers, even wider ones, is not necessarily helpful by itself. Other factors have to enter the picture.
As for extending the processor with external logic, there's Jeff Laughton's KimKlone 65c02 with pointer-arithmetic-friendly extended address space and 9-cycle ITC Forth NEXT. It gives 6 new registers and 44 new instructions.
Memory-speed bottlenecks would be another reason for the push for lots of onboard registers. I, too, have contemplated having ZP and the page-1 stack onboard so as not to have to go off-chip for these. In fact, if they had their own buses internally, there could be more instruction overlap, where for example a store or a push could happen while the next instruction is being fetched. OTOH, certain access techniques might be forfeited unless the op-code set were expanded, again requiring two-byte op codes.
Quote: You can do things with A that you can't with X and Y.
At least, directly. Self-modifying code can get you some of those things, like
- add X to A
- LDA (A), LDA (X), or LDA (Y)
- LDA (abs),X, LDA (abs),Y, LDX (abs),Y, and LDY (abs),X
- JMP ((abs)) (ie, doubly indirect)
- Save and restore a register one cycle faster than pushing and pulling, without using the stack, or even save and restore the stack pointer without using the stack or any variables
- LDA table,X,Y equivalent, ie, double-indexed; or even LDA (X+Y) !
and plenty more. See my article at http://wilsonminesco.com/SelfModCode/ .
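To make the first item in that list concrete, here is a minimal sketch of the "add X to A" trick, written as a toy Python model instead of real hardware. It is only an illustration under stated assumptions: the function names, the $0200 load address, and the use of BRK as a stop marker are all made up for this example and are not taken from the article. The program bytes are the real 6502 op codes for STX abs, CLC, and ADC #imm; the STX writes X into the operand byte of the ADC that follows, so the code has to live in RAM.

```python
def run(mem, pc, a, x):
    """Run a tiny 6502 subset (STX abs, CLC, ADC #imm, BRK) and return A.

    Toy model only: binary mode, no flags other than carry, and BRK is
    treated as "stop" rather than as a software interrupt.
    """
    carry = 0
    while True:
        op = mem[pc]
        if op == 0x8E:                      # STX abs (little-endian address)
            addr = mem[pc + 1] | (mem[pc + 2] << 8)
            mem[addr] = x                   # this is the self-modifying store
            pc += 3
        elif op == 0x18:                    # CLC
            carry = 0
            pc += 1
        elif op == 0x69:                    # ADC #imm
            total = a + mem[pc + 1] + carry
            a, carry = total & 0xFF, total >> 8
            pc += 2
        elif op == 0x00:                    # BRK: stop the toy model
            return a
        else:
            raise ValueError(f"op code ${op:02X} not modeled")


def add_x_to_a(a, x):
    """Hypothetical helper: load the three-instruction routine and run it."""
    mem = bytearray(0x10000)                # 64K of RAM, all zeros
    org = 0x0200                            # assumed load address
    program = bytes([
        0x8E, 0x05, 0x02,                   # $0200  STX $0205  ; patch operand below
        0x18,                               # $0203  CLC
        0x69, 0x00,                         # $0204  ADC #$00   ; $0205 gets X's value
        0x00,                               # $0206  BRK        ; stop marker
    ])
    mem[org:org + len(program)] = program
    return run(mem, org, a, x)
```

Calling `add_x_to_a(0x10, 0x22)` runs the patched ADC with an operand of $22. On a real 6502 the same three instructions would cost 4 + 2 + 2 cycles, and the routine could not run from ROM.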
I'm still kind of straining to understand the HAL's definition and application though.
_________________ http://WilsonMinesCo.com/ lots of 6502 resources The "second front page" is http://wilsonminesco.com/links.html . What's an additional VIA among friends, anyhow?