GARTHWILSON wrote:
Wouldn't the Athlon have to do basically more MIPS to get the job done in the same amount of time, just because it's so complex you almost can't code it efficiently in assembly, and have to resort to a higher-level compiled language, meaning the gains are not as great as they initially appear? (Or maybe you were already taking that into account?)
I see two possible meanings of "efficiency" here.
MEANING 1: Single-cycle instruction execution. Answer: no. It already achieves this, and in fact, Intel has been executing most of its simple instructions in a single cycle as far back as the 80486.
MEANING 2: An intelligently designed, complex instruction set that actually gives evidence for the benefits of CISC over RISC, instead of vice versa. Answer: Yes, in some cases.
The 6502/65816, for example, will automatically set flags based on virtually anything coming in off the data bus or ALU output. It is rare for the CPU to inhibit all flag updates. Thus, you don't often see code like:
Code:
LDA someVariable
ORA #0 ; set flags
BEQ fooBar
However, you do see this with Intel's instruction set. Intel's instruction set, you might say, is more modular, and as with anything modular, you need more glue to put the pieces together. So,
Code:
MOV EAX,someVariable
OR EAX,EAX
JZ fooBar
becomes a frequent coding experience.
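To make the contrast concrete, here is a minimal sketch in C of the two flag behaviours described above. It is a toy model, not an emulator: the `Cpu` struct, its 8-bit accumulator, and the function names `lda_6502`, `mov_x86`, and `or_self_x86` are all made up for illustration, and only the zero flag is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model: one accumulator and a zero flag, nothing else. */
typedef struct {
    uint8_t a;  /* accumulator */
    bool    z;  /* zero flag */
} Cpu;

/* 6502-style load: the zero flag follows the loaded value automatically. */
void lda_6502(Cpu *cpu, uint8_t value) {
    cpu->a = value;
    cpu->z = (value == 0);
}

/* x86-style MOV: the value is loaded, the flags keep their old state. */
void mov_x86(Cpu *cpu, uint8_t value) {
    cpu->a = value;
}

/* x86-style OR reg,reg: the value is unchanged, but the zero flag is refreshed. */
void or_self_x86(Cpu *cpu) {
    cpu->a |= cpu->a;
    cpu->z = (cpu->a == 0);
}
```

After `mov_x86(&cpu, 0)` the zero flag in this model is still stale, which is why the extra `or_self_x86(&cpu)` step is needed before a branch can test it; `lda_6502(&cpu, 0)` leaves the flag ready to test straight away.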
However, there's a flip-side to this. I'm sure I'm not the only one who has found writing flags-transparent code annoying:
Code:
PHP
LDA foo
ADC bar
STA baz
PLP
The x86 architecture has fewer cases where this becomes a problem.
The degree to which the Intel-architecture processor needs to execute more MIPS than a comparable 65816 depends on how frequently you depend on things like automatic flag settings.
On the other hand, Intel architecture supports:
* multi-bit rotates and shifts, with and without carry, in a single cycle.
* explicit I/O and memory address spaces, which saves on bank-switching costs on constrained systems. You rarely see code fragments in Intel programs corresponding to 6510 code like LDA $01:PHA:LDA #xx:STA $01:....:PLA:STA $01
* real indirect addressing modes for JMP and CALL targets, permitting no more than two-cycle (assuming cache hit) vectored execution, which OO- and functional-programming code relies on extensively.
* Any CPU register can be used as a base address and/or index (saving the need for TAX/TAY instructions), complete with power-of-two scaling (saving the need for TXA,ASL,TAX sequences).
* Pushing registers to, and popping them from, the stack takes a single cycle, not 3 to 4.
* A more useful set of conditional branches, testing for <, <=, =, /=, >=, and > in both signed and unsigned variants.
* Floating point instructions.
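As a rough illustration of the vectored-execution and scaled-index points in the list above, here is a small C sketch of a jump-table dispatch. The handler names, `vector_table`, and `dispatch` are hypothetical, invented for this example. On x86, a compiler can typically turn the table lookup plus call into a single indirect CALL with a scaled-index memory operand, though the exact code generated depends on the compiler and its options, so treat that as an expectation rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical handler type: each vector entry takes and returns an int. */
typedef int (*Handler)(int);

int handler_inc(int x) { return x + 1; }
int handler_dbl(int x) { return x * 2; }
int handler_neg(int x) { return -x; }

/* The vector table: an array of code addresses, indexed by a small integer. */
static const Handler vector_table[] = { handler_inc, handler_dbl, handler_neg };
enum { VECTOR_COUNT = sizeof vector_table / sizeof vector_table[0] };

/* Look up the handler at `index` and call it through the table.
   Returns 0 and stores the handler's result on success,
   or -1 if the index is outside the table. */
int dispatch(size_t index, int arg, int *result) {
    if (index >= VECTOR_COUNT) {
        return -1;
    }
    *result = vector_table[index](arg);
    return 0;
}
```

The index goes straight into the address calculation here, so no separate shift or register-transfer step is needed to scale it, which is the kind of saving the TXA,ASL,TAX remark above is getting at.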
So, yes, it's true that Intel-architecture CPUs need more instructions to do what the 65816 can do in one well-chosen instruction. However, exploiting other unique features of the x86 architecture can more than make up for that deficiency. It all depends on the kind of program you're writing.