IamRob wrote:
This is only half the picture though. Almost every word also uses the data stack. The Y-reg being the free register here would be the logical choice to use with the data stack ...
But if the stack is not being used for the return stack or for the IP, it can be used for the data stack.
Code:
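; Assumes 65816 native mode with 16-bit A, X and Y; X holds the IP.
AND:    PLA             ; pop TOS into A
        AND 1,S         ; AND with the new TOS, read in place
        STA 1,S         ; overwrite the new TOS with the result
        INX             ; inline NEXT: advance IP by one cell
        INX
        JMP ($0000,X)   ; jump through the cell the IP now points at
SWAP:   PLA             ; A = TOS
        PLY             ; Y = second item
        PHA             ; old TOS goes back first
        PHY             ; old second item ends up on top
        INX
        INX
        JMP ($0000,X)
PLUS:   PLA             ; pop TOS into A
        CLC
        ADC 1,S         ; add the new TOS, read in place
        STA 1,S         ; overwrite the new TOS with the sum
        INX
        INX
        JMP ($0000,X)
FETCH:  PLY             ; Y = address from TOS
        LDA $0000,Y     ; read the cell at that address
        PHA             ; push the fetched value
        INX
        INX
        JMP ($0000,X)
STORE:  PLY             ; Y = address from TOS
        PLA             ; A = value to store
        STA $0000,Y     ; write the value to that address
        INX
        INX
        JMP ($0000,X)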
In the 65C02 version, the "JMP ($0000,X)" has to be self-modifying code in the zero page, so saving the three clocks of "JMP NEXT" is not an option. There, the IP IS the spot in the zero page occupied by the $0000. So in the 65C02 version, using the hardware stack as the data stack, while allowing most operations to avoid "STX TX : TSX : ... : LDX TX", benefits from a top-of-stack location in the zero page.
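One way to picture that 65C02 arrangement, as a guess at the layout rather than anyone's actual code: NEXT sits at one fixed spot in the zero page, and the operand of its JMP is the IP itself. The names NEXT, GO and IP, the $FFFF placeholder, and the assumption that cells are two-byte aligned (so only the second INC can wrap) are all mine.
Code:
; Hypothetical sketch, assumed to be assembled or copied into the zero page.
NEXT:   INC IP          ; low byte of IP += 2; with an even IP the
        INC IP          ;   first INC can never wrap, only the second
        BNE GO          ; low byte did not wrap to zero
        INC IP+1        ; carry into the high byte
GO:     JMP ($FFFF)     ; the two operand bytes ARE the IP (placeholder value)
IP      = GO+1
Since there is only one copy of that JMP, every primitive has to finish with "JMP NEXT" to reach it.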
With the stack relative addressing in the 65816, the hardware stack makes a fine data stack on its own.
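For instance, a few more stack words need nothing beyond stack relative loads and pushes. This is a sketch in the same style as the code above, under the same assumptions (16-bit registers, X as the IP); the labels DUP, OVER and DROP are mine, not from the original listing.
Code:
DUP:    LDA 1,S         ; copy TOS without popping it
        PHA             ; push the copy
        INX
        INX
        JMP ($0000,X)
OVER:   LDA 3,S         ; second item sits two bytes above TOS
        PHA             ; push a copy of it
        INX
        INX
        JMP ($0000,X)
DROP:   PLA             ; discard TOS (A is just scratch here)
        INX
        INX
        JMP ($0000,X)
None of these touches X, which is what leaves it free to be the IP.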
Of course, there is no such thing as a free lunch ... saving the three clocks of the "JMP NEXT" or "BRA NEXT" also means each primitive ends with a five-byte NEXT rather than a two-byte branch or three-byte jump to NEXT.
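To make the trade-off concrete, here are the two ways a one-instruction primitive could end. The labels DROP_INLINE, DROP_SHARED and NEXT are hypothetical, chosen only for this comparison.
Code:
; Inline NEXT: 5 bytes at the end of every primitive, no extra jump.
DROP_INLINE:
        PLA
        INX             ; 1 byte
        INX             ; 1 byte
        JMP ($0000,X)   ; 3 bytes
; Shared NEXT: 3 bytes per primitive (2 with BRA, when in range),
; at the cost of the three clocks for the jump itself.
DROP_SHARED:
        PLA
        JMP NEXT        ; 3 bytes
NEXT:   INX
        INX
        JMP ($0000,X)
With, say, 100 primitives, the inline form costs about 200 bytes more than "JMP NEXT" and about 300 more than "BRA NEXT".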