I wrote that last post a little too fast, didn't I? The registers should be restored in the order Y, X, A, not A, X, Y and a REP #$30 should precede the PLY, PLX, and PLA instructions. In the interests of getting things right, here are a couple of facts to keep in mind when saving registers on the 65816 when switching between 8 and 16 bits or when switching between emulation and native mode.
Fact #1: Setting the x FLAG to 1 (8-bit index registers) overwrites the high bytes of the X and Y registers.
Setting the x flag to 1 with, for example, a SEP #$10 or SEP #$30 instruction, will set the high bytes (bits 15-8) of the X and Y registers to $00. This means that a PHX and/or PHY must come BEFORE the SEP instruction! For example, after the following:
Code:
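CLC
XCE ; switch to native mode (carry now holds the old e flag)
PHP ; save the flags, including the old e flag in carry
REP #$10 ; 16-bit index registers
LDX #$1234
LDY #$5678
SEP #$10 ; 8-bit index registers (high bytes of X and Y set to $00)
REP #$10 ; 16-bit index registers (high bytes stay $00)
STX XREG
STY YREG
PLP ; restore the flags
XCE ; restore the original e flag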
XREG contains $34
XREG+1 contains $00
YREG contains $78
YREG+1 contains $00
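By way of contrast, here is a minimal sketch of the safe ordering. It assumes the code is already running in native mode, and the values are only for illustration. X and Y are pushed while the index registers are still 16 bits wide, and pulled only after switching back to 16 bits:
Code:
REP #$10 ; 16-bit index registers
LDX #$1234
LDY #$5678
PHX ; pushes both bytes of X ($1234)
PHY ; pushes both bytes of Y ($5678)
SEP #$10 ; 8-bit index registers: X = $0034, Y = $0078
; ... code that needs 8-bit index registers goes here ...
REP #$10 ; 16-bit index registers again (high bytes are still $00)
PLY ; Y = $5678
PLX ; X = $1234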
Fact #2: Setting the e FLAG to 1 (emulation mode) sets the m and x flags to 1 and overwrites the high byte of the stack pointer (S register).
Note that since the x flag is set to 1, the high bytes of the X and Y registers will be overwritten! The high bytes of the X and Y registers get set to $00, and the high byte of the stack pointer gets set to $01. Because the high byte of the stack pointer will be overwritten, caution should be used when calling emulation mode routines from native mode! The code above can be modified to illustrate that the high bytes of the S, X and Y registers get overwritten. After:
Code:
CLC
XCE
PHP
REP #$30 ; 16-bit mode
TSC ; save stack pointer in accumulator
LDX #$ABCD
TXS
LDX #$1234
LDY #$5678
SEC
XCE ; switch to emulation mode (overwrites the high bytes of S, X and Y)
CLC
XCE ; switch back to native mode
REP #$30 ; 16-bit mode
STX XREG
STY YREG
TSX
STX STKPTR
TCS ; restore stack pointer
PLP
XCE
STKPTR contains $CD
STKPTR+1 contains $01
XREG contains $34
XREG+1 contains $00
YREG contains $78
YREG+1 contains $00
The following code illustrates that the m and x flags get set to 1.
Code:
CLC
XCE
PHP
REP #$30 ; 16-bit mode
SEC
XCE ; switch to emulation mode (sets m and x to 1)
CLC
XCE ; switch back to native mode (m and x remain 1)
PHP ; push m and x flag values onto stack
SEP #$20 ; 8-bit accumulator
PLA ; pull m and x flag values from stack
AND #$30 ; mask m and x flags
STA MXFLGS
PLP
XCE
After the code above, MXFLGS will contain $30.
To save and restore the A, X, and Y registers while preserving the m, x and e flags, start with:
Code:
PHX
PHY
CLC ; * see below
XCE ; * see below
PHP
REP #$30 ; ** see below
PHA
SEP #$30
and exit with:
Code:
REP #$30 ; ** see below
PLA
PLP
XCE ; * see below
PLY
PLX
To return with the carry flag clear or set, exit with:
Code:
REP #$30 ; ** see below
PLA
BCS LABEL
PLP
XCE ; * see below
PLY
PLX
CLC ; not necessary if the preceding XCE is used
RTS ; or RTL
LABEL PLP
XCE ; * see below
PLY
PLX
SEC
RTS ; or RTL
By saving X and Y first (and restoring them last) you can save a couple of bytes of stack space if the routine is called when the x flag is already 1 (8-bit index registers). Of course, you won't need to save A, X, or Y if your routine doesn't overwrite them, but it might be a good idea to leave those instructions in place until your routine is fully debugged.
* These instructions can be eliminated if the routine will always be called from native mode (the most common case). This will save 3 bytes and 6 cycles (CLC and XCE each take 1 byte and 2 cycles). However, you may wish to leave these instructions in place while debugging or if you are using both emulation mode routines and native mode routines.
** These instructions can be eliminated if the high byte of the accumulator isn't overwritten (it would be overwritten if you used, for example, an XBA instruction). This will save 4 bytes and 6 cycles. If you eliminate them, put the PHA after the SEP #$30 so that the PHA and the PLA both use an 8-bit accumulator; otherwise a 16-bit PHA could end up paired with an 8-bit PLA. Note that TDC and TSC ALWAYS overwrite all 16 bits of the accumulator (both in native mode when the m and x flags are 1 and in emulation mode)!
In fact, if you don't overwrite the high byte of the accumulator, you can move the PHA before the PHP and the PLA after the PLP. This would allow you to use ROL and LSR to return with the carry flag clear or set. In this case, you would enter with:
Code:
PHA
PHX
PHY
CLC ; * see above
XCE ; * see above
PHP
SEP #$30
and exit with:
Code:
ROL ; save carry flag
PLP
XCE ; * see above
LSR ; restore carry flag
PLY
PLX
PLA
Note that using ROR and ASL won't necessarily work, since the accumulator may be a different width before the PLP than after. (For example, if the m flag is 1 before the PLP, the accumulator will be 8 bits wide, and if the m flag is 0 after the PLP the accumulator will be 16 bits wide.)
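To make that concrete, here is a sketch of the failing case. It assumes the routine was called from native mode with the m flag 0 (16-bit accumulator) and is trying to return with the carry flag set:
Code:
ROR ; m is 1 here, so the carry flag goes into bit 7 of the 8-bit accumulator
PLP ; restores m to 0, so the accumulator is 16 bits wide again
XCE ; * see above
ASL ; shifts bit 15 (not bit 7) into the carry flag
PLY
PLX
PLA
The saved carry just moves from bit 7 to bit 8, and the carry flag ends up with whatever was in bit 15. With ROL and LSR the carry flag is saved in bit 0, which is in the same place at either width.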