GARTHWILSON wrote:
As for register widths, my '816 Forth leaves the accumulator in 16-bit mode and the index registers in 8-bit mode almost full time. There are very few places in the code where it gets changed; and after a few instructions, it gets put right back. Those REP and SEP instructions are terribly cryptic and those two by themselves probably scare a lot of people away.
REP and SEP may be cryptic (at least until you get comfortable with them), but they open the door to some interesting programming, as they may be used to clear (REP) or set (SEP) any desired combination of status register bits. There's no such instruction as SEV (set overflow bit), but you can write a macro that does the equivalent:
Code:
sev .macro ;set overflow bit in SR
sep #%01000000
.endm
Lots of flexibility with REP and SEP.
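As a rough sketch of the "any desired combination" point, here are two more macros in the same style. The names clcd and sevc are made up for illustration and are not from any standard macro set; the bit positions assume the usual status register layout (n v m x d i z c, bit 7 down to bit 0).

```asm
clcd     .macro                ;clear carry & decimal mode in one instruction
         rep #%00001001        ;bit 3 = d, bit 0 = c
         .endm

sevc     .macro                ;set overflow & carry in one instruction
         sep #%01000001        ;bit 6 = v, bit 0 = c
         .endm
```

Each macro assembles to a single two-byte instruction, so clcd takes the same space as CLC followed by CLD.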
Quote:
For 8- versus 16-bit A and index registers, it only makes sense to make macros that are far more clear than REP and SEP. I called mine ACCUM_8, ACCUM_16, INDEX_8, and INDEX_16, which are far more clear and assemble the appropriate REP or SEP instruction. I think BDD called his LONG_A, etc. I use the Cross-32 (C32) assembler, and instead of having to tell it LONGI ON etc., it just gives 16-bit unless you preface the operand with "<", like LDX #<$20 which lays down A2 20, not A2 20 00.
LONGA = REP #%00100000 = 16 bit accumulator
LONGX = REP #%00010000 = 16 bit .X and .Y
LONGR = REP #%00110000 = 16 bit everything
SHORTA = SEP #%00100000 = 8 bit accumulator
SHORTX = SEP #%00010000 = 8 bit .X and .Y
SHORTR = SEP #%00110000 = 8 bit everything
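For anyone who wants to try these names, here is a minimal sketch of how they might be defined, assuming the same .macro/.endm syntax as the sev example above (adjust for your assembler). The thing to remember is that REP clears bits and SEP sets them, and a cleared m or x bit selects 16 bits. How the assembler is told the operand size for immediate-mode instructions varies from one assembler to the next and isn't shown here.

```asm
longa    .macro                ;16 bit accumulator
         rep #%00100000        ;clear m
         .endm

longx    .macro                ;16 bit .X and .Y
         rep #%00010000        ;clear x
         .endm

longr    .macro                ;16 bit everything
         rep #%00110000        ;clear m & x
         .endm

shorta   .macro                ;8 bit accumulator
         sep #%00100000        ;set m
         .endm

shortx   .macro                ;8 bit .X and .Y
         sep #%00010000        ;set x
         .endm

shortr   .macro                ;8 bit everything
         sep #%00110000        ;set m & x
         .endm
```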
Something to keep in mind is that 16 bit loads and stores occur "serially," meaning that memory access is a byte at a time without regard to the settings of the m and x bits in the status register. The '816 uses an extra clock cycle to read/write the most significant byte, something that has to be kept in mind in the 8 vs. 16 bit debate. Also, indexing in which .X and .Y are set to 16 bits incurs a one clock cycle penalty. Combine that with a 16 bit load/store operation and LDA SOMEWHERE,X becomes two cycles more expensive than it would be if all registers were set to eight bits.
You have to evaluate execution speed against other program factors in deciding whether to process 8 or 16 bits at a time. In most cases, the extra clock cycles involved with 16 bit loads and stores and 16 bit indexing are more than offset by the smaller and generally faster code that results.
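To put numbers on that, here is a sketch of the LDA SOMEWHERE,X case in native mode. It assumes somewhere is an absolute (not direct page) address and that the 8 bit indexed access doesn't cross a page boundary; the counts are the documented ones for absolute indexed loads, and the label is a placeholder.

```asm
         sep #%00110000        ;8 bit everything
         lda somewhere,x       ;4 cycles: fetches one byte

         rep #%00100000        ;16 bit accumulator, 8 bit index
         lda somewhere,x       ;5 cycles: +1 to fetch the MSB

         rep #%00010000        ;16 bit accumulator, 16 bit index
         lda somewhere,x       ;6 cycles: +1 more for 16 bit indexing
```

Even the six cycle version fetches two bytes with one instruction, whereas getting the same two bytes with 8 bit loads would take two LDAs at four cycles each, before counting whatever it takes to combine them.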