pdragon wrote:
Another possibility would be to hijack BRK [...]. But that seems extreme. other suggestions welcome!
If you're soliciting extreme suggestions then I am game for the challenge!
What I'll describe is a novel encoding for CLIT (and it's also adaptable for LIT). It's very small and very fast, but there's an unusual requirement which, depending on circumstances, may or may not be deemed acceptable.
Usage is like this...
Code:
        jsr cliteral
        .byte $3
        ...
... and cliteral looks like this.
Code:
cliteral:
        pla
        sta n           ;n and n+1 are a pair of scratch bytes in z-page
        pla
        sta n+1         ;the z-pg pair at n points to the third byte of the JSR instruction
        dex             ;make space for a 16-bit value
        dex
        stz 1,x         ;msb of the value is zero
        ldy #1
        lda (n),y       ;lsb of the value is found 1 past the third byte of the JSR instruction
        sta 0,x
        jmp (n)         ;sic
So, yeah... There's no RTS!
Instead, execution jumps to the last byte of the JSR instruction, which of course is its ADH operand... except now it gets digested as an opcode.
And most of you already know this next trick. The opcodes of certain two-byte instructions, such as CMP immediate ($C9), can be used in place of a BRA when the goal is to skip forward by one byte only. In this case, and using $C9 as an example, the last byte of the JSR instruction is arranged to be $C9, and the CPU parses the $C9 and the one-byte inline parameter as a 2-byte CMP# instruction. To us, the CMP functions as a NOP. Its only purpose is to harmlessly guide the CPU to the code that follows the 2-byte CMP# instruction. This all works wonderfully smoothly, but we need cliteral to be mapped to an address whose high-byte is equal to a suitable opcode.
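To make that concrete, here's a rough sketch of how the bytes might land at a call site. The addresses are made up for illustration ($C900 for cliteral, $0300 for the caller), and I'm assuming an assembler that accepts .org and a plain = equate. The only thing that really matters is that the high byte of cliteral's address is $C9.

```asm
cliteral = $C900            ;hypothetical address; only the $C9 high byte matters

        .org $0300          ;hypothetical caller address
        jsr cliteral        ;assembles to 20 00 C9
        .byte $03           ;one-byte inline parameter
        nop                 ;stand-in for whatever code follows

;what the CPU sees once  jmp (n)  lands on $0302:
;  $0302: C9 03    cmp #$03    ;ADH plus the inline byte; only the flags change
;  $0304: EA       nop         ;normal execution resumes here
```

Any other address of the form $C9xx would do just as well, since only the third byte of the JSR gets executed.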
Luckily there are plenty of candidates! CMP#, CPX#, CPY# and BIT# are acceptable because Tali won't mind if the flags get altered. And AFAIK Tali can also tolerate LDY#; likewise LDA#, ADC#, SBC#, AND#, ORA# and EOR#.
And, for 65C02, the opcodes $02, $22, $42, $62, $82, $C2 and $E2 are also acceptable, because they are all 2-byte, 2-cycle NOPs. (The 65C02 has other intriguing potential in regard to NOPs; I wrote about that subject here.)
None of the 2-byte instructions mentioned so far will access memory (other than the 2 cycles accessing memory at PC). But at the cost of one or more extra cycles, a bunch of additional candidate 2-byte instructions become available, such as BIT z-pg, LDA z-pg,x and CMP (ind),y. Just be aware that the extra memory accesses can be dangerous if there's I/O in the system whose state may get altered by an unexpected read.
Finally, opcodes for certain three-byte instructions can be used in place of a BRA when the goal is to skip forward by two bytes. This means the inline value after the JSR could be increased to 16-bit (as required by LIT). And, as above, there's a read that occurs which adds delay and could touch I/O.
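For the 16-bit case, a sketch might look something like this. The name literal is mine, and I'm assuming the routine is mapped to an address whose high byte is a three-byte opcode such as $CD (CMP absolute), with the inline value stored low byte first. Treat it as untested illustration, not as Tali's actual code.

```asm
;usage:
;       jsr literal
;       .word $1234     ;16-bit inline parameter, lsb first

literal:                ;assumed to live at $CDxx, so ADH = $CD = CMP abs
        pla
        sta n           ;n and n+1 are the same z-page scratch pair as before
        pla
        sta n+1         ;n now points to the third byte of the JSR instruction
        dex             ;make space for a 16-bit value
        dex
        ldy #1
        lda (n),y       ;lsb is found 1 past the third byte of the JSR
        sta 0,x
        iny
        lda (n),y       ;msb is found 2 past the third byte of the JSR
        sta 1,x
        jmp (n)         ;CPU runs CD lsb msb as CMP abs, then carries on after it
```

With the usage shown, the CPU ends up executing CMP $1234, so it performs a read from $1234. That's the read mentioned above, and it's the thing to watch if literal values can coincide with I/O addresses.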
-- Jeff