Hi!
When writing my FastBasic VM, I discovered that the generated code is much faster and smaller when the top of the stack (TOS) is kept in registers. Currently the A and X registers hold the 16-bit TOS, with A as the low byte and X as the high byte. This allows generating code like:
Code:
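LDA #12        ; A = 12
JSR PUSH_BYTE  ; push 12 onto the stack as a 16-bit value
LDA #2         ; new TOS, low byte
LDX #1         ; new TOS, high byte: A/X = 258 ($102)
JSR ADD        ; add the pushed 12 to 258; the result, 270 ($10E), is in A/X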
In FastBasic the priority is code size over speed, so it uses an indirectly threaded interpreter, and in reality the above exists only as bytecode: "LOAD_BYTE", #12, "PUSH", "LOAD_WORD", #2, #1, "ADD". That is only 7 bytes for the code. I did toy with a compiler that produced the assembly above, though.
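To make the bytecode sequence concrete, here is a rough Python model of a stack VM that caches the TOS in a variable. It is only a sketch: the opcode numbers and the run() helper are made up for illustration, and it uses a plain dispatch loop rather than the threaded dispatch FastBasic uses.

```python
# Toy model of a stack VM that caches the top of stack (TOS) in a "register".
# Opcode numbers and run() are hypothetical, chosen only for this example.
LOAD_BYTE, LOAD_WORD, PUSH, ADD = range(4)

def run(code):
    tos = 0        # stands in for the 16-bit A/X register pair
    stack = []     # the part of the stack that lives in memory
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == LOAD_BYTE:      # one operand byte, high byte is zero
            tos = code[pc]
            pc += 1
        elif op == LOAD_WORD:    # two operand bytes, low byte first
            tos = code[pc] | (code[pc + 1] << 8)
            pc += 2
        elif op == PUSH:         # spill the register to memory
            stack.append(tos)
        elif op == ADD:          # pop one operand and add it into TOS
            tos = (tos + stack.pop()) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {op}")
    return tos, stack

program = bytes([LOAD_BYTE, 12, PUSH, LOAD_WORD, 2, 1, ADD])
result, leftover = run(program)
print(len(program), result, leftover)   # 7 270 []
```

Note that ADD touches memory only once (the pop), because the other operand and the result both stay in the cached TOS.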
To support both 8-bit and 16-bit values, PUSH_BYTE is just "LDX #0" followed by the PUSH_WORD code. Likewise, ADD_BYTE is "LDX #0" followed by the ADD code, and so on.
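The shared entry points can be modelled in Python as well. Again this is just a sketch under my own assumptions: the Regs class and its method names are hypothetical, and a method call stands in for what on the 6502 would simply be falling through into the word routine.

```python
# Toy model of byte entry points that clear the high register and then
# continue into the word routine. Class and method names are hypothetical;
# only the "LDX #0, then the word code" idea comes from the post.
class Regs:
    def __init__(self):
        self.a = 0       # low byte of TOS
        self.x = 0       # high byte of TOS
        self.stack = []  # 16-bit values spilled to memory

    def push_word(self):
        self.stack.append(self.a | (self.x << 8))

    def push_byte(self):
        self.x = 0           # LDX #0
        self.push_word()     # ...then the PUSH_WORD code

    def add(self):
        total = (self.a | (self.x << 8)) + self.stack.pop()
        self.a = total & 0xFF
        self.x = (total >> 8) & 0xFF

    def add_byte(self):
        self.x = 0           # LDX #0
        self.add()           # ...then the ADD code

r = Regs()
r.a, r.x = 12, 0x99          # X holds stale data from earlier code
r.push_byte()                # pushes 12, not $990C
r.a, r.x = 2, 1              # TOS = 258 ($102)
r.add()
print(r.a | (r.x << 8))      # 270
```

The byte variants cost only the two extra bytes of "LDX #0" each, since the rest of the routine is shared with the word version.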
Have Fun!