I've been reviewing the following set of posts on VM support:
http://forum.6502.org/viewtopic.php?f=10&t=3046
It has been mentioned on the forum that it would be nice if the IP (interpretive pointer) register were an internal CPU register.
Even better if it auto-increments after fetching from memory.
I put some thought into supporting the IP register as an internal core register, and came to the conclusion that the way to do it was to use the existing program counter (at least in a multi-context core). The program counter both fetches from memory and auto-increments. Even better, it uses the instruction cache to cache interpreter code.
Hence I came up with the following.
The following requires multiple register contexts in the processor. It ping-pongs between two register set contexts. One is a special interpretive task that fetches the code; the other is an interpreter that interprets the fetched code.
A special core operating mode called interpretive mode is used to assist the implementation of interpreters. It can be turned on by setting bit 2 in the extended status register. An interpretive mode task uses the PC itself to fetch opcodes into the accumulator register rather than the instruction register.
The PC of the interpretive task takes on the function of the IP register in Forth parlance. In the IFETCH stage of an interpretive task, the core performs a task switch back to the invoking task instead of executing the fetched opcode. The accumulator is passed back to the interpreter, where it can be processed. The program counter of the interpretive task is passed back to the invoker in the .X register.
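To make the ping-pong concrete, here is a rough software model in Python of one round trip into the interpretive task. It is only a sketch: the class and method names are made up for illustration, and it assumes .X receives the PC value after the auto-increment, which the description above does not spell out.

```python
class InterpretiveTask:
    """Toy model of the interpretive task's register context (hypothetical names)."""

    def __init__(self, code, pc=0):
        self.code = code  # bytecode or threaded code, one entry per fetch
        self.pc = pc      # the task's PC, playing the role of Forth's IP

    def tsk(self):
        """Model one TSK round trip: fetch at PC, auto-increment,
        then hand (A, X) back to the invoking task."""
        a = self.code[self.pc]  # the opcode lands in the accumulator
        self.pc += 1            # the PC auto-increments like any instruction fetch
        x = self.pc             # assumption: .X gets the already-incremented PC
        return a, x


task10 = InterpretiveTask([0x11, 0x22, 0x33])
print(task10.tsk())
print(task10.tsk())
```

Each call stands in for a `TSK #10` in the interpreter: the invoker never keeps its own IP variable, it just asks the other context for the next opcode.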
Benefits of using a task are: the bytecode can reside in a different code or data segment than the interpreter itself. It is relatively fast, as internal registers are used rather than a variable in memory. Fetches are governed by the PC and hence come from the code segment, meaning the instruction cache is used to enhance performance.
The mechanism here allows the interpreter to be written in 32-bit code while at the same time allowing a bytecode implementation.
Task #10 is the special interpretive task in this case. The program for task #10 might be bytecode, for example.
; A potential Forth NEXT routine.
Code:
NEXT:
; switch to interpretive task to fetch word at interpretive PC
; switching to the interpretive task, then returning takes 6 clocks
TSK #10
NEXT2:
STA W
LDA {W}
; A second variable is used to allow a double indirect jump
; rather than using self-modifying code. As self-modifying code
; would require a cache line invalidate that takes 20 cycles
STA W2
JMP {W2}
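For readers less familiar with threaded code, the NEXT routine above can be modelled roughly as follows in Python. This is a sketch under assumptions: it reads `{W}` as "indirect through the variable W", treats memory as a simple address-to-value mapping, and uses made-up addresses; none of that is confirmed by the listing itself.

```python
def next_dispatch(memory, thread, ip):
    """Model of NEXT: returns (W, jump target, updated IP)."""
    w = thread[ip]      # TSK #10 ; STA W   -> word fetched at the interpretive PC
    ip += 1             # the interpretive PC auto-increments on the fetch
    target = memory[w]  # LDA {W} ; STA W2  -> read the address stored at W
    return w, target, ip  # JMP {W2} would then transfer control to target


# Hypothetical layout: two words whose code fields point at machine code.
memory = {0x200: 0x8000, 0x204: 0x8040}
thread = [0x200, 0x204]

w, target, ip = next_dispatch(memory, thread, 0)
print(hex(w), hex(target), ip)
```

The second variable W2 in the assembly corresponds to `target` here: it holds the final jump address so the routine never has to patch its own code.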
; How to perform a branch
Code:
TSK #10 ; get the branch displacement
AAX ; add .A and .X
ORA #$0A000000 ; select context register #10 (in high eight bits of acc).
JCI ; indirect jump to the context to set PC (JCI [acc])
BRA NEXT2 ; the JCI context switch will cause the next instruction fetch
; so we can skip over the TSK #10
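The arithmetic in the branch sequence can be checked with a small Python sketch. It assumes a 32-bit accumulator, a displacement already held as a 32-bit two's complement value, and a target address that fits in the low 24 bits so the context number can occupy the high eight bits; the function name is just for illustration.

```python
MASK32 = 0xFFFFFFFF
CONTEXT_SHIFT = 24  # the context number sits in the high eight bits of .A


def branch_target(displacement, x, context=10):
    """Value handed to JCI: the new PC tagged with the context number."""
    addr = (displacement + x) & MASK32        # AAX: add .A and .X, 32-bit wraparound
    return addr | (context << CONTEXT_SHIFT)  # ORA #$0A000000 when context is 10


# Forward branch of 5 from an interpretive PC of $100.
print(hex(branch_target(5, 0x100)))
# Backward branch of 2, with the displacement as a 32-bit two's complement value.
print(hex(branch_target(0xFFFFFFFE, 0x100)))
```

Note that the OR only tags the address cleanly when the high eight bits of the sum are zero, which is why the 24-bit assumption matters.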
; Bytecode interpreter, dispatch
Code:
NEXT:
TSK #10 ; fetch a byte
NEXT2:
ASL
TAX
JMP (JmpTable,X)
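The same table-driven dispatch looks like this as a Python sketch. The three opcodes (0 = halt, 1 = push the next byte, 2 = add) are invented purely for the example and are not part of any bytecode described above.

```python
def run(bytecode):
    """Tiny bytecode loop using a jump table indexed by opcode."""
    stack, pc = [], 0

    def fetch():
        # Stands in for TSK #10: read one byte and advance the interpretive PC.
        nonlocal pc
        byte = bytecode[pc]
        pc += 1
        return byte

    def op_lit():
        stack.append(fetch())  # the operand byte follows the opcode

    def op_add():
        b, a = stack.pop(), stack.pop()
        stack.append(a + b)

    jump_table = [None, op_lit, op_add]  # index 0 is treated as halt

    while True:
        handler = jump_table[fetch()]  # like JMP (JmpTable,X)
        if handler is None:
            return stack
        handler()


print(run([1, 2, 1, 3, 2, 0]))
```

In the assembly version the ASL scales the opcode into a table offset before the indexed indirect jump; the Python list index does that scaling implicitly.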