barrym95838 wrote:
That's Charlie's deal with his Pettil implementation. It involves some trade-offs. A favorable example: NIP becomes INX. There are pros and cons to any decision like that. For my 65m32 DTC Forth, I keep TOS in the accumulator, but that would be impossible for a 6502, and a dubious optimization for an '802/'816. Perhaps Charlie did some benchmarks that led to his design decision, and would be able to share them briefly with us, here or in a fresh thread?
Looking at this again, another advantage of a split stack with a separate TOS is that every Forth word referencing TOS saves a few clock cycles by using zp addressing instead of zp,X. For example, `+`:
Code:
plus                    ; split stack with separate tos
        clc             ;[2]
        lda tos         ;[3]
        adc stackl,x    ;[4]
        sta tos         ;[3]
        lda tos+1       ;[3]
        adc stackh,x    ;[4]
        sta tos+1       ;[3]
        inx             ;[2]
        jmp next        ;[3] =27 clocks
vs.
plus                    ; split stack only
        clc             ;[2]
        lda stackl,x    ;[4]
        adc stackl+1,x  ;[4]
        sta stackl+1,x  ;[4]
        lda stackh+1,x  ;[4]
        adc stackh,x    ;[4]
        sta stackh+1,x  ;[4]
        inx             ;[2]
        jmp next        ;[3] =31 clocks
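
To illustrate the trade-off mentioned in the quote, here is a rough sketch of NIP and DROP under both layouts. The labels (nip_tos, drop_tos, nip_split, drop_split) are made up for this comparison. The cycle counts assume stackl and stackh live in zero page, as the counts above imply, and that stackl,x / stackh,x hold the second item when TOS is kept separately. This is not taken from Pettil or any other actual implementation.

```
; split stack with separate tos
nip_tos                 ; discard second item; tos is untouched
        inx             ;[2]
        jmp next        ;[3] =5 clocks

drop_tos                ; second item must be copied into tos
        lda stackl,x    ;[4]
        sta tos         ;[3]
        lda stackh,x    ;[4]
        sta tos+1       ;[3]
        inx             ;[2]
        jmp next        ;[3] =19 clocks

; split stack only
nip_split               ; copy top item over second item, then pop
        lda stackl,x    ;[4]
        sta stackl+1,x  ;[4]
        lda stackh,x    ;[4]
        sta stackh+1,x  ;[4]
        inx             ;[2]
        jmp next        ;[3] =21 clocks

drop_split              ; top item lives on the stack, so just pop
        inx             ;[2]
        jmp next        ;[3] =5 clocks
```

Under those assumptions the separate TOS saves 4 clocks on `+` and 16 on NIP, but costs 14 on DROP, so the net effect would depend on how often each word runs in real code.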