Okay, let me try to be a bit more systematic about the whole thing. The way I see it, the Data Stack can have four states, two of them legal, two of them errors:
- Overflow: error
- One or more entries: legal
- Empty: legal
- Underflow: error
(I think part of my problem so far was to consider "underflow" and "empty" to be a single case.)
A word can rightfully expect to start off with the Data Stack in a legal state, that is, either empty or with one or more entries. In fact, it doesn't need to check for over- or underflow at all, because those are fatal errors and we crash to ABORT anyway. As stated above, we can move the check for these outright error conditions to the interpreting loop after the word is completed, which saves space.
It's the legal (!) empty status that can be the real problem. Some words like 1+ only work if there is a non-empty stack; SWAP needs at least two entries, ROT even three. We can either accept those errors silently (which is fastest) or check in the word itself (which takes longer). After some thought, I think Liara Forth should be sophisticated enough to catch them, so I'm modifying the "speed first" goal by an old journalism rule:
Be first, but first be right!
So how do we do that? Using X as the Data Stack Pointer (DSP) pretty much seems the way to go because of the addressing modes on the Direct Page. With the help of wrapping, we can use the MSB of a byte to signal over- or underflow and have 128 bytes of space, which gives us a stack depth of 64 16-bit cells. With an 8-bit X, we simply check the n flag after the word returns. We can even do this with a 16-bit X register with something like:
txa
and.# 0080 ; (AND #$0080) mask all but over/underflow bit
bne stack_out_of_bounds
(Note that by playing with the operand of the AND instruction, we can even adjust the legal depth of the Data Stack.) AND is three cycles, which is an acceptable price for being able to keep X and Y 16 bits wide.
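To convince myself of the masking idea, here is a minimal Python sketch of what TXA / AND / BNE decides. The function names `out_of_bounds` and `legal_bytes` and the alternative mask $00C0 are made up for illustration; only the $0080 mask comes from the code above.

```python
def out_of_bounds(x, mask=0x0080):
    """Mirror TXA / AND #mask / BNE: true if any masked bit of X is set."""
    return (x & 0xFFFF & mask) != 0


def legal_bytes(mask):
    """Count the contiguous legal X values, starting at $0000, for a mask."""
    n = 0
    while n <= 0xFFFF and not out_of_bounds(n, mask):
        n += 1
    return n


print(legal_bytes(0x0080))    # 128 bytes, i.e. 64 16-bit cells
print(legal_bytes(0x00C0))    # 64 bytes, i.e. 32 16-bit cells (hypothetical mask)
print(out_of_bounds(0xFFFE))  # True: DEX DEX from $0000 wraps to $FFFE
```

So widening the mask by one bit halves the legal area, which is the "playing with the operand" effect.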
How do we signal an empty stack? The first instinct would be to use $0000 in X. However, we want something that triggers underflow for (say) DROP automatically, and INX INX just gives us $0002. So we should go with $007E as the "empty" value: INX INX produces $0080. For a word that requires one entry on the stack, we now include in its code the check
cpx.# 007e ; (CPX #$007E)
beq stack_empty
which costs us five cycles if the branch is not taken ("... but first be right").
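As a stand-in for the pencil-and-paper walkthrough, a rough Python model of the X values involved. It assumes the stack grows downward (DEX DEX to push, INX INX to drop); the helper names `inx_inx`, `dex_dex` and `classify` are invented here, only $007E and $0080 are from the text.

```python
EMPTY = 0x007E       # X value that signals an empty stack
BOUNDS_BIT = 0x0080  # bit tested by AND #$0080 in the interpreter loop


def inx_inx(x):
    """Drop one cell: INX INX on a 16-bit X register."""
    return (x + 2) & 0xFFFF


def dex_dex(x):
    """Push one cell: DEX DEX on a 16-bit X register."""
    return (x - 2) & 0xFFFF


def classify(x):
    """Name the Data Stack state that a given X value represents."""
    if x & BOUNDS_BIT:
        return "out of bounds"
    if x == EMPTY:
        return "empty"
    return "one or more entries"


print(classify(EMPTY))                     # empty
print(hex(inx_inx(EMPTY)))                 # 0x80
print(classify(inx_inx(EMPTY)))            # out of bounds: DROP on empty is caught
print(classify(inx_inx(0x0000)))           # one or more entries: why $0000 fails as "empty"
print(classify(dex_dex(EMPTY)))            # one or more entries
```

The fourth line is the point: with $0000 as the empty value, a DROP on an empty stack would land on $0002 and look perfectly legal.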
With $007E in X signaling an empty stack, we shouldn't have to care if TOS is a register such as A or Y or part of the DS (I think). With Y as TOS, for 1+ we get
cpx.# 007e
beq error_stack_empty
iny
rts
and with TOS on DP we get
cpx.# 007e
beq error_stack_empty
inc.dx 00
rts
though we really don't want to do that because INC $00,X is a brutal eight (!) cycles, while INY is only two. So TOS would probably be Y, not A, because we need A for the ANDing etc.
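Adding up the two versions of 1+ as a quick sanity check. The five cycles for the check, eight for INC $00,X and two for INY are the figures used above; the 3 + 2 split between CPX and BEQ and the six cycles for RTS are my assumptions from the standard 65816 timings with 16-bit registers and the branch not taken.

```python
# Cycle counts per instruction (see the assumptions in the paragraph above).
CYCLES = {
    "cpx.#": 3,   # assumed: 16-bit immediate compare
    "beq": 2,     # assumed: branch not taken
    "iny": 2,
    "inc.dx": 8,  # INC $00,X with 16-bit memory
    "rts": 6,     # assumed
}


def total(ops):
    """Sum the cycles for a straight-line sequence of instructions."""
    return sum(CYCLES[op] for op in ops)


y_tos = total(["cpx.#", "beq", "iny", "rts"])
dp_tos = total(["cpx.#", "beq", "inc.dx", "rts"])

print(y_tos)           # 13
print(dp_tos)          # 19
print(dp_tos - y_tos)  # 6
```

Under those assumptions, keeping TOS on the Direct Page costs six extra cycles for every 1+.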
At least that's the idea at the moment. I'm going to have to do some walkthroughs with pencil and paper first before I trust this, though.