barrym95838 wrote:
Just to provide a bit of context, you're coding for an '802/816, E|M|X=0, ITC, TOS in A, right?
TAY is an efficient way to test TOS, but how often do its two cycles get "wasted" by putting it in NEXT instead of inside every word that uses it? It's easy to see that you'll save a small amount of space and waste a small number of cycles, and I usually default toward saving space unless instructed otherwise, so I'm on board.
I am only half done and am already at the break-even point, where the cycles saved equal the cycles used by the TAY. In some primitive definitions, doing this has saved 10 cycles. How often are the comparisons I just listed used? Quite a bit. A single use of one of these comparisons in a word definition pays for the 2 extra cycles on every word of a 10-word definition.
That is, if a word definition has 10 words, the TAY costs 20 cycles. A single use of any comparison in that definition saves 20 cycles, so one word is enough to break even on the TAY. If any other words in the definition also save a few cycles, we are well ahead.
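To make that break-even arithmetic concrete, here is a rough sketch in Python. The function name is made up for illustration, and the 20-cycle saving is simply the figure quoted above, not a measurement; the only hard number is that TAY takes 2 cycles.

```python
TAY_CYCLES = 2  # TAY is a 1-byte, 2-cycle instruction


def net_cycles_saved(words_executed, cycles_saved_in_primitives):
    """Cycles saved inside primitives minus the TAY overhead paid in NEXT.

    words_executed: how many times NEXT runs (one TAY each time).
    cycles_saved_in_primitives: total cycles the primitives save because
    the flags already reflect TOS on entry (assumed figure, see above).
    """
    return cycles_saved_in_primitives - TAY_CYCLES * words_executed


# A 10-word definition in which one comparison saves 20 cycles: break-even.
print(net_cycles_saved(10, 20))  # 0

# If a second word in the same definition saves 4 more cycles, we come out ahead.
print(net_cycles_saved(10, 24))  # 4
```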
The second advantage of using the TAY is that I have already saved over a page of code that I would otherwise have needed for preserving or reloading the accumulator first in some of the primitives, or for setting up a flag to branch on, as in the comparisons listed above.
The answer is YES: using the TAY saves a lot of cycles at a cost of 2 cycles per word compiled, and it greatly reduces the code size. Win-win.
I should re-write this to say:
Fewer cycles are wasted than are gained in the word definitions that are called. Remember that the NEXT routine is not just used when compiling; it is also used at run-time. You may only need to compile once, but you may end up running some routines over and over again before re-compiling.
Therefore the 2 cycles lost at compile-time don't really add up, because they are only wasted once.
Then, just concentrating on the 2 cycles sometimes wasted at run-time, the alternative would be to put the TAY in each of the word definitions that require it.
Most of the words that require it are the more frequently used ones, so the percentage of waste drops dramatically. It therefore comes down to a trade-off: about a 10-15% increase in speed from having the TAY in each word definition that requires it, versus saving 1 byte in each of those words. So far, about half of the words would require it, so a Forth with 250 words would save 125 bytes.
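The byte count can be checked the same way. This is only a sketch under the assumptions stated above (250 words, about half of them testing TOS); the function name is hypothetical. It also shows the net figure once the single TAY added to NEXT is subtracted.

```python
def bytes_saved(total_words, fraction_testing_tos, tay_bytes=1):
    """Return (gross, net) bytes saved by moving TAY out of the words and into NEXT.

    gross: one TAY removed from every word that tests TOS.
    net:   gross minus the one TAY that now lives in NEXT.
    """
    words_with_tay = int(total_words * fraction_testing_tos)
    gross = words_with_tay * tay_bytes
    net = gross - tay_bytes
    return gross, net


print(bytes_saved(250, 0.5))  # (125, 124)
```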
But wait, there is more. 125 bytes is nothing to sneeze at, as those byte savings can also mean the difference between being able to use BRA NEXT and having to use JMP NEXT. One can fit a lot of code in the 127 bytes within branch range.
It is extremely hard to calculate the savings when a lot of words can be affected indirectly as well. But the fact is that 1 byte in NEXT allows one to pack a lot of code within branching range.
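For anyone who wants to check whether a given primitive can still reach NEXT with a BRA, here is a small sketch in Python. The addresses are invented for illustration. It relies only on the standard rule that BRA is a 2-byte instruction whose signed 8-bit offset is measured from the address of the following instruction, while JMP absolute takes 3 bytes.

```python
BRA_BYTES = 2  # opcode + signed 8-bit offset
JMP_BYTES = 3  # opcode + 16-bit absolute address


def bra_offset(branch_addr, target_addr):
    """Offset a BRA at branch_addr would need in order to land on target_addr."""
    return target_addr - (branch_addr + BRA_BYTES)


def bra_reaches(branch_addr, target_addr):
    """True if the target is within the -128..+127 range of a BRA."""
    return -128 <= bra_offset(branch_addr, target_addr) <= 127


NEXT = 0x8000  # hypothetical address of NEXT

# A BRA placed 126 bytes past NEXT can still branch back to it...
print(bra_reaches(0x807E, NEXT))  # True
# ...but one byte further and it has to become a JMP, costing an extra byte.
print(bra_reaches(0x807F, NEXT))  # False
```

Every byte trimmed from the words sitting between a BRA and NEXT pulls later words back toward that limit, which is where the indirect savings come from.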