Martin_H wrote:
@BillG, thanks for the pointer. I skimmed it, and a 15-cycle NEXT function is an impressive speed boost! Mine is much worse than that due to the self-imposed constraints of my programming challenge.
As an easy optimization I put NEXT into a macro and appended it to the end of each word rather than jumping to NEXT. That eliminated 3 cycles per NEXT but gave only a very modest improvement (at the cost of code density). It probably wasn't worth it compared to something like relocating NEXT to page zero.
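For anyone following along, here is a minimal sketch of what that trick looks like with a classic fig-Forth-style indirect-threaded NEXT; the ca65 macro syntax, the DONEXT name, and the IP/W zero-page pointers are my assumptions, not Martin_H's actual code:
Code:
; Classic indirect-threaded NEXT for the 6502, wrapped in a macro so it can
; be dropped in at the end of a primitive instead of JMP NEXT.
; Assumes IP and W are zero-page pointer pairs, and that the byte at W-1
; holds $6C (the JMP-indirect opcode) so that jumping to W-1 executes JMP (W).
.macro DONEXT
.local nocarry
        LDY #1
        LDA (IP),Y      ; high byte of the next word's code-field address
        STA W+1
        DEY
        LDA (IP),Y      ; low byte
        STA W
        CLC
        LDA IP          ; advance IP by one cell (two bytes)
        ADC #2
        STA IP
        BCC nocarry
        INC IP+1
nocarry:
        JMP W-1         ; runs JMP (W) via the opcode planted at W-1
.endmacro

A primitive then ends with the DONEXT expansion instead of JMP NEXT, which is exactly the trade described: the 3-cycle JMP goes away, but every word grows by the size of the expansion.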
Also note that fig-Forth was a very early implementation in the life of the Forth programming language. Implementations done later in the '80s reflect more accumulated experience, so many people will point to implementations like Bill Muench's eForth, http://www.forth.org/eforth.html, also worked on by Dr. C. H. Ting and others. eForth is optimized neither for speed nor for code size: it is optimized for system-development speed, built on a relatively small number of code words with the rest of the system written in Forth. The original eForth (and many later versions) was distributed as assembler source, similar to fig-Forth, while other versions are written to be cross-compiled using an already-running Forth implementation.
If NEXT is NOT in the ZP, it may make sense to "write in" NEXT for JUST the most frequently used words ... IMV, DROP, DUP, PLUS (+), FETCH (@), and STORE (!) would be a reasonable target list for a lot of Forth applications.
(1) If NEXT is not in the Dictionary, and the code for NEXT is NOT at the end of the EXIT routine (for instance, because there is an optimization that increments the IP as part of the EXIT process), it is common practice to write DROP with the NEXT code at its end and use the address of that copy as the JMP NEXT target for other words (see the sketch below). Once the system is up and running and you have some working Source to go on, a frequency count of the Forth words in the Source you will be using will quickly suggest other words where you might want to include the NEXT routine rather than a JMP NEXT.
(2) Even if NEXT is in the Dictionary in the form of NOP (that is, NOP's code is nothing but the NEXT routine), DROP is used a LOT more than NOP, so doing (1) and then just having "JMP NEXT" for the NOP code would be a simple speed optimization with very little code-space impact.
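A rough sketch of (1) and (2) together, reusing the DONEXT macro from above; the zero-page data stack indexed by X is the fig-Forth 6502 convention and is only an assumption about how Martin_H's stack works:
Code:
DROP:   INX             ; drop one 16-bit cell from a zero-page data stack
        INX             ; indexed by X (fig-Forth 6502 style; assumed here)
NEXT:                   ; the system's single copy of NEXT sits at the end of DROP
        DONEXT

; Every other primitive still just jumps to that copy, including the Forth
; word NOP, whose code is nothing but the jump (point 2). It is labeled
; NOOP here only to avoid clashing with the 6502 mnemonic.
NOOP:   JMP NEXT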
Quote:
But rather than pursue that I plan to rewrite into a subroutine threaded version to see how using the underlying hardware as the NEXT function affects performance.
Prediction: SRT (SubRoutine Threading) will be a MASSIVE boost compared to indirect threading and a SUBSTANTIAL boost compared to direct threading.
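For anyone who hasn't seen the comparison spelled out, a rough sketch of why (all word and label names here are illustrative):
Code:
; Indirect threaded: the body of a colon definition is a list of code-field
; addresses, and every step between words goes through NEXT -- roughly 40
; cycles for the classic routine sketched above.
;
; Subroutine threaded: the body is native code, and the 6502's own JSR/RTS
; is the inner interpreter -- 6 cycles in, 6 cycles out, no NEXT at all.
SQUARE: JSR DUP         ; : SQUARE ( n -- n*n )  DUP *  ;
        JSR STAR        ; STAR = the machine-code primitive for *
        RTS             ; EXIT is just the hardware return

It also sets up the optimizations mentioned below: short primitives like DUP can be inlined as native code, a trailing JSR/RTS pair can become a plain JMP, and so on.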
Also, a word of advice from an earlier attempt to implement a Forth that got bogged down: there are a LOT of optimizations for both speed and code density you can pull off with subroutine threading, but it makes a lot of sense to ignore them up front and just get the subroutine-threaded system working. See how its performance compares to what you would like, then build in the optimizations in order of what looks like the best "bang for the buck" in terms of payoff versus effort required. By the time you have finished a straightforward SRT system, you will have a MUCH better feel for how much effort the various possible optimizations will require.