It slipped my mind that BACK is a turtle graphics command so it isn't a good choice for a name. For now, I'm going with RERUN and ?PASS .
The actions of these two words can, at times, be duplicated by other means. Sometimes, those means aren't pretty. Here is an example of how these words can make the source clearer.
Back before my Forth's current version of INTERPRET , I noticed in some cases INTERPRET had a BRANCH to other BRANCHes before finally branching to the beginning. I tried to rewrite the interpreter loop to eliminate the extra trips through NEXT . It looked something like this:
: INTERPRET ( -- )
BEGIN
BEGIN
BEGIN
BEGIN
BEGIN
BEGIN
PAUSE
NAME FIND ?DUP
WHILE
STATE @ =
WHILE
,
REPEAT CS-SWAP
EXECUTE ?STACK
REPEAT
NUMBER? ?HUH ?STACK
DPL @ 0<
WHILE
DROP STATE @
UNTIL CS-SWAP
[COMPILE] LITERAL
REPEAT
STATE @
UNTIL
SWAP
[COMPILE] LITERAL
[COMPILE] LITERAL
AGAIN ; -2 ALLOT
This wasn't pretty and I didn't like it. If I'd thought of them, RERUN and ?PASS , would have made this easier. This is a cleaner version which compiles to exactly the same thing.
: INTERPRET ( -- )
PAUSE
NAME FIND ?DUP
IF
STATE @ =
IF , RERUN THEN
EXECUTE ?STACK RERUN
THEN
NUMBER? ?HUH ?STACK
DPL @ 0<
IF
DROP STATE @ ?PASS
[COMPILE] LITERAL RERUN
THEN
STATE @ ?PASS
SWAP
[COMPILE] LITERAL
[COMPILE] LITERAL
RERUN ; -2 ALLOT
: INTERPRET ( -- )
BEGIN
PAUSE
NAME FIND ?DUP
IF
STATE @ =
IF , RERUN THEN
EXECUTE ?STACK RERUN
THEN
NUMBER? ?HUH ?STACK
DPL @ 0<
IF
DROP STATE @ ?PASS
[COMPILE] LITERAL RERUN
THEN
STATE @ ?PASS
SWAP
[COMPILE] LITERAL
[COMPILE] LITERAL
AGAIN ; -2 ALLOT
As can be seen in these examples, the results of RERUN and ?PASS can, at times, be difficult to replicate with other control flow words.
Here is a smaller version of RERUN and ?PASS .
Newb question, Jim. I have seen IMMEDIATE directly after a ; but not -2 ALLOT or anything else in between. Could you enlighten me with a brief description of how and why your compiler handles this situation, if it's not too much off-topic?
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Newb question, Jim. I have seen IMMEDIATE directly after a ; but not -2 ALLOT or anything else in between. Could you enlighten me with a brief description of how and why your compiler handles this situation, if it's not too much off-topic?
-2 ALLOT just removes the unnest (compiled by the semicolon) which is unnecessary if it'll never get executed because it is preceded by something like a branch. The semicolon does a couple of other things too at compile time though, which is why we don't just replace it with a [.
Okay, I got the feeling he was trimming something off the end, but I wasn't sure why. Two bytes saved is two bytes earned! So, the ; does a bit more dictionary housekeeping than the [ (which just zeroes STATE )?
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Yes, -2 ALLOT trims two bytes off the end because the EXIT will never be reached.
Blazin' Forth had ?CSP . It works by checking that the stack depth is the same as that saved by colon.
My Forth works a little differently. Colon, and CODE push the address to unsmudge and TRUE on the control flow stack (in my case, the data stack) .
Semicolon, and END-CODE check to see if the top stack item is true and use the address to unsmudge the name. I went with this approach because I was already thinking of :NONAME or something like it. In the case where there is no name, semicolon is passed a dummy address which can be harmlessly manipulated.
Fleet Forth's semicolon.
There is a name I've seen elsewhere I could add to my kernel, -; (dash-semicolon). It would do everything semicolon does except compile EXIT , the Forth-83 Standard name for unnest. Semicolon would compile EXIT and call dash-semicolon.
Dash-semicolon doesn't compile anything so ?COMP is there to make sure state is compiling.
I had thought about adding dash-semicolon to clarify things and make this technique portable, but I thought I would just be giving back the bytes I saved. The kernel size would increase by thirteen bytes; however, I just checked and I use this technique twenty five times just in the kernel.
[Edit: corrected a typo.]
Last edited by JimBoyd on Wed Oct 19, 2022 11:21 pm, edited 1 time in total.
<TEST0>
IF <ACTION0> EXIT THEN
<TEST1>
IF <ACTION1> EXIT THEN
<TEST2>
IF <ACTION2> EXIT THEN
<TEST3>
IF <ACTION3> THEN ;
EXIT is the Forth-83 Standard word to unnest a high level word.
If the series of tests and actions are in the middle of a word, I prefer to factor the test out as a separate word.
In some situations, a CASE structure is appropriate; and with of as a primitive, this structure becomes a lot more efficient than a series of IF...ELSE...IF...ELSEs.
Otherwise, I often like to arrange the nested IFs as:
CONDITIONS IF
ACTIONS ACTIONS
ACTIONS ACTIONS ELSE
CONDITIONS IF
ACTIONS ACTIONS
ACTIONS ACTIONS ELSE
CONDITIONS IF
ACTIONS ACTIONS
ACTIONS ACTIONS ELSE
ACTIONS THEN THEN THEN
(and what's right before one or more of the ELSEs might be an EXIT). Of course what's most readable will depend on the situation.
I agree that the appropriate method depends on the situation. A trade off between size and efficiency is also a factor. EXIT and the conditional exits, ?EXIT and 0EXIT are also useful outside the context of multiple nested IF ELSE THEN structures. Here is a headerless word used by Fleet Forth's T&S41 , a word to map blocks to the initial track and sector for the 1541 and 1571 drives as well as their dual disk counterparts.
NH
: (T&S41) ( BLK# -- BLK# S/T )
21 OVER 88 < ?EXIT
2- OVER 120 < ?EXIT
1- OVER 146 < ?EXIT
1- ;
These drives do not have the same number of sectors for each track. This small word takes a block number and returns the block number as well as the number of sectors per track for the starting sector of that block.
Writing this fragment without ?EXIT and including it as part of the word T&S41 would not have been as clean.
This post concerns a technique used on Forth systems with indirect threaded code (ITC). It may work with other threading models.
Although the example from M. L. Gassanenko's paper "Dynamically Structured Codes" wasn't really self modifying code, it certainly is not your run of the mill control flow.
Just as EXECUTE takes the address of a word's Code Field and launches that word, ENTER takes the address of a fragment of threaded code and 'launches' that fragment. The address must align on what Gassanenko calls an active 'threaded code element', or a TCE. An active TCE is an address which holds the CFA of a Forth word. The 'threaded code fragment' must also exit (or branch somewhere which exits or aborts). Consider the word ?COMP . In my Forth it is this:
?COMP
7999 2130 STATE
8001 3561 @
8003 4602 0=
8005 7822 (ABORT") FOR COMPILING
8021 2970 EXIT
The body of this word, or some portion thereof, is a threaded code fragment. All the addresses on the left are an address in the threaded code fragment with an active TCE. Another way to obtain the address of a threaded code fragment which points to an active TCE is with R@ .
Here is an example from Gassanenko's paper. The example requires the data stack to initially be empty.
: ENTER >R ;
: EL R@ ENTER DROP ;
: .<[]>
CR ." <[ " DEPTH 0
?DO
I PICK COUNT TYPE SPACE
LOOP
." ]> " ;
: SUBSETS
" FIRST" EL " SECOND" EL
" THIRD" EL .<[]> ;
: ENTER >R ;
0 VALUE ODEPTH
: EL R@ ENTER DROP ;
: .<[]>
CR ." <[ " DEPTH ODEPTH - 0
?DO
I PICK 4 U.R SPACE
RP@ $17F < ABORT" RS LIMIT"
LOOP
." ]> " ;
: SUBSETS ( -- ) DEPTH TO ODEPTH
1 EL 2 EL 3 EL 4 EL 5 EL 6 EL
7 EL
.<[]> ;
The value ODEPTH is used so the data stack does not need to be empty. Here is the test run.
Here is an example of mine to show what is happening. FORK leaves a flag on the data stack, TRUE if this is the first time this fragment runs or FALSE if it is the second.
Here is another example of mine. It uses the word FORK from the previous example. The word GATE1 represents a logic gate, or circuit, with six inputs and one output. The first two inputs are exclusively ORed together, as are the third and forth, as well as the fifth and sixth. The three results are ORed together to derive the output. GATE1 takes either one parameter or three. If the top of stack is true, GATE1 leaves its output on the stack. If the top of stack is false, the next two parameters are used to set one of the inputs.
On a sixteen bit Forth, ENTER has a body of four bytes and a code field of two bytes. The header is eight bytes, with full length names, for a total size of fourteen bytes.
I think this word has potential. Given its small size I'm inclined to include it in my Forth's system loader, unless I see a clever way to use it in the Forth kernel.
Here are more interesting uses of ENTER .
My Forth has a tree like vocabulary structure with the FORTH vocabulary as the root. A vocabulary, other than FORTH , can be forgotten, like most words. one of the words to help manage the vocabularies is VOCS . It shows the vocabulary tree structure. Here is what VOCS shows when the metacompiler is loaded:
VOCS
FORTH
META
SHADOW
FORTH
ASSEMBLER
EDITOR
ASSEMBLER OK
Yes, there are two vocabularies named FORTH and two named ASSEMBLER . VOCS places the PFA of FORTH and a zero (for the initial indentation level) on the stack. It would then call a nameless recursive word.
: VOCS ( -- )
[ ' FORTH >BODY ] LITERAL 0
[ <MARK ]
?STACK
DUP>R CR SPACES
DUP BODY> >NAME ID. VOC-LINK @
BEGIN
2DUP 2- @ =
IF
DUP 2- 2- R@ 2+
LIT CS-ROT [ <RESOLVE ] ENTER
THEN
@ ?DUP 0=
UNTIL
R> 2DROP ;
LIT is the primitive compiled by LITERAL . Note that <MARK and <RESOLVE are used to compile an address for LIT which is used by ENTER .
Here is the dis-assembly.
ENTER does not cause a branch back to the address 31275, where ?STACK is located. This use of ENTER is like recursion but without the need to start at the beginning of a word.
Here is another example. ORDER shows the CONTEXT vocabulary and its parents. This will be the search order. Since there can be multiple vocabularies with the same name (again, metacompiling) it also shows the CURRENT vocabulary and its parents. The following example is when the metacompiler is loaded. The top FORTH vocabulary is the target FORTH vocabulary.
[BEGIN] and [UNTIL] revisited and a new look at loading multiple blocks.
The versions of [BEGIN] and [UNTIL] presented here will only work on a single screen of Forth source in a block based system. That single screen can have a LOAD or THRU between [BEGIN] and [UNTIL] ; however, that could be somewhat inconvenient.
Here are versions which can span multiple blocks.
300 RAM 302 RAM THRU 16684
NUMBER ON THE STACK: 10 16685 16686
NUMBER ON THE STACK: 9 OK
THRU displays the number of each block it is about to load: 16684, 16685 and 16686. Notice that the [BEGIN][UNTIL] structure almost worked. When block 16686 (ram block 302) was loaded, [UNTIL] worked perfectly and caused ram block 300 to load just after [BEGIN] instead of the rest of ram block 302; however, THRU exits because ram block 302 was the last one to load. Even if ram block 302 wasn't the last one to load, THRU would have proceeded to load ram block 303 instead of ram block 301 as desired. This THRU is implemented with a DO LOOP and uses the loop index to determine the number of the next block to load.
Here is a word from the uncontrolled reference words in the Forth-83 Standard which will make a multi block spanning [BEGIN][UNTIL] structure possible.
--> causes the next higher block to load, but it is incompatible with THRU , at least this THRU .
Modifying the source slightly by appending --> to the end of each block will allow [BEGIN] and [UNTIL] to work across multiple blocks, but the first block must be LOADed. THRU can not be used!
200 RAM LOAD
NUMBER ON THE STACK: 10
NUMBER ON THE STACK: 9
NUMBER ON THE STACK: 8
NUMBER ON THE STACK: 7
NUMBER ON THE STACK: 6
NUMBER ON THE STACK: 5
NUMBER ON THE STACK: 4
NUMBER ON THE STACK: 3
NUMBER ON THE STACK: 2
NUMBER ON THE STACK: 1
FINISHED OK
I normally don't use --> and didn't define it until just moments ago. I'm not fond of the need to be extra cautious to not mix --> and THRU .
I can use a feature of my system to rewrite THRU .
My Forth's WORD saves the history of the text stream (the values of BLK and >IN) for use with WHERE , one of the error handling words.
: THRU ( LO HI -- )
>R 1- HISTORY !
BEGIN
5 ?CR
HISTORY @ 1+
DUP U. LOAD
HISTORY @ R@ U< 0=
DONE? OR
UNTIL
R> DROP ;
Under normal circumstances this THRU does exactly what the original does.
The last block to load is pushed to the return stack. The number of the first block is decremented and stored in the first cell of HISTORY . The line with ?CR is just for 'pretty printing' and can be ignored in this explanation. The value of HISTORY is fetched and incremented by one. This will be the first block to load the first time through the loop. Once the block loads, the first cell of HISTORY contains the number of the block which just finished loading, even if that block is the last in a chain of blocks linked by --> . This block number is compared to the number saved on the return stack. The loop keeps going as long as the block just loaded is less than the last one to be loaded.
This version of THRU can safely be used with --> . Indeed, they use a similar mechanism to advance to the next block. BLK contains the number of a block while it's loading. The first cell of HISTORY contains the number of the block which has just finished loading. More accurately, HISTORY holds the location of the last text string parsed by WORD wherever that string was.
Used in a block like this:
The number of the block currently loading will be displayed.
Here are some blocks of source where --> is only used in one block and THRU is used to load the range of blocks.
200 RAM LIST
SCR# 16584 8 200
0:
1:
2:
3:
4:
5:
6:
7: [BEGIN]
8: CR .( NUMBER ON THE STACK: )
9: DUP .
A:
B: .( BLK# R200) CR
C:
D:
E:
F: -->
OK
0 FH 2 FH THRU 16584
NUMBER ON THE STACK: 9 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 8 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 7 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 6 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 5 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 4 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 3 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 2 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 1 BLK# R200
BLK# R201
16586
BLK# R202
FINISHED OK
And success! --> was successfully used within a range of blocks loaded by THRU . All it did was advance the loading in place of the mechanism used by this new THRU . Since this new THRU gets the number of the latest block loaded from HISTORY , --> will not cause it to unintentionally load blocks multiple times.
SCR# 200
[BEGIN]
CR .( NUMBER ON THE STACK: )
DUP .
.( BLK# R200) CR
-->
SCR# 201
1-
CR .( BLK# R201) CR
SCR# 202
CR .( BLK# R202) CR
?DUP 0=
[UNTIL]
CR .( FINISHED)
-->
SCR# 203
CR .( LET'S PULL A FAST ONE ON THRU)
CR .( AND SEE IF IT RUNS WILD!!!)
CR .( RAM BLK # 203)
-->
SCR# 204
CR .( LET'S PULL A FAST ONE ON THRU)
CR .( AND SEE IF IT RUNS WILD!!!)
CR .( RAM BLK #204)
ram blocks 202, 203 and 204 have been chained together with --> .
Here is the session log.
200 RAM 202 RAM THRU 16584
NUMBER ON THE STACK: 10 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 9 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 8 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 7 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 6 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 5 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 4 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 3 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 2 BLK# R200
BLK# R201
16586
BLK# R202
NUMBER ON THE STACK: 1 BLK# R200
BLK# R201
16586
BLK# R202
FINISHED
LET'S PULL A FAST ONE ON THRU
AND SEE IF IT RUNS WILD!!!
RAM BLK # 203
LET'S PULL A FAST ONE ON THRU
AND SEE IF IT RUNS WILD!!!
RAM BLK #204 OK
has the same effect.
I honestly don't know if anyone has implemented THRU without a DO LOOP . This version of THRU is compatible with --> and the versions of [BEGIN] and [UNTIL] capable of spanning multiple blocks.
It is important when writing a new version of THRU that it not place anything on the data stack which could get in the way of parameters left from a previous screen of source. A definition spanning more than one block would need the data stack to be unaffected by THRU .
As an example, here is the source stored in 4 blocks in the C64's REU.
: THRU ( U1 U2 -- )
1+ SWAP
DO
5 ?CR
I DUP U. LOAD DONE? ?LEAVE
LOOP ;
Although on my Forth, I took advantage of how DO LOOP's work to make the version with a DO LOOP two bytes smaller.
The original THRU uses a DO LOOP so the loop parameters are on the return stack when LOAD executes. The new version only places one parameter on the return stack which is good when THRU is nested multiple times.
Another advantage of this new implementation of THRU has to do with testing modifications to source. If a change is made to a word the following words can be compiled and tested one screen (Forth block with source) at a time. If the source for a word happens to span multiple screens then they should not be loaded one at a time. Such screens can be linked together by placing --> at the bottom of all but the last screen for a word with source spanning multiple screens.
--> -- I,M,79 "next-block"
-- (compilation)
Continue interpretation on the next sequential block. May
be used within a colon definition that crosses a block
boundary.
As I mentioned in my last post, the worst effect of mixing --> with the new implementation of THRU is causing the rest of the screens linked by --> to be loaded.
If it is not feasible or desirable to modify WORD to save the value of BLK , and optionally >IN , to a variable such as HISTORY , LOAD can be redefined to save BLK to a HISTORY variable just before the previous values of BLK and >IN are pulled from the return stack and restored.
A generic LOAD
I suppose an implementation of THRU which is compatible with --> isn't exactly run of the mill. Whether it is or not, is that compatibility worthwhile?