I have mentioned multi code field words on this forum. While the example I used worked, it left much to be desired.
A multi code field word is a word with more than one code field. Multi code field words typically have three code fields in the examples I've seen. In the following, I will be discussing multi code field words on an ITC (indirect threaded code) Forth system.
The Forth system is not aware of any code field but the first; the other two are seen as data.
The fields of a dictionary entry for my Forth:
Code:
LINK FIELD
NAME FIELD
CODE FIELD
PARAMETER FIELD
The fields of a dictionary entry for a multi code field word for my Forth:
Code:
LINK FIELD
NAME FIELD
CODE FIELD 1 -- also the primary code field
CODE FIELD 2
CODE FIELD 3
PARAMETER FIELD
How the Forth system sees the fields of a multi code field word:
Code:
LINK FIELD
NAME FIELD
CODE FIELD 1
PARAMETER FIELD DATA -- one cell
PARAMETER FIELD DATA -- one cell
REST OF PARAMETER FIELD
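As a rough illustration of the layouts above, here is a model in standard Forth. It is only a sketch: the extra "code fields" are ordinary execution tokens stored in the body, not real ITC code fields, and MVALUE , MTO , MAT , STORE-ACT and ADDR-ACT are hypothetical names that do not appear in my system. Each action receives the address of the cell after its own field, which mirrors what a DOES> clause would see.

```forth
\ Model only: xts stored in the body stand in for code fields 2 and 3.
: STORE-ACT ( n adr -- )    CELL+ ! ;   \ adr is the cell after field 2; skip field 3
: ADDR-ACT  ( adr -- adr )  ;           \ adr is already the data cell; nothing to skip

: MVALUE ( n "name" -- )
  CREATE
    ['] STORE-ACT ,                  \ stands in for code field 2
    ['] ADDR-ACT ,                   \ stands in for code field 3
    ,                                \ the actual parameter field (one cell)
  DOES> ( -- n )  2 CELLS + @ ;      \ primary action skips both extra fields

\ Interpret-only prefixes: pass the address after the field, then run its xt.
: MTO ( n "name" -- )    ' >BODY        DUP CELL+ SWAP @ EXECUTE ;
: MAT ( "name" -- adr )  ' >BODY CELL+  DUP CELL+ SWAP @ EXECUTE ;

42 MVALUE LIFE
512 MTO LIFE
LIFE .          \ prints 512
MAT LIFE @ .    \ prints 512
```

The system only knows about the DOES> action; the two stored cells are plain data until MTO or MAT picks one up and executes it.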
The first code field points to code which is normally written to skip over the next two code fields and access the body of the multi code field word. This is the code field executed or compiled by the Forth system.
The second code field of a multi code field word can be accessed by using >BODY . Suppose there is a word named ARGO which has three code fields and the second code field stores a number from the stack to the parameter field of ARGO . This is how the second code field would be executed or compiled (assuming a number is on the stack).
Code:
\ Executing.
' ARGO >BODY EXECUTE
\ Or when compiling.
[ ' ARGO >BODY , ]
The third code field can be accessed by using >BODY twice.
Code:
\ Executing.
' ARGO >BODY >BODY EXECUTE
\ Or when compiling.
[ ' ARGO >BODY >BODY , ]
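Judging by the traces later in this post, where LIFE , LIFE +2 and LIFE +4 are consecutive cells, each >BODY on this system steps forward one cell. Under that assumption, reaching the nth code field is plain address arithmetic. NTH-CFA is a hypothetical helper, shown only to make the stepping explicit.

```forth
\ Hypothetical helper, not part of the original code. Assumes each code
\ field sits exactly one cell after the previous one.
: NTH-CFA ( cfa n -- cfa' )  1- CELLS + ;   \ n = 1 is the primary code field

\ On the system described here it would be used as, for example:
\   ' ARGO 2 NTH-CFA EXECUTE    \ same as  ' ARGO >BODY EXECUTE
\   ' ARGO 3 NTH-CFA EXECUTE    \ same as  ' ARGO >BODY >BODY EXECUTE
```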
There are usually prefix words to access the second and third code fields of a multi code field word. TO is normally the name of the word to access the second code field. Depending on the value of STATE , TO will either execute the second code field or compile the second code field.
The second code field points to code which is normally written to skip over the third code field and access the body of the multi code field word.
Code:
\ USED FOR INTERPRETING OR COMPILING.
TO ARGO
AT is normally the name of the word to access the third code field of a multi code field word.
Code:
\ Used for interpreting or compiling.
AT ARGO
With nothing to skip over, the third code field normally behaves more like a typical code field.
TO and AT are immediate words and their definitions (without error checking) are straightforward.
Code:
: TO ( ++ )
' >BODY STATE @
IF , EXIT THEN
EXECUTE ; IMMEDIATE
: AT ( ++ )
' >BODY >BODY STATE @
IF , EXIT THEN
EXECUTE ; IMMEDIATE
AT can be shortened.
Code:
: AT ( ++ )
' >BODY
BRANCH [ ' TO >BODY CELL+ , ] -; IMMEDIATE
Writing VALUE as a word to create multi code field words.
The code for the extra code fields was low level code in the example I presented in a previous post. For this example I'd like to write the code for the code fields in high level Forth. In other words, I would like to use DOES> clauses instead of ;CODE clauses, at least for this example.
Rather than writing something like the following:
Code:
SUBR CFA2 \ CREATE A SUBROUTINE.
' FORTH @ 1+ @ JMP \ ASSEMBLE A JUMP TO DO.DOES>
] >BODY ! EXIT [ END-CODE
SUBR CFA3 \ CREATE A SUBROUTINE.
' FORTH @ 1+ @ JMP \ ASSEMBLE A JUMP TO DO.DOES>
] EXIT [ END-CODE
: VALUE
CREATE ( N -- )
CFA2 , ( N -- ) \ LAY DOWN SECOND CODE FIELD.
CFA3 , ( -- ADR ) \ LAY DOWN THIRD CODE FIELD.
, \ LAY DOWN PARAMETER FIELD (ONE CELL).
DOES> ( -- N ) \ SET FIRST CODE FIELD TO THIS.
>BODY >BODY @ ;
I'd like to be able to write something like the following where the first DOES> clause compiles the code pointed to by the first code field, the second DOES> clause compiles the code pointed to by the second code field and so on.
Code:
: VALUE
CREATE ( N -- )
,
DOES> ( -- N )
>BODY >BODY @ ;
DOES> ( N -- ) \ This is never executed.
>BODY ! ;
DOES> ( -- ADR )
;
But this will not work.
With a few new words, a version of VALUE can be written which looks close.
Code:
: VALUE
CREATE ( N -- )
,
+CFA +CFA \ Two extra code fields.
DOES> \ The first code field points here.
>BODY >BODY @ ;; \ Definition is not complete.
DOES> \ The second code field points here.
>BODY ! ;; \ Definition is still not complete.
DOES> \ The third code field points here.
; \ Now it's complete.
Notice the first and second DOES> clauses end with a double semicolon, ;; , but not the last DOES> clause.
;CODE clauses are also supported.
Code:
: VALUE
CREATE ( N -- )
,
+CFA +CFA
;CODE ( -- N )
6 # LDY
' BL @ CELL+ JMP END-CODE; \ Not done.
;CODE ( N2 -- )
4 # LDY
0 ,X LDA W )Y STA INY
1 ,X LDA W )Y STA
POP JMP END-CODE; \ Not done.
;CODE ( -- ADR )
' FENCE @ JMP
END-CODE \ Now it's done.
The first and second ;CODE clauses end with END-CODE; ( END-CODE semicolon), but not the last ;CODE clause.
DOES> and ;CODE clauses can even be mixed in the same parent word.
Code:
: VALUE
CREATE ( N -- )
,
+CFA +CFA \ Two extra code fields.
DOES> \ First code field does this.
>BODY >BODY @ ;; \ Definition is not complete.
DOES> \ Second code field does this.
>BODY ! ;; \ Definition is still not complete.
;CODE ( -- ADR ) \ Last code field does this.
' FENCE @ JMP
END-CODE \ Definition is now complete.
All DOES> clauses which are not for the last code field must end with ;; (double semicolon) rather than ; (semicolon) and all ;CODE clauses which are not for the last code field must end with END-CODE; ( END-CODE semicolon) rather than END-CODE .
Now for the implementation.
Code:
: (+CFA) ( -- )
LATEST R> DUP CELL+ >R @ 2DUP U<
ABORT" NEED CREATE"
ENTER
NAME> DUP >BODY
HERE 2PICK - CMOVE> CELL ALLOT ;
(+CFA) has error checking to make sure a parent word does not try to modify itself.
This version without the error checking is presented for clarity.
Code:
: (+CFA) ( -- )
R> DUP CELL+ >R @ ENTER \ Get inline address and enter it.
LATEST NAME> DUP >BODY \ move data between CFA and HERE
HERE 2PICK - CMOVE> CELL ALLOT ; \ up a cell and ALLOT.
(+CFA) has an inline address used by ENTER . ENTER is almost like a Forth computed GOSUB in that it will take the address of a threaded code fragment and cause program flow to continue in that fragment until an EXIT is encountered, then program flow resumes after ENTER .
R> DUP CELL+ >R @ copies the inline address to the data stack and increments the return address past this inline address. ENTER uses this address.
The rest of (+CFA) moves all data up one cell from the code field up to HERE and allots one cell.
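Because the source and destination of that move overlap and the destination is higher in memory, the copy has to start at the high end, which is why CMOVE> is used rather than CMOVE . A small standard Forth demonstration follows; BUF and the helper words are hypothetical, and the shift is two bytes to mimic a 16-bit cell.

```forth
\ Demo only: shift 4 bytes up by 2 inside a 6-byte buffer.
CREATE BUF  1 C, 2 C, 3 C, 4 C,  0 C, 0 C,     \ 4 data bytes, 2 spare
: RESET   ( -- )  4 0 DO  I 1+  BUF I + C!  LOOP ;   \ restore 1 2 3 4
: UP-SAFE ( -- )  BUF  BUF 2 +  4 CMOVE> ;     \ copies the highest byte first
: UP-BAD  ( -- )  BUF  BUF 2 +  4 CMOVE  ;     \ copies the lowest byte first
: SHOW    ( -- )  6 0 DO  BUF I + C@ .  LOOP ;

UP-SAFE SHOW          \ 1 2 1 2 3 4   data arrives intact, 2 bytes higher
RESET UP-BAD SHOW     \ 1 2 1 2 1 2   source overwritten before it was read
```

After the safe move the two low bytes still hold their old contents, just as the original code field cell does after (+CFA) has shifted everything up.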
My Forth does not check the data stack depth when defining a colon or code word. Colon and CODE place the address of the name to unsmudge and the security value of TRUE on the data stack. Semicolon and END-CODE check the security value and unsmudge the name at the address on the stack. I feel this is in keeping with the Forth-83 Standard. This wasn't so much my conforming to the Forth-83 Standard as it was my discovering that the standard specified something I was going to do anyway.
Code:
: -- sys M,79 "colon"
A defining word executed in the form:
: <name> ... ;
Code:
; -- C,I,79 "semi-colon"
sys -- (compiling)
DOES> and ;CODE check that the security value is TRUE but do not disturb the stack.
Code:
: +CFA ( -- )
( SYS1 -- SYS2 SYS1 ) ( COMPILING)
COMPILE (+CFA) >MARK
[COMPILE] CS-SWAP ; IMMEDIATE
+CFA compiles the runtime word, (+CFA) , and uses >MARK to reserve one cell and leave a reference to resolve. Since DOES> and ;CODE check the security value for colon's sys, CS-SWAP is used to move +CFA's control flow data under colon's.
Code:
: ;; ( -- )
( SYS2 SYS1 -- SYS1 ) ( COMPILING)
COMPILE EXIT
[COMPILE] CS-SWAP >RESOLVE ; IMMEDIATE
;; (double semicolon) compiles EXIT then swaps the control flow data and resolves the forward reference left by +CFA . STATE is still compiling.
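The forward reference works like any mark-and-resolve pair: one cell is reserved now and patched later with the address where compilation has arrived. The sketch below shows the idea in standard Forth. MARK> and RESOLVE> are hypothetical stand-ins named differently from >MARK and >RESOLVE on purpose, and storing an absolute address is an assumption based on the absolute inline addresses that appear after (+CFA) in the decompilation of VALUE.

```forth
\ Stand-ins for the mark/resolve pair; absolute addressing is assumed.
: MARK>    ( -- adr )  HERE 0 , ;       \ reserve one cell, remember where it is
: RESOLVE> ( adr -- )  HERE SWAP ! ;    \ patch that cell with the current HERE

CREATE DEMO  MARK>  10 , 20 ,  RESOLVE>   \ two cells laid down in between

DEMO @  DEMO 3 CELLS +  = .    \ prints -1 (true): the cell points past the data
```

In +CFA the reserved cell is the inline address read by (+CFA) , and the HERE stored into it is the start of the next DOES> or ;CODE clause.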
Code:
: END-CODE; ( -- )
( SYS2 SYS1 -- SYS1 ) ( COMPILING)
CURRENT @ CONTEXT ! ]
[COMPILE] CS-SWAP >RESOLVE ; IMMEDIATE
END-CODE; sets the CONTEXT vocabulary to be the same as the CURRENT vocabulary and switches on compiling. It swaps the control flow data and resolves the forward reference left by +CFA .
Here is a disassembly/decompilation of the version of VALUE with two DOES> clauses and one ;CODE clause.
Code:
SEE VALUE
VALUE
24371 8970 CREATE
24373 8254 ,
24375 24187 (+CFA)
24377 24407 VALUE +38
24379 24187 (+CFA)
24381 24396 VALUE +27
24383 7252 DOES
24385 9524 JMP ' USER >BODY 21 +
24388 6635 >BODY
24390 6635 >BODY
24392 3567 @
24394 2471 EXIT
25
OK
EAD :DIS
24396 7252 DOES
24398 9524 JMP ' USER >BODY 21 +
24401 6635 >BODY
24403 2080 !
24405 2471 EXIT
11
OK
EAD :DIS
24407 7252 DOES
24409 9132 JMP ' CREATE >BODY 160 +
5
OK
Without a forward branch, my current version of SEE stops when an EXIT is encountered. EAD in this instance returns the address just after EXIT and :DIS takes that address to decompile threaded code.
My current version of SEE does not know about (+CFA) .
An improved SEE which did know about (+CFA)'s inline address and kept going until the end of the definition would show this:
Code:
SEE VALUE
VALUE
24371 8970 CREATE
24373 8254 ,
24375 24187 (+CFA) 24407
24379 24187 (+CFA) 24396
24383 7252 DOES
24385 9524 JMP ' USER >BODY 21 +
24388 6635 >BODY
24390 6635 >BODY
24392 3567 @
24394 2471 EXIT
24396 7252 DOES
24398 9524 JMP ' USER >BODY 21 +
24401 6635 >BODY
24403 2080 !
24405 2471 EXIT
24407 7252 DOES
24409 9132 JMP ' CREATE >BODY 160 +
41
OK
The first (+CFA) causes the Forth threaded code fragment at address 24407 to be entered, then moves the code field and parameter field up by one cell. Execution resumes at the second (+CFA) which causes the Forth threaded code fragment at address 24396 to be entered, then moves the code fields and parameter field up by one cell. Execution resumes at the first DOES .
Here is a trace of VALUE as the VALUE LIFE is defined.
Code:
TRACE VALUE OK
42 VALUE LIFE
CREATE 42
, 42
(+CFA) EMPTY
DOES 24416
(+CFA) EMPTY
DOES 24416
DOES EMPTY OK
To reiterate, the first (+CFA) causes the last DOES to execute. The second (+CFA) causes the second DOES to execute. Finally, the first DOES is executed.
A word to test life:
Code:
: TEST.LIFE ( -- )
512 TO LIFE
CR LIFE .
CR AT LIFE U. ;
Now to SEE TEST.LIFE
Code:
SEE TEST.LIFE
TEST.LIFE
24445 3264 LIT 512
24449 24423 LIFE +2
24451 6785 CR
24453 24421 LIFE
24455 7488 .
24457 6785 CR
24459 24425 LIFE +4
24461 7501 U.
24463 2471 EXIT
20
OK
and execute it while VALUE is still being traced.
Code:
TEST.LIFE
>BODY 512 24425
! 512 24427
EXIT EMPTY
>BODY 24423
>BODY 24425
@ 24427
EXIT 512 512
24427 OK
and now to execute TEST.LIFE while TEST.LIFE is traced.
Code:
TRACE TEST.LIFE OK
TEST.LIFE
LIT EMPTY
LIFE +2 512
CR EMPTY
LIFE EMPTY
. 512 512
CR EMPTY
LIFE +4 EMPTY
U. 24427 24427
EXIT EMPTY OK
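By placing the +CFA words just before the first DOES> , it is possible to write a version of TO and AT with security. As the decompilation of VALUE suggests, the first code field of a child word points at the JMP that follows the first DOES , and the (+CFA) compiled by the last +CFA sits 6 bytes below that address (24385 - 6 = 24379 in that listing). MCFA? fetches that cell and aborts if it is not (+CFA) .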
Code:
: MCFA? ( ++ CFA )
' DUP @ 6 - @ ['] (+CFA) <>
ABORT" NOT A MULTI CFA WORD" ;
: TO ( ++ )
MCFA?
>BODY STATE @
IF , EXIT THEN
EXECUTE ; IMMEDIATE
: AT ( ++ )
MCFA? >BODY
BRANCH [ ' TO >BODY CELL+ , ] -; IMMEDIATE
I've shown examples with three code fields, but there could be more or there could be only two. There needs to be one +CFA for every additional code field.
This is the general form with four code fields:
Code:
: <PARENT.NAME>
CREATE
<COMPILE OR ALLOT DATA FOR ACTUAL PARAMETER FIELD>
+CFA +CFA +CFA
DOES>
<ACTION FOR FIRST CODE FIELD> ;;
DOES>
<ACTION FOR SECOND CODE FIELD> ;;
DOES>
<ACTION FOR THIRD CODE FIELD> ;;
DOES>
<ACTION FOR LAST CODE FIELD> ; \ ordinary semicolon.
Notice that there are four DOES> clauses but only three instances of +CFA . The first three DOES> clauses end with double semicolon but the last ends with the ordinary semicolon.
In this post I presented one way to define parent words of multi code field words. This is by no means the only way.