All times are UTC




 Post subject: Multi Code Field Words
PostPosted: Sun May 26, 2024 9:11 pm 
Joined: Fri May 05, 2017 9:27 pm
Posts: 866

I have mentioned multi code field words on this forum. While the example I used worked, it left much to be desired.
A multi code field word is a word with more than one code field. The examples I've seen typically have three code fields. In the following, I will be discussing multi code field words on an ITC Forth system.
The Forth system is not aware of any code field but the first; the other two are seen as data.
The fields of a dictionary entry for my Forth:
Code:
LINK FIELD
NAME FIELD
CODE FIELD
PARAMETER FIELD

The fields of a dictionary entry for a multi code field word for my Forth:
Code:
LINK FIELD
NAME FIELD
CODE FIELD 1 -- also the primary code field
CODE FIELD 2
CODE FIELD 3
PARAMETER FIELD

How the Forth system sees the fields of a multi code field word:
Code:
LINK FIELD
NAME FIELD
CODE FIELD 1
PARAMETER FIELD DATA -- one cell
PARAMETER FIELD DATA -- one cell
REST OF PARAMETER FIELD

The first code field points to code which is normally written to skip over the next two code fields and access the body of the multi code field word. This is the code field executed or compiled by the Forth system.
The second code field of a multi code field word can be accessed by using >BODY . Suppose there is a word named ARGO which has three code fields and the second code field stores a number from the stack to the parameter field of ARGO . This is how the second code field would be executed or compiled (assuming a number is on the stack).
Code:
\ Executing.
' ARGO >BODY EXECUTE
\ Or when compiling.
[ ' ARGO >BODY , ]

and the third code field can be accessed by using >BODY twice.
Code:
\ Executing.
' ARGO >BODY >BODY EXECUTE
\ Or when compiling.
[ ' ARGO >BODY >BODY , ]

There are usually prefix words to access the second and third code fields of a multi code field word. TO is normally the name of the word to access the second code field. Depending on the value of STATE , TO will either execute the second code field or compile the second code field.
The second code field points to code which is normally written to skip over the third code field and access the body of the multi code field word.
Code:
\ USED FOR INTERPRETING OR COMPILING.
TO ARGO

AT is normally the name of the word to access the third code field of a multi code field word.
Code:
\ Used for interpreting or compiling.
AT ARGO

With nothing to skip over, the third code field normally behaves more like a typical code field.
TO and AT are immediate words and their definitions (without error checking) are straightforward.
Code:
: TO  ( ++ )
   ' >BODY  STATE @
   IF  , EXIT  THEN
   EXECUTE ; IMMEDIATE

: AT  ( ++ )
   ' >BODY >BODY STATE @
   IF  , EXIT  THEN
   EXECUTE ; IMMEDIATE

AT can be shortened by doing its own ' >BODY and then branching into the body of TO just past TO's ' .
Code:
: AT  ( ++ )
   ' >BODY
   BRANCH [ ' TO >BODY CELL+ , ] -; IMMEDIATE
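
For readers without an ITC system at hand, the idea of a word with three actions reached through prefix words can be modelled in standard Forth. This is only a sketch under assumptions, not my implementation: the three "code fields" are modelled as execution tokens stored at the start of a CREATEd body, and all of the MCF- names are hypothetical. POSTPONE and LITERAL are the standard words, used here in place of compiling an address with comma.

```forth
\ Portable model only: three xts at the start of the body stand in for
\ the three code fields.  All MCF- names are made up for this sketch.
: MCF-FETCH ( body -- n )    3 CELLS + @ ;   \ action 1: fetch the value
: MCF-STORE ( n body -- )    3 CELLS + ! ;   \ action 2: store a value
: MCF-ADDR  ( body -- adr )  3 CELLS + ;     \ action 3: address of the data

: MCF-VALUE ( n "name" -- )
   CREATE
      ['] MCF-FETCH ,  ['] MCF-STORE ,  ['] MCF-ADDR ,  \ three action slots
      ,                                                 \ one cell of data
   DOES> ( -- n )  DUP @ EXECUTE ;                      \ run slot 0

\ Run slot u (0-based) of the word whose body address is on the stack.
: MCF-RUN ( i*x body u -- j*x )  CELLS OVER + @ EXECUTE ;

: MCF-TO ( n "name" -- )     \ prefix word for the second action
   ' >BODY  STATE @
   IF  POSTPONE LITERAL  1 POSTPONE LITERAL  POSTPONE MCF-RUN  EXIT  THEN
   1 MCF-RUN ; IMMEDIATE

: MCF-AT ( "name" -- adr )   \ prefix word for the third action
   ' >BODY  STATE @
   IF  POSTPONE LITERAL  2 POSTPONE LITERAL  POSTPONE MCF-RUN  EXIT  THEN
   2 MCF-RUN ; IMMEDIATE

42 MCF-VALUE LIFE                        \ LIFE returns 42
512 MCF-TO LIFE                          \ LIFE now returns 512
: BUMP ( -- )  LIFE 1+ MCF-TO LIFE ;     \ compiled use of the prefix word
```

The model costs an extra fetch and EXECUTE per access, which is exactly what real code fields avoid. It is only meant to show the STATE test in the prefix words and the "skip over the other slots" arithmetic.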


Writing VALUE as a word to create multi code field words.
In the example I presented in a previous post, the code for the extra code fields was low-level code. For this example I'd like to write the code for the code fields in high-level Forth. In other words, I would like to use DOES> clauses instead of ;CODE clauses, at least for this example.
Rather than writing something like the following:
Code:
SUBR CFA2              \ CREATE A SUBROUTINE.
   ' FORTH @ 1+ @ JMP  \ ASSEMBLE A JUMP TO DO.DOES>
   ] >BODY ! EXIT [ END-CODE
SUBR CFA3              \ CREATE A SUBROUTINE.         
   ' FORTH @ 1+ @ JMP  \ ASSEMBLE A JUMP TO DO.DOES>
   ] EXIT [ END-CODE
   
: VALUE
   CREATE  ( N -- )
   CFA2 ,  ( N -- )     \ LAY DOWN SECOND CODE FIELD.
   CFA3 ,  ( -- ADR )   \ LAY DOWN THIRD CODE FIELD.
   ,                    \ LAY DOWN PARAMETER FIELD (ONE CELL).
   DOES>  ( -- N )      \ SET FIRST CODE FIELD TO THIS.
      >BODY >BODY @ ;

I'd like to be able to write something like the following where the first DOES> clause compiles the code pointed to by the first code field, the second DOES> clause compiles the code pointed to by the second code field and so on.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   DOES>  ( -- N )
      >BODY >BODY @ ;
   DOES>  ( N -- )  \ This is never executed.
      >BODY ! ;
   DOES>  ( -- ADR )
      ;

But this will not work.
With a few new words, a version of VALUE can be written which looks close.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA  +CFA           \ Two extra code fields.
   DOES>                \ The first code field points here.
      >BODY >BODY @ ;;  \ Definition is not complete.
   DOES>                \ The second code field points here.
      >BODY ! ;;        \ Definition is still not complete.
   DOES>                \ The third code field points here.
      ;                 \ Now it's complete.

Notice the first and second DOES> clauses end with a double semicolon, ;; , but not the last DOES> clause.
;CODE clauses are also supported.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA +CFA
   ;CODE  ( -- N )
      6 # LDY
      ' BL @ CELL+ JMP  END-CODE;  \ Not done.
   ;CODE  ( N2 -- )
      4 # LDY
      0 ,X LDA  W )Y STA  INY
      1 ,X LDA  W )Y STA
      POP JMP  END-CODE;           \ Not done.
   ;CODE  ( -- ADR )
      ' FENCE @ JMP
      END-CODE                     \ Now it's done.

The first and second ;CODE clauses end with END-CODE; ( END-CODE semicolon), but not the last ;CODE clause.
DOES> and ;CODE clauses can even be mixed in the same parent word.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA  +CFA           \ Two extra code fields.
   DOES>                \ First code field does this.
      >BODY >BODY @ ;;  \ Definition is not complete.
   DOES>                \ Second code field does this.
      >BODY ! ;;        \ Definition is still not complete.
   ;CODE  ( -- ADR )    \ Last code field does this.
      ' FENCE @ JMP
      END-CODE          \ Definition is now complete.

All DOES> clauses which are not for the last code field must end with ;; (double semicolon) rather than ; (semicolon) and all ;CODE clauses which are not for the last code field must end with END-CODE; ( END-CODE semicolon) rather than END-CODE .

Now for the implementation.
Code:
: (+CFA)  ( -- )
   LATEST R> DUP CELL+ >R  @ 2DUP U<
   ABORT" NEED CREATE"
   ENTER
   NAME> DUP >BODY
   HERE 2PICK - CMOVE> CELL ALLOT ;

(+CFA) has error checking to make sure a parent word does not try to modify itself.
This version without the error checking is presented for clarity.
Code:
: (+CFA)  ( -- )
   R> DUP CELL+ >R  @ ENTER  \ Get inline address and enter it.
   LATEST NAME> DUP >BODY    \ move data between CFA and HERE
   HERE 2PICK - CMOVE> CELL ALLOT ;   \ up a cell and ALLOT.

(+CFA) has an inline address used by ENTER . ENTER is almost like a Forth computed GOSUB in that it will take the address of a threaded code fragment and cause program flow to continue in that fragment until an EXIT is encountered, then program flow resumes after ENTER .
R> DUP CELL+ >R @ copies the inline address to the data stack and increments the return address past this inline address. ENTER uses this address.
The rest of (+CFA) moves all data up one cell from the code field up to HERE and allots one cell.
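
The move itself can be tried on a scratch buffer in standard Forth. This is a minimal sketch, assuming a system that provides CMOVE> ; BUF and OPEN-GAP are hypothetical names, and the buffer stands in for the span from the code field to HERE .

```forth
\ Hypothetical scratch buffer: three used cells plus one spare cell.
CREATE BUF  10 , 20 , 30 ,  0 ,

\ Move len bytes starting at adr up by one cell.  CMOVE> copies from
\ the high end down, so the overlapping source is not overwritten
\ before it has been copied.
: OPEN-GAP ( adr len -- )
   OVER CELL+ SWAP CMOVE> ;

BUF 3 CELLS OPEN-GAP                 \ BUF now holds 10 10 20 30
```

Note that the first cell keeps its old contents after the move. In (+CFA) that first cell is the code field, so the newly opened code field starts out with the same value as the one that was just moved up.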

My Forth does not check the data stack depth when defining a colon or code word. Colon and CODE place the address of the name to unsmudge and the security value of TRUE on the data stack. Semicolon and END-CODE check the security value and unsmudge the name at the address on the stack. I feel this is in keeping with the Forth-83 Standard. This wasn't so much my conforming to the Forth-83 Standard as it was my discovering that the standard specified something I was going to do anyway.
Code:
          :            -- sys                        M,79          "colon"
               A defining word executed in the form:
                       : <name> ... ;               

Code:
          ;            --                            C,I,79   "semi-colon"
                       sys --   (compiling)         

DOES> and ;CODE check that the security value is TRUE but do not disturb the stack.
Code:
: +CFA  ( -- )
        ( SYS1 -- SYS2 SYS1 )  ( COMPILING)
   COMPILE (+CFA) >MARK
   [COMPILE] CS-SWAP ; IMMEDIATE

+CFA compiles the runtime word, (+CFA) , and uses >MARK to reserve one cell and leave a reference to resolve. Since DOES> and ;CODE check the security value for colon's sys, CS-SWAP is used to move +CFA's control flow data under colon's.
Code:
 
: ;;  ( -- )
      ( SYS2 SYS1 -- SYS1 )  ( COMPILING)
   COMPILE EXIT
   [COMPILE] CS-SWAP >RESOLVE ; IMMEDIATE

;; (double semicolon) compiles EXIT then swaps the control flow data and resolves the forward reference left by +CFA . The system remains in compile state.
Code:
: END-CODE;  ( -- )
             ( SYS2 SYS1 -- SYS1 )  ( COMPILING)
   CURRENT @ CONTEXT ! ]
   [COMPILE] CS-SWAP >RESOLVE ; IMMEDIATE

END-CODE; sets the CONTEXT vocabulary to be the same as the CURRENT vocabulary and switches on compiling. It swaps the control flow data and resolves the forward reference left by +CFA .

Here is a disassembly/decompilation of the version of VALUE with two DOES> clauses and one ;CODE clause.
Code:
SEE VALUE
VALUE
 24371  8970 CREATE
 24373  8254 ,
 24375 24187 (+CFA)
 24377 24407 VALUE +38
 24379 24187 (+CFA)
 24381 24396 VALUE +27
 24383  7252 DOES
 24385  9524    JMP ' USER >BODY 21 +
 24388  6635 >BODY
 24390  6635 >BODY
 24392  3567 @
 24394  2471 EXIT
25
 OK
EAD :DIS
 24396  7252 DOES
 24398  9524    JMP ' USER >BODY 21 +
 24401  6635 >BODY
 24403  2080 !
 24405  2471 EXIT
11
 OK
EAD :DIS
 24407  7252 DOES
 24409  9132    JMP ' CREATE >BODY 160 +
5
 OK

Without a forward branch, my current version of SEE stops when an EXIT is encountered. EAD in this instance returns the address just after EXIT and :DIS takes that address to decompile threaded code.
My current version of SEE does not know about (+CFA) .
An improved SEE which did know about (+CFA)'s inline address and kept going until the end of the definition would show this:
Code:
SEE VALUE
VALUE
 24371  8970 CREATE
 24373  8254 ,
 24375 24187 (+CFA) 24407
 24379 24187 (+CFA) 24396
 24383  7252 DOES
 24385  9524    JMP ' USER >BODY 21 +
 24388  6635 >BODY
 24390  6635 >BODY
 24392  3567 @
 24394  2471 EXIT
 24396  7252 DOES
 24398  9524    JMP ' USER >BODY 21 +
 24401  6635 >BODY
 24403  2080 !
 24405  2471 EXIT
 24407  7252 DOES
 24409  9132    JMP ' CREATE >BODY 160 +
41
 OK

The first (+CFA) causes the Forth threaded code fragment at address 24407 to be entered, then moves the code field and parameter field up by one cell. Execution resumes at the second (+CFA) which causes the Forth threaded code fragment at address 24396 to be entered, then moves the code fields and parameter field up by one cell. Execution resumes at the first DOES .
Here is a trace of VALUE as the VALUE LIFE is defined.
Code:
TRACE VALUE  OK
42 VALUE LIFE
CREATE       42
,            42
(+CFA)    EMPTY
DOES      24416
(+CFA)    EMPTY
DOES      24416
DOES      EMPTY  OK

To reiterate, the first (+CFA) causes the last DOES to execute. The second (+CFA) causes the second DOES to execute. Finally, the first DOES is executed.

A word to test LIFE :
Code:
: TEST.LIFE  ( -- )
   512 TO LIFE
   CR LIFE .
   CR AT LIFE U. ;

Now to SEE TEST.LIFE
Code:
SEE TEST.LIFE
TEST.LIFE
 24445  3264 LIT 512
 24449 24423 LIFE +2
 24451  6785 CR
 24453 24421 LIFE
 24455  7488 .
 24457  6785 CR
 24459 24425 LIFE +4
 24461  7501 U.
 24463  2471 EXIT
20
 OK

and execute it while VALUE is still being traced.
Code:
TEST.LIFE
>BODY       512 24425
!           512 24427
EXIT      EMPTY

>BODY     24423
>BODY     24425
@         24427
EXIT        512 512
24427  OK

and now to execute TEST.LIFE while TEST.LIFE is traced.
Code:
TRACE TEST.LIFE  OK
TEST.LIFE
LIT       EMPTY
LIFE +2     512
CR        EMPTY

LIFE      EMPTY
.           512 512
CR        EMPTY

LIFE +4   EMPTY
U.        24427 24427
EXIT      EMPTY  OK

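By placing the +CFA words just before the first DOES> , it is possible to write a version of TO and AT with security. In the decompilation above, a child's first code field points to the JMP just past the DOES cell at 24383, so six bytes before that address (24379) is the cell holding the CFA of the last (+CFA) . MCFA? tests for this.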
Code:
: MCFA?  ( ++ CFA )
   ' DUP @ 6 - @ ['] (+CFA) <>
   ABORT" NOT A MULTI CFA WORD" ;

: TO  ( ++ )
   MCFA?
   >BODY STATE @
   IF  , EXIT  THEN
   EXECUTE ; IMMEDIATE

: AT  ( ++ )
   MCFA?  >BODY
   BRANCH [ ' TO >BODY CELL+ , ] -; IMMEDIATE

I've shown examples with three code fields, but there could be more or there could be only two. There needs to be one +CFA for every additional code field.
This is the general form with four code fields:
Code:
: <PARENT.NAME>
   CREATE
      <COMPILE OR ALLOT DATA FOR ACTUAL PARAMETER FIELD>
   +CFA +CFA +CFA
   DOES>
      <ACTION FOR FIRST CODE FIELD> ;;
   DOES>
      <ACTION FOR SECOND CODE FIELD> ;;
   DOES>
      <ACTION FOR THIRD CODE FIELD> ;;
   DOES>
      <ACTION FOR LAST CODE FIELD> ;  \ ordinary semicolon.

Notice that there are four DOES> clauses but only three instances of +CFA . The first three DOES> clauses end with double semicolon but the last ends with the ordinary semicolon.

In this post I presented one way to define parent words of multi code field words. This is by no means the only way.

PostPosted: Thu May 30, 2024 2:07 am 
In my last post I used this as an example:
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA  +CFA           \ TWO EXTRA CODE FIELDS.
   DOES>
      >BODY >BODY @ ;;  \ DEFINITION IS NOT COMPLETE.
   DOES>
      >BODY ! ;;        \ DEFINITION IS STILL NOT COMPLETE.
   ;CODE  ( -- ADR )
      ' FENCE @ JMP
      END-CODE          \ DEFINITION IS NOW COMPLETE.

I did not have to make the third code field point to a jump to do.variable. I can make it point directly to do.variable by doing nothing.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA  +CFA           \ TWO EXTRA CODE FIELDS.
   DOES>
      >BODY >BODY @ ;;  \ DEFINITION IS NOT COMPLETE.
   DOES>
      >BODY ! ;;        \ DEFINITION IS STILL NOT COMPLETE.
   ;                    \ DEFINITION IS NOW COMPLETE.

When CREATE creates a new dictionary entry, the new word's code field points to do.variable. With this method of creating multi code field words, the last code field is set first, then the next to last, and so on until the first one is set. As each code field is set, it and everything after it up to HERE is moved up by one cell (two bytes on my system). As the first code field to be set, the last code field can simply use the default value set by CREATE . For all code fields but the last, if a code field is not explicitly set to point somewhere, it will point to the same place as the code field set prior to it.

Here is a version of VALUE which creates the same multi code field words.
Code:
: VALUE
   CREATE  ( N -- )
      ,
   +CFA                 \ AN EXTRA CODE FIELD.
   DOES>
      >BODY >BODY @ ;;  \ DEFINITION IS NOT COMPLETE.
   +CFA                 \ AN EXTRA CODE FIELD.
   DOES>
      >BODY ! ;;        \ DEFINITION IS STILL NOT COMPLETE.
   ;                    \ DEFINITION IS NOW COMPLETE.

The source for this version of VALUE is different as is the disassembly. Nevertheless, the words created by the two versions are identical.
This isn't surprising considering how ENTER works.
This is a decompilation of this version of VALUE .
Code:
SEE VALUE EAD :DIS EAD :DIS
VALUE
 24421  8970 CREATE
 24423  8254 ,
 24425 24199 (+CFA)
 24427 24442 VALUE +23
 24429  7252 DOES
 24431  9524    JMP ' USER >BODY 21 +
 24434  6635 >BODY
 24436  6635 >BODY
 24438  3567 @
 24440  2471 EXIT
21

 24442 24199 (+CFA)
 24444 24457 VALUE +38
 24446  7252 DOES
 24448  9524    JMP ' USER >BODY 21 +
 24451  6635 >BODY
 24453  2080 !
 24455  2471 EXIT
15

 24457  2471 EXIT
2
 OK

and a trace as it creates a VALUE with the name ALPHA . The trace word .STEP was set to a word which shows the address as well as the name of the word about to be executed.
Code:
: (.ST5) DUP U. (.ST1) ;  OK
' (.ST5) IS .STEP  OK
137 VALUE ALPHA
24421 CREATE     137
24423 ,          137
24425 (+CFA)   EMPTY
24442 (+CFA)   EMPTY
24457 EXIT     EMPTY
24446 DOES     EMPTY
24429 DOES     EMPTY  OK
   OK

When the first (+CFA) executes, IP is pushed onto the return stack and program flow continues with the Forth code at address 24442, the second (+CFA) . Within this (+CFA) IP is also pushed onto the return stack. Program flow continues at the Forth code at address 24457, the EXIT at the end of VALUE . When this Forth code exits, program flow resumes within the second (+CFA) which moves everything from the CFA to HERE up by one cell. Program flow resumes with the second DOES , the word in my system compiled by DOES> and ;CODE . When DOES sets the code field of the latest word, it causes this thread of Forth to exit. Program flow resumes within the first (+CFA) which also moves everything from the CFA to HERE up by one cell. Program flow resumes with the first DOES which sets the code field for the latest word, causing an exit with program flow continuing in the word which executed VALUE .
Contrast this with a trace of CONSTANT as the constant DOZEN is created.
Code:
TRACE CONSTANT  OK
12 CONSTANT DOZEN
9464 CREATE       12
9466 ,            12
9468 DOES      EMPTY  OK

Setting the deferred word .STEP to a word which shows all three of the stacks in my Forth and tracing VALUE shows the nesting.
Code:
FORGET ALPHA  OK
' (.ST3) IS .STEP  OK
TRACE VALUE  OK
137 VALUE ALPHA
CREATE           137
               EMPTY
24421 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

,                137
               EMPTY
24423 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

(+CFA)         EMPTY
               EMPTY
24425 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

(+CFA)         EMPTY
               EMPTY
24442 VALUE
24213 (+CFA)
24429 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

EXIT           EMPTY
               EMPTY
24457 VALUE
24213 (+CFA)
24446 VALUE
24213 (+CFA)
24429 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

DOES           EMPTY
               EMPTY
24446 VALUE
24213 (+CFA)
24429 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT

DOES           EMPTY
               EMPTY
24429 VALUE
 8627 (I/C)
 8699 INTERPRET
 8773 QUIT
 OK

Since +CFA , which compiles (+CFA) , can be placed just before each additional clause which sets a code field, the general form for defining a word which creates multi code field words can be written as:
Code:
: <NAME>
   CREATE
      <COMPILE DATA FOR CHILD WORD'S PARAMETER FIELD>
\ Do the following for each additional code field.
   +CFA
   <ACTION TO SET CODE FIELD OF LATEST WORD>
                .
                .
                .
   <ACTION TO SET LAST CODE FIELD OF LATEST WORD>

The action to set the code field of the latest word could be one of the following:
Code:
\ High level Forth -- note the double semicolon
   DOES>
      <HIGH LEVEL FORTH> ;;

Code:
\ Assembly -- note the semicolon as part of the name of END-CODE;
   ;CODE
      <ASSEMBLY LANGUAGE> END-CODE;

Code:
\ Setting code field to existing code -- note the double semicolon
   <ADDRESS> LATEST NAME> ! ;;

and the action to set the last code field of the latest word could be one of these:
Code:
\ High level Forth -- regular semicolon
   DOES>
      <HIGH LEVEL FORTH> ;

Code:
\ Assembly -- regular END-CODE
   ;CODE
      <ASSEMBLY LANGUAGE> END-CODE

Code:
\ Setting code field to existing code -- regular semicolon
   <ADDRESS> LATEST NAME> ! ;

Code:
\ Leave code field set to do.variable -- regular semicolon
   ;



PostPosted: Tue Jun 04, 2024 10:58 pm 

I found a problem when using this technique to add a second code field to a CREATE word. This particular CREATE word creates a word with a three cell parameter field. The third cell in the parameter field initially points to the first cell. Why doesn't matter. What matters is the resulting child word has a parameter field with what is initially non-relocatable data. The word +CFA is used to execute the code to set the second, third, etc. code fields. Since +CFA's runtime moves everything from the default CFA through the end of the dictionary up in memory by one cell, the resulting child words for this CREATE word would have the third cell of the parameter field pointing to the child word's second code field rather than the start of the parameter field. The solution was simple and did not require changing how the parameter field data is compiled. Move all occurrences of the word +CFA (only one in this case) to before the parameter field data is compiled.
In this instance, changing this:
Code:
: <NAMEX>
   CREATE
      HERE <DATA> , <DATA> , ,
   +CFA
   ;CODE
       ... END-CODE;
   ;CODE
       ... END-CODE

to this:
Code:
: <NAMEX>
   CREATE
   +CFA                         \ This causes a nested branch
      HERE <DATA> , <DATA> , ,
   ;CODE
       ... END-CODE;
                                \ to here followed by a return.
   ;CODE
       ... END-CODE

+CFA's runtime causes a 'nested branch' or 'computed gosub' to the high level code after END-CODE; followed by a return to what is after +CFA . The same code fields are set, but the parameter field data does not get moved; it is compiled just before the default code field is set.
In summary: When a cell of the parameter field is a pointer to another cell of that parameter field, it is necessary (when using this technique) for all occurrences of +CFA to be before the code which compiles the parameter field.
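
The relocation problem can be reproduced in miniature in standard Forth. This is only a sketch under assumptions: REC and STALE? are hypothetical names, and the one-cell shift is done with CMOVE> on a scratch record rather than by +CFA on a real dictionary entry.

```forth
\ Hypothetical scratch record: three data cells plus one spare cell.
CREATE REC  0 , 0 , 0 ,  0 ,
REC  REC 2 CELLS +  !                \ third cell points at the first cell

REC  DUP CELL+  3 CELLS  CMOVE>      \ shift the three cells up by one cell

\ The record now starts at REC CELL+ and its third cell is at
\ REC 3 CELLS + , but the pointer stored there was copied verbatim.
: STALE? ( -- f )                    \ true if the pointer still holds REC
   REC 3 CELLS + @  REC = ;
```

STALE? returns true: the moved pointer still refers to the old start of the record, one cell below the new start. That is the same situation as the child word whose third parameter cell ended up pointing at its second code field.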


PostPosted: Wed Jun 05, 2024 3:43 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8459
Location: Southern California
I've kept a tab open to this, with the intent to dig into it, but it looks like you'll be adding to it faster than I can catch up, LOL.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

