When I wrote my metacompiler my main goal was to get my Forth kernel written and built. The metacompiler works and overall I am pleased with it; however, there is room for improvement. As the topic suggests, this thread is about the redesign of the Fleet Forth metacompiler.
The first goal, or milestone, with the new metacompiler will be to simply build a CODE word in virtual memory complete with a header and an alias in a target Forth vocabulary on the host (a handle to access the new word). Although the ability to build CODE words in virtual memory may seem more difficult because a target assembler is required, that part isn't that difficult. Supporting CREATE DOES> words in the target is, if not trickier, then definitely messier.
Fleet Forth's metacompiler needs access to virtual memory to build the new kernel. I discuss Fleet Forth's virtual memory words in this post. Fleet Forth's virtual memory uses Forth blocks with a high enough number that they are mapped to the Ram Expansion Unit, but could just as well be mapped to one of the disk drives. Virtual memory doesn't have to be set up this way. Since the kernel is less than ten kilobytes, an array could be created like this.
Code:
CREATE VIRTUAL 10240 ALLOT
In this case >VIRTUAL is just this.
Code:
: >VIRTUAL ( ADR -- VADR )
VIRTUAL + ;
Regardless of how virtual memory is implemented, the metacompiler needs the following virtual memory words, which are in Fleet Forth even without the metacompiler.
Code:
>VIRTUAL ( adr -- vadr ) -- convert an address to the address in virtual memory buffer.
VC@ ( vadr -- b ) -- fetch a byte from virtual memory.
VC! ( b vadr -- ) -- store a byte in virtual memory.
V@ ( vadr -- w ) -- fetch a cell from virtual memory.
V! ( w vadr -- ) -- store a cell in virtual memory.
CMOVE>V ( adr vadr u ) -- copy u bytes at adr to virtual memory.
VMSAVE ( start.adr end.adr+1 name.adr cnt -- )
-- save virtual memory to disk.
The metacompiler will need to build a dictionary in the target.
Code:
VARIABLE TDP
VARIABLE (ORIGIN) (ORIGIN) ON
: ORIGIN ( ADR -- )
DUP (ORIGIN) @ UMIN (ORIGIN) !
TDP ! ;
: THERE ( -- VADR )
TDP @ ;
: VALLOT ( N -- )
TDP +! ;
: VC, ( B -- )
THERE 1 VALLOT VC! ;
: V, ( W -- )
THERE 2 VALLOT V! ;
: VADD ( ADR -- )
THERE OVER @ V, SWAP ! ;
TDP is the target DP .
THERE is the target HERE .
VALLOT is virtual ALLOT .
VC, and V, are virtual C, and , respectively.
VADD is the virtual memory equivalent of Fleet Forth's ADD . it takes an address in real memory (such as a variable) which is used to hold a chain of addresses and adds THERE to it.
Here is an example of the use of ADD in Fleet Forth to add a VOCABULARY to VOC-LINK , the chain of all vocabularies used by FORGET to prune vocabularies.
Code:
: VOCABULARY ( -- )
VARIABLE CURRENT @ ,
VOC-LINK ADD \ <- here
DOES>
CONTEXT ! ;
and a code snippet from Fleet Forth's CREATE to link the new word into the current vocabulary.
Code:
CURRENT @ ADD
A few string manipulating words are also needed.
Code:
: >THERE ( ADR CNT -- THERE )
>HERE THERE TUCK OVER C@ 2+
CMOVE>V ;
: VNAME ( -- HERE )
NAME DUP THERE OVER C@ 2+
CMOVE>V ;
CODE V," ( -- )
>FORTH
ASCII " CHAR
>THERE C@ 1+ VALLOT ;
>THERE takes an address of a string with a count and moves that string to HERE and THERE as a counted string with a trailing blank.
VNAME parses the text stream for a blank delimited name moves it to HERE and THERE as a counted string with a trailing blank.
V," is the metacompiler's version of ," . It parses the text stream for a " delimiter and copies that string to HERE and THERE as a counted string with a trailing blank.
A word to TOGGLE bits in an address is needed.
Code:
: VTOGGLE ( VADR B -- )
SWAP >VIRTUAL SWAP TOGGLE
UPDATE ;
And now a few more vocabularies.
Code:
FORTH DEFINITIONS
: HOST ( -- )
FORTH DEFINITIONS ; IMMEDIATE
: [HOST] FORTH ; IMMEDIATE
VOCABULARY META META DEFINITIONS
VOCABULARY SHADOW SHADOW DEFINITIONS
VOCABULARY FORTH FORTH DEFINITIONS
HOST
: [META] META ; IMMEDIATE
: [SHADOW] [META] SHADOW ; IMMEDIATE
: [TARGET] [SHADOW] FORTH ; IMMEDIATE
: TARGET [TARGET] FORTH DEFINITIONS ; IMMEDIATE
HOST
Notice that there are two Forth vocabularies. This is not a problem. The only vocabulary in Fleet Forth without a parent vocabulary is the original Forth vocabulary. Every other vocabulary has a parent, the vocabulary in which it was created, as the word VOCS shows. This is a portion of a session log after the second ASSEMBLER vocabulary was added.
Code:
VOCS
FORTH
META
SHADOW
FORTH
ASSEMBLER
EDITOR
ASSEMBLER
OK
The metacompiler will need to build a header in virtual memory as well as creating a word in the target Forth vocabulary for each word defined for the target. The word VHEADER creates a header in virtual memory. It does not add a code field. HEADER executes VHEADER and creates an alias in the target Forth vocabulary.
Here are some variables and helper words for VHEADER .
Code:
VARIABLE WLINK
VARIABLE HEAD
VARIABLE HEADS
VARIABLE HEADLESS
VARIABLE PNAMES
: NH ( -- )
?EXEC HEAD OFF ; IMMEDIATE
: PNAME ( -- )
1 VALLOT 1 PNAMES +!
COLS ?CR ." PADDED: R"
HERE S? CR ;
WLINK holds the start of the chain of link fields.
If either HEAD or HEADS are ON then the word being defined in virtual memory will have a header. HEAD is normally used to switch headers off for one word at a time. HEADS is normally zero to allow some headerless words and true to forbid headerless words.
NH is used just before a word is defined and is a shortcut for the phrase HEAD OFF .
PNAME compensates for the indirect jump bug by padding memory before the link field is created. It also keeps track of the number of padded headers and displays the name of each as it occurs.
Here are some words for target Forth vocabulary only searches and some words used for peace of mind. the peace of mind words show if compiling to the host took place.
Code:
// TARGET SEARCH & SANITY CHECK
: TFIND ( AD -- AD 0/XT 1/XT -1 )
[TARGET] ['] FORTH
[HOST] >BODY VFIND ;
: ?TFIND ( F -- )
ABORT" NOT IN TARGET" ;
: T' NAME TFIND 0= ?TFIND
>BODY @ ;
VARIABLE CDP
: !CDP ( -- ) HERE CDP ! ;
: ?CDP ( -- ) HERE CDP @ <>
ABORT" COMPILED TO HOST" ;
VFIND is a primitive like (FIND) , but it only searches one vocabulary. It does not search parent vocabularies. It is also a word defined in the Fleet Forth kernel without a body of its own. Its code field points one byte into the body of (FIND) .
The word to build headers in virtual memory, VHEADER .
Code:
: VHEADER ( >IN@ -- >IN@ )
VNAME TFIND NIP
ABORT" REDEFINITION IN TARGET"
HEAD @ HEADS @ OR
IF
DUP >IN !
HERE C@ THERE + 4 +
SPLIT DROP 0=
IF PNAME THEN
WLINK VADD
VNAME C@ THERE $80 VTOGGLE
VALLOT THERE $80 VTOGGLE
1 VALLOT
EXIT
THEN
THERE 1+ SPLIT DROP 0=
IF PNAME THEN
1 HEADLESS +! ;
It takes the current value of >IN on the stack and returns the current value of >IN on the stack. HEADER leaves a copy on the stack anyway and there are two possible paths in VHEADER , one which parses a name to create a header and one which does not (but still pads memory if needed to avoid the code field straddling a page boundary).
The word to create the header in virtual memory via VHEADER and create an alias in the target Forth vocabulary.
Code:
: HEADER ( -- )
>IN @ VHEADER DUP >IN !
NAME FIND 0<> AND SWAP >IN !
HEAD ON
CREATE
THERE , , 0 ,
[ HERE 2+ >A ]
DOES>
2+ @ DUP 0=
ABORT" COUNTERPART NOT FOUND"
EXECUTE ;
A> CONSTANT TARGET-WORD-CF
TARGET-WORD-CF is a constant which is used later.
The auxiliary stack words >A and A> can be replaced with the phrases 2 ! and 2 @ respectively.
An alias in the target Forth vocabulary is a CREATE DOES> word with three cells in its body.
The first cell is the code field address of its counterpart in virtual memory.
The second cell is the address of its counterpart in the host system. It could be defined in the original FORTH vocabulary or the META vocabulary or even the SHADOW vocabulary. If there is no counterpart found in the host system, a zero is stored in that cell.
The third cell is for statistical data, how many times this word is compiled into another.
A word to initialize those variables and one to show some stats.
Code:
: START ( -- )
WLINK OFF (ORIGIN) ON
HEAD ON HEADS OFF
HEADLESS OFF PNAMES OFF
[COMPILE] TARGET ; IMMEDIATE
: META-STATUS SETWIDTH
CR ." STATE: " STATE @ U.
CR ." WLINK: " WLINK @ U.
CR ." HEAD: " HEAD @ U.
CR ." HEADS: " HEADS @ U.
CR ." HEADLESS: " HEADLESS @ U.
CR ." ORIGIN: " (ORIGIN) @ U.
CR ." THERE: " TDP @ U.
CR ." PADDED: " PNAMES @ U.
ORDER ;
The completed metacompiler with have a slightly more complex START and META-STATUS but not by much. These versions are adequate for now.
The last thing needed to be able to build CODE words in virtual memory is the target assembler.
Code:
// TIE IN WITH FLEET FORTH ASSEMBLER
TARGET VOCABULARY ASSEMBLER
HOST ASSEMBLER ' NOT >LINK
TARGET ' ASSEMBLER >BODY !
HOST
: [ASSEMBLER] [TARGET] ASSEMBLER ;
IMMEDIATE
TARGET ASSEMBLER DEFINITIONS
HEX
: CPU0 CREATE C, DOES> C@ VC, ;
0 CPU0 BRK 18 CPU0 CLC
0D8 CPU0 CLD 58 CPU0 CLI
0B8 CPU0 CLV 0CA CPU0 DEX
88 CPU0 DEY 0E8 CPU0 INX
0C8 CPU0 INY 0EA CPU0 NOP
48 CPU0 PHA 8 CPU0 PHP
68 CPU0 PLA 28 CPU0 PLP
40 CPU0 RTI 60 CPU0 RTS
38 CPU0 SEC 0F8 CPU0 SED
78 CPU0 SEI 0AA CPU0 TAX
0A8 CPU0 TAY 0BA CPU0 TSX
8A CPU0 TXA 9A CPU0 TXS
98 CPU0 TYA
HEX
: CPU1 ( BMAP BOP -- )
[ FORTH ]
CREATE C, , DOES>
Z DUP 1 MODE @ MEM DUP>R
LSHIFT AND 0=
ABORT" NON VALID ADDRESSING MODE"
0< $B AND R@ + TABLE + C@ + VC,
R> ?DUP 0EXIT
7 U< IF VC, EXIT THEN
V, ;
ASSEMBLER
HEX
073E 61 CPU1 ADC
073E 21 CPU1 AND
073E 0C1 CPU1 CMP
073E 41 CPU1 EOR
073E 0A1 CPU1 LDA
073E 01 CPU1 ORA
073E 0E1 CPU1 SBC
0736 81 CPU1 STA
0331 02 CPU1 ASL
0331 42 CPU1 LSR
0331 22 CPU1 ROL
0331 62 CPU1 ROR
8558 0A2 CPU1 LDX
8338 0A0 CPU1 LDY
HEX
0330 C2 CPU1 DEC
0330 E2 CPU1 INC
8118 E0 CPU1 CPX
8118 C0 CPU1 CPY
0150 82 CPU1 STX
0130 80 CPU1 STY
0110 20 CPU1 BIT
0180 40 CPU1 JMP
0100 14 CPU1 JSR
HEX
: >BRA ( N1 N2 -- N3 )
RB HEX
OVER SPLIT NIP OVER 1+ SPLIT NIP <>
IF
COLS ?CR ." PAGE CROSSING "
DUP ASCII V 2OVER 1+ U<
IF
[ ASCII ^ ASCII V - ] LITERAL
+
THEN
DUP EMIT EMIT SPACE 1- U.
." : R" LATEST ID. CR
THEN
BO ;
HEX
: THEN ( ADDR CS -- )
?EXEC 31 ?PAIRS
THERE OVER VC@
IF
SWAP V!
ELSE
OVER >BRA SWAP VC!
THEN
; IMMEDIATE
HEX
: IF ( CC -- ADR CS )
?EXEC VC, THERE 0 VC, 31
; IMMEDIATE
: AHEAD ( -- ADR CS )
?EXEC
THERE 1+ 1 JMP 31 ; IMMEDIATE
: WHILE ( A1 C1 -- A2 C2 A1 C1 )
[COMPILE] IF 2SWAP ; IMMEDIATE
: ELIF ( ADR1 CS -- ADR2 CS )
[COMPILE] WHILE
[COMPILE] THEN ; IMMEDIATE
: ELSE ( ADR1 CS -- ADR2 CS )
[COMPILE] AHEAD 2SWAP
[COMPILE] THEN ; IMMEDIATE
HEX
: BEGIN ( -- ADR CS ) ?EXEC
THERE 32 ; IMMEDIATE
: AGAIN ( ADR CS -- ) ?EXEC
32 ?PAIRS JMP ; IMMEDIATE
: UNTIL ( ADR CS CC -- ) ?EXEC
VC, 32 ?PAIRS
THERE >BRA VC,
; IMMEDIATE
: REPEAT ( ADR1 CS1 ADR2 CS2 -- )
[COMPILE] AGAIN
[COMPILE] THEN ; IMMEDIATE
: BRAN ( ADR CS -- )
NOT 32 SWAP
[COMPILE] UNTIL ; IMMEDIATE
HEX
META DEFINITIONS
: ASSEMBLE ( -- ADR TRUE )
LATEST TRUE OVER TSB
[ASSEMBLER] ASSEMBLER MEM
!CDP ;
HOST
META DEFINITIONS
: CODE ( -- ADR TRUE )
HEADER THERE 2+ V,
ASSEMBLE ;
: END-CODE ( ADR TRUE -- )
?CDP [COMPILE] END-CODE ;
IMMEDIATE
HOST
Now to build some CODE words in virtual memory. Here is the session log.
Code:
OK
VOCS
FORTH
META
SHADOW
FORTH
ASSEMBLER
EDITOR
ASSEMBLER
OK
ORDER
CONTEXT: FORTH
CURRENT: FORTH
OK
START OK
1024 ORIGIN OK
CODE TESTWORD OK
0 ,X LDA 1 ,X ORA OK
0= IF DEY THEN OK
RTS OK
END-CODE OK
HOST OK
TARGET OK
TESTWORD
TESTWORD
^^^^^^^^
COUNTERPART NOT FOUND
Typing TESTWORD shows that a word called TESTWORD was not defined in the SHADOW , META , or host FORTH vocabularies.
Code:
HOST OK
$FB CONSTANT IP OK
$FE CONSTANT W OK
TARGET OK
CODE LIT OK
IP )Y LDA PHA IP INC OK
0= IF IP 1+ INC THEN OK
IP )Y LDA IP INC OK
0= IF IP 1+ INC THEN OK
DEX DEX OK
1 ,X STA PLA 0 ,X STA OK
1 # LDY OK
IP )Y LDA W 1+ STA DEY OK
IP )Y LDA W STA CLC OK
IP LDA 2 # ADC IP STA OK
CS NOT IF OK
W 1- JMP OK
THEN OK
IP 1+ INC OK
W 1- JMP END-CODE OK
' TESTWORD >BODY 6 HEX DUMP
838A B 4 0 0 0 0 0 0 E1 7E 82 49 D0 AE 24 FB KD@@@@@@...I..$.
OK
' LIT >BODY 6 DUMP
83AF 1B 4 42 A 0 0 4 44 55 4D 50 20 20 52 44 20 [DBJ@@DDUMP RD
OK
As can be seen in the hex dump for TESTWORD , its second cell holds a zero. The second cell for the target word LIT holds the address of the original LIT (a word compiled by LITERAL) in the host system.
Code:
HOST ' LIT U. A42 OK
41B >VIRTUAL DIS
CC1B B104 ,X ORA
CC1E FUTURE EXPANSION
3
OK
41B V@ U. 41D OK
41D >VIRTUAL DIS
CC1D FB )Y LDA IP
CC1F PHA
CC20 FB INC IP
CC22 CC26 BNE
CC24 FC INC IP 1+
CC26 FB )Y LDA IP
CC28 FB INC IP
CC2A CC2E BNE
CC2C FC INC IP 1+
CC2E DEX
CC2F DEX
CC30 1 ,X STA
CC32 PLA
CC33 0 ,X STA
CC35 1 # LDY
CC37 FB )Y LDA IP
CC39 FF STA W 1+
CC3B DEY
CC3C FB )Y LDA IP
CC3E FE STA W
CC40 CLC
CC41 FB LDA IP
CC43 2 # ADC
CC45 FB STA IP
CC47 CC4C BCS
CC49 FD JMP W 1-
CC4C FC INC IP 1+
CC4E FD JMP W 1-
34
OK
#1024 THERE OVER - OK
.S 400 51 OK
. 51 OK
>VIRTUAL 51 OK
.S CC00 51 OK
DUMP
CC00 0 0 88 54 45 53 54 57 4F 52 C4 D 4 B5 0 15 @@.TESTWOR.MD.@U
CC10 1 D0 1 88 60 0 4 83 4C 49 D4 1D 4 B1 FB 48 A.A..@D.LI.]D..H
CC20 E6 FB D0 2 E6 FC B1 FB E6 FB D0 2 E6 FC CA CA ...B.......B....
CC30 95 1 68 95 0 A0 1 B1 FB 85 FF 88 B1 FB 85 FE .A..@ A.........
CC40 18 A5 FB 69 2 85 FB B0 3 4C FD 0 E6 FC 4C FD X...B...CL.@..L.
CC50 0 85 FF 88 B1 FB 85 FE 18 A5 FB 69 2 85 FB B0 @.......X...B...
OK
CONSOLE
That's as far as I am in the redesign of Fleet Forth's metacompiler, so that's all for now.