When interpreting, " returns the address of a counted string. The Ansi Forth word S" returns the address of a string and its count.
Even with this taken into consideration, this example doesn't work with one of the versions of Gforth on my computers without further modification.
My Forth treats the value of >IN as unsigned. Any value of >IN greater than or equal to the size of the text stream causes WORD to return a size of zero without causing an error.
My Forth's WORD uses the word 'STREAM to obtain the address of the current location in the text stream and its remaining length.
: 'STREAM ( -- ADR N )
BLK @ ?DUP
IF
BLOCK B/BUF
ELSE
TIB #TIB @
THEN
>IN @
OVER UMIN /STRING ;
OVER UMIN clips the value fetched from >IN so it doesn't exceed the size of the text stream. UMIN returns the unsigned minimum of two numbers. /STRING is a word from Ansi Forth which I found to be quite useful.
/STRING “slash-string” STRING
( c-addr1 u1 n -- c-addr2 u2 )
Adjust the character string at c-addr1 by n characters. The
resulting character string, specified by c-addr2 u2, begins
at c-addr1 plus n characters and is u1 minus n characters long.
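The clipping idiom OVER UMIN /STRING can be tried in isolation. The following is a minimal sketch, assuming an ANS-style Forth such as Gforth where S" works while interpreting. SKIP-CLIPPED is a hypothetical name used only for this illustration, and UMIN is defined here in case the system does not provide it.

```forth
\ Unsigned minimum (a redefinition is harmless if UMIN already exists).
: UMIN ( u1 u2 -- u3 )  2DUP U< IF DROP ELSE NIP THEN ;

\ Hypothetical helper: clip OFFSET to the string length, then step past it.
: SKIP-CLIPPED ( c-addr u offset -- c-addr' u' )  OVER UMIN /STRING ;

S" HELLO WORLD" 6 SKIP-CLIPPED TYPE CR      \ offset in range: prints WORLD
S" HELLO WORLD" -1 SKIP-CLIPPED NIP . CR    \ huge unsigned offset: prints 0
```

Because -1 is the largest unsigned number, the second line shows an out-of-range offset yielding a zero-length string rather than an error.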
There are a few places in the source for Fleet Forth where I use EVALUATE .
The following fills in the table holding the default vector for all the deferred words defined in Fleet Forth's kernel. This is done after the vectors for the deferred words have been set.
START.I&F
" TARGET DUP @ @ OVER 2+ ! 4 +
END.FORGET OVER 2+ U<
HOST >IN !"
COUNT EVALUATE TARGET
DROP
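For readers on an ANS system, the same technique looks slightly different because S" already returns an address and a count, so COUNT is not needed. A minimal sketch, assuming a system such as Gforth; EVAL-DEMO is a hypothetical name used only for this illustration:

```forth
\ EVALUATE interprets the string as if it had been typed at the keyboard.
: EVAL-DEMO ( -- n )  S" 2 3 +" EVALUATE ;

EVAL-DEMO . CR    \ prints 5
```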
I also use EVALUATEd strings to automate defining parameters to use while building the kernel. For example, in the source for the kernel there is the following:
That number is the largest number of blocks a dual disk version of the 1581 drive could hold, if such a device existed. The EVALUATEd strings which follow round that number up to the closest power of 2 and set other parameters used to define words such as R/W , RAM and DR+ .
The new metacompiler's first big test was metacompiling an existing version of Fleet Forth. Since I'm using VICE instead of a real Commodore 64, the disks are actually disk images and the disk image with the new kernel was identical to the one created with the original metacompiler. I also have SEEALL to test kernels I build. Successfully loading the system loader and the utilities is a test. SEEALL showing the expected disassembly/decompilation is another.
: >META ( -- )
ORDER@
META DEFINITIONS
CO ORDER! ;
: >SHADOW ( -- )
ORDER@
[SHADOW] SHADOW DEFINITIONS
CO ORDER! ;
The word >META saves the values of CONTEXT and CURRENT to the auxiliary stack then changes the search and compile vocabularies to META . CO causes the rest of >META to run after >META's caller exits. That last part of >META restores the CONTEXT and CURRENT vocabularies. >SHADOW does the same thing for the SHADOW vocabulary.
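The effect of CO can be sketched on a conventional threaded Forth. This is not Fleet Forth's definition of CO. It assumes each colon definition keeps exactly one return address on the return stack, which holds for many systems but is not guaranteed by any standard. BRACKETS and DEMO are hypothetical names used only for this illustration.

```forth
\ Assumed definition: exchange CO's own return address with that of
\ the word which called CO's caller.
: CO ( -- )  R> R> SWAP >R >R ;

\ Like >META, BRACKETS does some setup, yields with CO, and does its
\ cleanup only after its caller has finished.
: BRACKETS ( -- )  ." [" CO ." ]" ;

: DEMO ( -- )  BRACKETS ."  body " ;

DEMO CR
```

On a system where the assumption holds, the closing bracket should appear after the text printed by DEMO, just as the final ORDER! in >META runs after the caller of >META exits.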
This source
I mentioned that the metacompiler supports LABELs because the target is built in virtual memory. The version of LABEL which is used to test compile target source on the host uses VALUEs which are defined before the test code. The HOST version of LABEL was defined like this:
I was testing a modification to Fleet Forth. Some of the words branch to NEXT . This version of LABEL wasn't going to work since I needed some of the test words to branch to the address for NEXT in one of the test words. I needed a VALUE defined in the host assembler to successfully test the new code.
This is the new definition of the HOST version of LABEL .
: LABEL ( -- )
HERE ' DUP @
[ ' #BUF @ ] LITERAL <>
ABORT" NOT A LABEL"
>BODY ! ; IMMEDIATE
To test the modification to Fleet Forth on the HOST , I needed to be able to alter the address of NEXT in the HOST ASSEMBLER yet be able to EMPTY the dictionary back to its pre-test state and have NEXT assume its former contents.
ASSEMBLER DEFINITIONS
NEXT VALUE NEXT
FORTH DEFINITIONS
Once a VALUE is defined for each label in the test code, the test code (including NEXT and the word which passes through it) is loaded.
Testing then commences, although the first test was simply loading the source to see if all the intended branches to NEXT were within range. Fleet Forth's assembler and the metacompiler's assembler both abort with an error message if the intended branch is out of range. BO for branch offset.
Another test run called for another modification to one of the metacompiler words.
Fleet Forth has SUBR to create a word which returns its own PFA like a VARIABLE . Unlike a variable, no storage is allotted; instead, the assembler is invoked. For example, (>FORTH) returns the address of the code used to transition from a code word to high level Forth.
SUBR ROUTINE1
<SOME ASSEMBLY>
RTS END-CODE
CODE TESTWORD
ROUTINE1 JSR
<SOME MORE ASSEMBLY>
NEXT JMP END-CODE
The metacompiler's SUBR works as expected. The metacompiler also supports HSUBRs, subroutines without headers. Since there is no header, there is no need for a code field. HSUBR creates a CONSTANT on the host containing the address of the routine in virtual memory.
The problem arose when testing target source on the host. When testing code on the host, HSUBR would just call the host assembler's SUBR . This normally works. It failed when a branch was in range on the target, because there was no header or code field to get in the way, but the test on the host didn't assemble because the branch was out of range. The header and code field got in the way. The new version of HSUBR works the same as the old when metacompiling. When testing on the host, it requires the name which follows in the text stream to be a predefined VALUE just like the host version of LABEL . Now there is no header or code field to get in the way of code during testing on the host.
: HSUBR ( -- )
META?
IF
THERE CONSTANT ASSEMBLE EXIT
THEN
[COMPILE] LABEL [ ASSEMBLER ]
W TRUE ASSEMBLER MEM ;
I found out about this problem when testing a modification for source close to NEXT . The modification should have worked. I tested it on the host before committing to building a new kernel with the metacompiler. When I did, I got the dreaded "BRANCH RANGE EXCEEDED" error.
I did not get that error when testing with the new version of HSUBR .
As I've already mentioned, for each type of CREATE DOES> or CREATE ;CODE word supported by the metacompiler, all the code fields for a given class of word are linked in a chain until the parent word of that type is defined for the target. CODE words are the only exception. I also mentioned the problem I ran into with trying to use CREATE rather than CODE for code words with no body. There is another problem which only occurs when trying to minimize the amount of virtual memory needed to build the Forth kernel. Some Commodore 64s may not have a Ram Expansion Unit. In this case, it is possible to write a version of (RR/W) to access the ram underneath the kernal ROM and I/O.
A reminder, this is how the current build of Fleet Forth maps blocks on the computer to blocks on devices.
Each disk drive "sees" the blocks in the range 0 to 2047 and none of them have the capacity for 2048 blocks. Most have considerably less.
The target built by the metacompiler loads at location $801 so the first $801 bytes of virtual memory are not used to build it. On a Commodore 64 without a Ram Expansion Unit, virtual memory could be set to start two blocks below the start of blocks used from ram by setting VOFFSET like this:
This will reduce by two the number of blocks needed for virtual memory; however, the first $800 bytes of virtual memory are mapped to blocks $3FFE - $3FFF and are no longer accessible.
There is a side effect with the version of chains which uses two cells, one for the latest link and one for the previous link. Initialization causes a read from memory location zero in virtual memory. I could have written a test to avoid this, but it was easier to just remove the unnecessary feature from the metacompiler. MEND-CHAIN and ADD-CHAIN were removed. The following words were rewritten.
: CF, ( ADR -- )
DUP @
IF @ V, EXIT THEN
2+ VADD ;
: DEFINER ( CFA -- )
>META
CREATE
, 0 , 0 ,
// CFA OF META WORD TO EXECUTE
// FLAG -- ZERO OR TARGET CFA
// CHAIN OF WORDS
[ HERE 2+ >A ]
DOES>
DUP 2+ SWAP @ EXECUTE ;
A> CONSTANT DEFINER-WORD
: PATCH-CF ( -- )
LATEST COUNT $1F AND >HERE CLIP
['] META >BODY VFIND
0= ABORT" DEFINER MISSING"
DUP @ DEFINER-WORD <>
ABORT" NOT A DEFINER WORD"
>BODY 2+ THERE OVER !
2+ @
BEGIN
?DUP 0EXIT
DUP V@ THERE ROT V!
AGAIN -;
: DEF-RESET ( -- )
>META WITH-WORDS
NAME> DUP @ DEFINER-WORD <>
IF DROP EXIT THEN
>BODY 2+ 4 ERASE ;
The code field patching portion of PATCH-CF is slightly smaller. I also improved VTRAVERSE .
The DO LOOP run-time words are not specifically mentioned in the Forth-83 Standard and could have different names. A DO LOOP run-time word could even have the same name as the word which compiles it; however, the metacompiler does not allow redefinitions in the TARGET vocabulary.
To summarize the previous posts, while metacompiling is on:
When a target word is found which is to be interpreted, because STATE is off or the word is immediate, the search continues with the parent vocabulary.
When a NON target word is found which is to be compiled, the search continues with the parent vocabulary.
Only non-immediate target words can be compiled without the use of [COMPILE] and only non-target words can be interpreted.
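The three rules above reduce to one comparison: a found word is accepted exactly when "it is a target word" agrees with "it is about to be compiled". The following is a minimal sketch of that decision, not the metacompiler's actual search code. TO-COMPILE? and ACCEPT? are hypothetical names, and all flags are assumed to be well-formed (0 or -1).

```forth
\ A word is to be compiled when STATE is on and the word is not immediate.
: TO-COMPILE? ( immediate? compiling? -- flag )  SWAP 0= AND ;

\ Accept the found word only if target-ness matches compile-ness;
\ otherwise the search would continue with the parent vocabulary.
: ACCEPT? ( target? immediate? compiling? -- flag )  TO-COMPILE? = ;

TRUE  FALSE TRUE  ACCEPT? . CR   \ target word being compiled: -1
TRUE  FALSE FALSE ACCEPT? . CR   \ target word being interpreted: 0
FALSE TRUE  TRUE  ACCEPT? . CR   \ immediate host word while compiling: -1
FALSE FALSE TRUE  ACCEPT? . CR   \ non-target word being compiled: 0
```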
There could be DO LOOPs in the source before the compiling word DO is defined. Indeed, DO and the other DO LOOP compiling words might not even be defined in the source for the Forth kernel. They could just as easily be defined in the system loader which is loaded by the new kernel. Rather than encountering the immediate word DO and continuing the search in a parent vocabulary, the non-immediate run-time DO would be compiled. The DO LOOPs would not be compiled correctly.
In spite of this, there is a way to give the run-time for DO the same name. A helper word needs to be added to the metacompiler.
ALIAS DO
CODE (DO) ( LIMIT START -- )
...
...
END-CODE
The TARGET FORTH vocabulary on the host will now have the name (DO) but the target in virtual memory will have the name DO .
This will work; however, there is one gotcha. When the system is built on this new kernel and the metacompiler lexicon built on top of that, the metacompiler on the new system will not be able to compile DO LOOPs.
In solving this problem I came up with two different solutions which resulted in two versions of the metacompiler. They were the same except for what feature was added.
Both versions of the metacompiler worked. I tested each one by metacompiling the new kernel then building the system and metacompiler on top of that. The new metacompiler on the new system was then used to metacompile a kernel again. The kernels were identical. All four of them.
Access to the DO LOOP run-time words was still needed to build the high level portion of SEE . The run-time words are defined before the word FORTH-83 and the DO LOOP compiling words are defined after FORTH-83 . Fleet Forth's FIND searches the CONTEXT vocabulary (and parents) then, if necessary, searches the CURRENT vocabulary.
For reference I will refer to the version of the metacompiler without either of these solutions as the base metacompiler. It is this base upon which version A and version B are built.
First, some words in the base metacompiler were cleaned up.
The original source for PNAME and VHEADER .
: PNAME ( -- )
1 VALLOT 1 PNAMES +!
COLS ?CR ." PADDED: R"
HERE S? CR ;
: VHEADER ( >IN -- >IN )
VNAME TFIND NIP
ABORT" REDEFINITION IN TARGET"
HEAD @
IF
DUP >IN !
HERE C@ THERE + 4 +
SPLIT DROP 0=
IF PNAME THEN
WLINK VADD
VNAME C@ THERE $80 VTOGGLE
VALLOT THERE $80 VTOGGLE
1 VALLOT
EXIT
THEN
THERE 1+ SPLIT DROP 0=
IF PNAME THEN
1 HEADLESS +! ;
HOST
: PNAME ( ADR -- )
1+ SPLIT DROP ?EXIT \ If the low byte of the address is not $FF then exit.
1 VALLOT 1 PADDED +! \ Pad the virtual header and increment the padded count.
COLS ?CR ." PADDED: R" \ If not at left column perform carriage return
HERE S? CR ; \ before displaying the name of the padded word.
: VHEADER ( -- )
>IN @ VNAME TFIND NIP \ Save >IN. If the parsed name already exists
ABORT" REDEFINITION IN TARGET" \ in the target vocabulary then abort.
HEAD @ \ If HEAD is on
IF \ create header.
>IN ! \ Restore >IN to parse the name again.
HERE C@ THERE + 2+ 1+ PNAME \ Avoid the indirect jump bug.
WLINK VADD \ Create the link field.
VNAME C@ THERE $80 VTOGGLE \ Create the name field with high bit set
VALLOT THERE $80 VTOGGLE \ for count and last character.
1 VALLOT \
EXIT
THEN \ If HEAD is off
DROP 1 HEADLESS +! \ increment the count of headerless words
THERE PNAME ; \ and avoid the indirect jump bug.
PNAME now takes an address and pads the header, if necessary, to avoid the code field straddling a page boundary. VHEADER no longer requires the value of >IN on the data stack and it returns nothing.
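The padding test itself is small. On the 6502, an indirect jump through a pointer at $xxFF fetches its high byte from $xx00 instead of the next page, so a code field must not start at an address whose low byte is $FF. The following is a minimal sketch of that check in plain ANS-style Forth, operating on ordinary numbers rather than virtual memory. STRADDLES? and PAD-IF-NEEDED are hypothetical names used only for this illustration.

```forth
HEX
\ True if a two-byte code field at CFA would cross a page boundary.
: STRADDLES? ( cfa -- flag )  FF AND FF = ;

\ Move the code field up one byte when it would straddle a page.
: PAD-IF-NEEDED ( cfa -- cfa' )  DUP STRADDLES? IF 1+ THEN ;

10FF PAD-IF-NEEDED . CR    \ prints 1100
1234 PAD-IF-NEEDED . CR    \ prints 1234
DECIMAL
```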
I will present the first solution, metacompiler A.
Here is the base metacompiler's vocabulary structure.
FORTH
META
SHADOW
FORTH
ASSEMBLER
RUN
EDITOR
ASSEMBLER
The variable ALIASES is added to keep a count of how many words have an alias.
The new word ALIAS creates a header in virtual memory and a target word (a handle) in the RUN vocabulary. ALIAS also switches off header creation in virtual memory for the next word with NH .
: ALIAS ( ++ )
ORDER@ [META] RUN DEFINITIONS \ Save the search order and
HEADER NH \ create header in RUN vocabulary.
ORDER! \ Restore the search order.
TRUE HEADLESS +! 1 ALIASES +! ; \ Correct the headless count and
\ increment the count of aliases.
CODE MCOMPILE ( -- )
>FORTH \ Shift to high level Forth.
?COMP \ Abort if not compiling.
R> DUP 2+ >R @ \ Get the CFA of next word in Forth thread containing MCOMPILE
>NAME COUNT $1F AND >HERE CLIP \ and use the name field to create a search string at HERE .
TFIND ?TARGET M, ; \ Search the TARGET FORTH vocabulary and abort if not found.
\ If found, compile to virtual memory with M, .
CODE MCOMPILE ( -- )
>FORTH \ Shift to high level Forth.
?COMP \ Abort if not compiling.
R> DUP 2+ >R @ \ Get the CFA of next word in Forth thread containing MCOMPILE
>NAME COUNT $1F AND >HERE CLIP \ and use the name field to create a search string at HERE .
TFIND TRUE <> \ If not found or if immediate
IF
DROP HERE [META] \ drop the result and use the string at HERE
[ ' RUN >BODY ] LITERAL \ to search the RUN vocabulary.
VFIND TRUE <> \ If still not found
ABORT" RUN-TIME NOT FOUND" \ abort.
THEN
M, ; \ If found, compile to virtual memory.
The modified MCOMPILE for metacompiler A still creates a search string from the name field of the next word in its definition.
When TEST is executed, MCOMPILE will create the search string "(TEST)" at HERE .
The TARGET FORTH vocabulary is searched by TFIND . If the sought word is not found or if it is immediate, the result is dropped and the RUN vocabulary is searched. Once more, the result has to be for a word which is not immediate. VFIND does not search parent vocabularies.
This version of the metacompiler allows changing the names of run-time words with ALIAS ; however, there is one downside.
Using DO and its run-time as an example:
ALIAS DO
CODE (DO) ( LIMIT START -- )
...
...
END-CODE
There are now two handles on the host system for the same target word in virtual memory. This is not the problem. The problem is that if the name of the run-time for DO on the host system does not match either one, a new alias is needed.
ALIAS DO \ Name of run-time for new kernel. Creates handle DO in RUN vocabulary and header DO in virtual memory.
ALIAS D \ Name of run-time in host system. Creates handle D in RUN vocabulary. Header creation in virtual memory suppressed.
CODE (DO) \ Name of run-time used in kernel source. Creates handle (DO) in TARGET FORTH vocabulary. Header creation in virtual memory suppressed.
...
...
END-CODE
This creates three handles for the same word in virtual memory, one in the TARGET FORTH vocabulary and two in the RUN vocabulary.
There is only one header created in virtual memory for this word since ALIAS suppresses header creation in the following word.
I will now present the second solution, metacompiler B.
No new metacompiler vocabularies are added. This metacompiler's ALIAS does not create a target word. It creates a header in virtual memory. It does use NH to switch off header creation in virtual memory for the next word.
CODE MCOMPILE ( -- )
>FORTH \ Shift to high level Forth.
?COMP \ Abort if not compiling.
R> DUP 2+ >R @ \ Get the CFA of next word in Forth thread containing MCOMPILE
>NAME COUNT $1F AND >HERE CLIP \ and use the name field to create a search string at HERE .
TFIND TRUE <> \ If not found or if immediate
ABORT" RUN-TIME NOT FOUND" \ abort.
M, ; \ If found, compile to virtual memory.
With these modifications it is possible to create a kernel where each of the DO LOOP run-time words has the same name as its compiling word. The next addition to metacompiler B will allow it to work when built on the new system.
The word :: (double colon) is an alias for the host colon. This will allow accessing colon's functionality while defining words in the META vocabulary, where colon was redefined.
These words defined in the META vocabulary are stubs. They don't get compiled or metacompiled. MCOMPILE uses their names to find the corresponding word in the TARGET FORTH vocabulary.
Recall that the new semicolon will behave just like the host semicolon when metacompiling is off.
I could have defined these metacompiler DO LOOP compiling words with COMPILE since COMPILE gets patched when metacompiling.
This version of the metacompiler works well; however, there is a slight downside. If I want to rename the run-time for ." so it has the same name as ." , I will need to define a stub for the run-time and a metacompiler version of ." . The same is true for any run-time I wish to rename.
Metacompiler C is, hopefully, the last solution to this problem.
The only differences with metacompiler B: :: (double colon) is not defined. MCOMPILE is different.
: CFA>STR ( CFA -- HERE )
>NAME COUNT $1F AND >HERE CLIP ; \ Use CFA to generate a search string
\ from a word's name field.
CODE MCOMPILE ( -- )
>FORTH
?COMP R> DUP 2+ >R @ CFA>STR \ Get CFA of next word in definition
\ to generate search string.
TFIND TRUE <> \ If the word is not found or is
IF \ immediate
DROP R@ 2- 2- BODY> CFA>STR \ Back up to CFA of the word containing
ASCII ) OVER COUNT + C! \ MCOMPILE to generate a search string
DUP DUP 1+ BL MOVE \ and modify string to be enclosed in
ASCII ( OVER 1+ C! \ parentheses.
2 OVER +! TFIND TRUE <> \ Search the TARGET FORTH vocabulary
ABORT" RUN-TIME NOT FOUND" \ and abort if not found.
THEN
M, ; \ If found, compile to virtual memory.
CFA>STR takes the CFA of a word and generates a string at HERE which is the same as the word's name. MCOMPILE still generates a search string based on the name of the word to be metacompiled by MCOMPILE . If the search is unsuccessful or the word found is immediate, MCOMPILE will generate a search string from the name of the word which contains MCOMPILE . This search string will be the compiling word's name enclosed in parentheses.
Using DO as an example.
Suppose the run-time for DO on the host system is still named (DO) . MCOMPILE generates the string "(DO)" and searches for this string in the TARGET FORTH vocabulary. It will be found and the address for (DO) in virtual memory will be compiled to virtual memory.
Suppose the run-time for DO on the host has a different name. Let's suppose it is something absurd like DO.RUNTIME . MCOMPILE tries to find DO.RUNTIME in the TARGET FORTH vocabulary and fails. MCOMPILE then generates a string from the name field of the compiling word, in this case DO . The string will have parentheses added, giving "(DO)".
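The string manipulation in the fallback path can be tried on its own. The following is a minimal sketch in ANS-style Forth that mirrors the steps in MCOMPILE but works on a private buffer instead of HERE . NAMEBUF, >NAMEBUF and PARENTHESIZE are hypothetical names used only for this illustration, and the count is updated with C@ and C! rather than +! so the sketch does not depend on cell size or byte order. As written it is only safe for names of 30 characters or fewer.

```forth
CREATE NAMEBUF 40 CHARS ALLOT           \ scratch buffer standing in for HERE

\ Store a string in NAMEBUF as a counted string.
: >NAMEBUF ( c-addr u -- )
   DUP NAMEBUF C!  NAMEBUF 1+ SWAP MOVE ;

\ Wrap the counted string in NAMEBUF in ( and ).
: PARENTHESIZE ( -- c-addr u )
   NAMEBUF
   [CHAR] ) OVER COUNT + C!             \ put ) right after the last character
   DUP DUP 1+ BL MOVE                   \ shift count and text up by one byte
   [CHAR] ( OVER 1+ C!                  \ the old count's copy becomes (
   DUP C@ 2 + OVER C!                   \ the count grows by two
   COUNT ;

S" DO" >NAMEBUF PARENTHESIZE TYPE CR    \ prints (DO)
```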
Here is a sample test.
FOOBAR does not exist in the TARGET FORTH vocabulary. In this example, (TESTWORD) does exist in the TARGET FORTH vocabulary. MCOMPILE tries to find the target word FOOBAR in the TARGET FORTH vocabulary. When that fails, it tries to find (TESTWORD) in the TARGET FORTH vocabulary and succeeds.
Run-time words with the same name in the host system and the source used by the metacompiler work fine.
Examples are ?BRANCH and BRANCH used by most of the control flow words.
Run-time words with different names in the host system and the source used by the metacompiler will work if the name of the run-time in the source is a parenthesized version of the compiling word's name.
Examples are (DO) compiled by DO and (LOOP) compiled by LOOP . These run-time words can be given a different name on the target being built in virtual memory by using the word ALIAS .
At present, I don't see a downside to this version of the metacompiler.