If either HEAD or HEADS is ON then the word being defined in virtual memory will have a header. HEAD is normally used to switch headers off for one word at a time. HEADS is normally zero to allow some headerless words and true to forbid headerless words. NH is used just before a word is defined and is a shortcut for the phrase HEAD OFF .
This behavior is not necessary.
Headerless words can be prevented by redefining NH as a no-op just before metacompiling.
Once this is done and the new target is saved, the dictionary can be pruned back with EMPTY . This exposes the original NH and makes the system ready for metacompilation again by removing all the TARGET and SHADOW words, except for the vocabularies defined within those vocabularies.
The same kernel can then be rebuilt with headerless words enabled to compare the size difference.
Given this realization, HEADS would appear to be unnecessary; however, I have a better use for it.
First, I redefine VHEADER and HEADER .
The line just before IF in the source for VHEADER is changed from this:
HEADS is now the default flag for header creation. NH switches HEAD off to prevent a word from having a header. It is then set to the value of HEADS .
A few more words, NH and friends, offer some interesting possibilities.
: CH ( -- ) // CREATE HEADER
HEAD ON ;
: NH ( -- ) // NO HEADER
HEAD OFF ;
: HEADERS ( -- )
HEADS ON CH ;
: NOHEADERS ( -- )
HEADS OFF NH ;
CH will force the creation of a header for the following word in the source just as NH will force the absence of a header for the following word in the source. HEADERS will make creating headers the default. This is overridden for individual words with NH . NOHEADERS will make not creating headers the default. This is overridden for individual words with CH .
By placing NOHEADERS at the beginning of the target source for each disk of source, an entire Forth kernel can be made headerless. CH can be used to cause a few select words to have headers.
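As a rough illustration of how these four words interact, here is a minimal sketch in standard Forth that can be tried on an ordinary host system, outside the metacompiler. HEADER? is a hypothetical stand-in for the test made during header creation: it reports whether the next word would get a header, then resets HEAD to the default held in HEADS . ON and OFF are defined here in case the host lacks them, and comments use \ rather than // .

```forth
\ Sketch only: models the two flags, not the metacompiler itself.
VARIABLE HEAD    \ per-word flag
VARIABLE HEADS   \ default flag
: ON  ( adr -- )  TRUE SWAP ! ;
: OFF ( adr -- )  FALSE SWAP ! ;

: CH ( -- )  HEAD ON ;           \ create header for the next word
: NH ( -- )  HEAD OFF ;          \ no header for the next word
: HEADERS   ( -- )  HEADS ON CH ;
: NOHEADERS ( -- )  HEADS OFF NH ;

\ HEADER? is hypothetical, not a Fleet Forth word.
\ It returns the flag for the next word, then restores the default.
: HEADER? ( -- f )  HEAD @  HEADS @ HEAD ! ;

NOHEADERS
HEADER? .      \ default: no header
CH HEADER? .   \ CH overrides the default for one word
HEADER? .      \ back to the default
```

On a system where TRUE is -1 this should print 0 -1 0 , showing that CH only affects the one word which follows it.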
Does this bring to mind any interesting possibilities?
[Edit: fixed a typo]
Last edited by JimBoyd on Tue Feb 28, 2023 2:01 am, edited 1 time in total.
The following is not specifically regarding a metacompiler, rather a comment on your words to direct whether or not headers are compiled:
In my '816 Forth kernel assembly-language source, I used the assembler variables HEADERS? and OMIT_HEADERS. HEADERS? gets turned on and off for local use, whereas OMIT_HEADERS is to be turned on only if you want to do the whole thing without headers (which, besides saving memory, would mean that the target computer cannot do its own compilation or assembly), or at least large sections that may have multiple places saying NO_HEADERS...<some code here>...HEADERS. This was mainly so that, if you want a totally headerless version, you don't have to comment out all the invocations of the HEADERS macro, which turns on the HEADERS? assembler variable.
Since then, I've thought I should change this to a stack-based thing (again, in the assembler, not the target computer) so you could nest more levels and still have it remember, when it gets to the end of a section, whether it was supposed to go back to laying down headers or not, rather than just automatically turning header generation back on. The variables are used by the HEADER (no 'S' or '?' on the end) macro which normally automates the creation of each header but does nothing if it's not supposed to be creating a header at the time. I don't remember at the moment what the situation was that told me I should do this; but a problem situation I'm imagining now is where an INCLude file turns the creation of headers off and on but has no memory of what the situation should be when it finishes and returns assembly to the file that called it. Another is where you might have one or more words you want in a NO_HEADERS...HEADERS section, and if you decide to move it to a different part of the file, or even to an INCLude file, you won't have to see if you'll need to modify it.
I'm slowly laying out a board for the 65816 computer that will use this Forth. Up to now I've only run this Forth on my 65802, which is an '816 that drops into an '02 socket, meaning it gives almost all the advantages of the '816 minus the ability to address anything outside bank 0 (i.e., the first 64KB). When the true '816 is up, I'll get back on the '816 Forth and possibly implement this headers-on/off stack to keep track of whether or not headers should be created at any given point in the code. Most of what I need to do for it is just finish and test the material to address data in other banks.
: HEADER@ ( -- F )
HEADS @ ;
: HEADER! ( F -- )
DUP HEADS ! HEAD ! ;
Since these words are not intended to be used in the middle of a word's definition, there shouldn't be anything getting in the way on the metacompiler's data stack. It's far more likely that things getting in the way would be on the auxiliary data stack, at least the way I tend to write code like Fleet Forth's kernel.
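To illustrate the save-and-restore idea, here is a minimal sketch in standard Forth. It only models the two flags on an ordinary host system; the "sections" are just comments, and comments use \ rather than // .

```forth
\ Sketch only: models the two flags, not the metacompiler itself.
VARIABLE HEAD    \ per-word flag
VARIABLE HEADS   \ default flag
: HEADER@ ( -- f )  HEADS @ ;
: HEADER! ( f -- )  DUP HEADS ! HEAD ! ;

TRUE HEADER!       \ outer source: headers by default
HEADER@            \ save the current default on the data stack
FALSE HEADER!      \ start of a headerless section
  HEADER@          \ a nested section saves the default again
  TRUE HEADER!     \ and turns headers on for itself
  HEADER!          \ end of nested section: headerless again
HEADER!            \ end of section: back to the outer default
HEADS @ .          \ prints -1, the outer default
```

Because each section restores whatever it found, a section like this can be moved to another place in the source, or to an include file, without checking what the surrounding default is.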
CODE ?EXIT ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= IF CS>A
0= NOT IF CS>A END-CODE
CODE 0EXIT ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= NOT IF CS>A
0= IF CS>A END-CODE
CODE ?LEAVE ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= IF CS>A
0= NOT IF CS>A END-CODE
CODE (LOOP)
PLA TAY INY
0= IF // BRANCHING OUT OF WORD
SEC PLA 0 # ADC
VS IF // BRANCHING OUT OF WORD
VS NOT IF
A>CS A>CS THEN CS-SWAP
LABEL LEAVE.BODY
PLA PLA
THEN
PLA PLA
A>CS A>CS THEN
A>CS A>CS THEN
LABEL EXIT.BODY
PLA IP STA PLA IP 1+ STA
THEN THEN THEN CS-SWAP CS>A CS>A
LABEL NEXT
1 # LDY
IP )Y LDA W 1+ STA DEY
IP )Y LDA W STA CLC
IP LDA 2 # ADC IP STA
CS NOT IF
W 1- JMP
THEN
IP 1+ INC
W 1- JMP END-CODE
Here is an example of using HEADER@ and HEADER! . Fleet Forth has a word, SR/W , to read and write individual disk sectors. Here are the headerless words used by SR/W .
By placing NOHEADERS at the beginning of the target source for each disk of source, an entire Forth kernel can be made headerless. CH can be used to cause a few select words to have headers.
Although that will work, it's not necessary. NOHEADERS only needs to be at the beginning of the target source after START on the first source disk. START makes header building the default, while RESUME does not affect header building.
Likewise, if the target source were in an ordinary text file and there were include files, just including the include files would not affect header building. Each include file could change whether headers are built with the words I've mentioned.
Redefining HEADERS and NOHEADERS to reclaim four bytes.
When interpreting, " returns the address of a counted string. The ANSI Forth word S" returns the address of a string and its count.
Even with this taken into consideration, this example doesn't work with one of the versions of Gforth on my computers without further modification.
My Forth treats the value of >IN as unsigned. Any value of >IN greater than or equal to the size of the text stream causes WORD to return a size of zero without causing an error.
My Forth's WORD uses the word 'STREAM to obtain the address of the current location in the text stream and its remaining length.
: 'STREAM ( -- ADR N )
BLK @ ?DUP
IF
BLOCK B/BUF
ELSE
TIB #TIB @
THEN
>IN @
OVER UMIN /STRING ;
OVER UMIN clips the value fetched from >IN so it doesn't exceed the size of the text stream. UMIN returns the unsigned minimum of two numbers. /STRING is a word from ANSI Forth which I found to be quite useful.
/STRING “slash-string” STRING
( c-addr1 u1 n -- c-addr2 u2 )
Adjust the character string at c-addr1 by n characters. The
resulting character string, specified by c-addr2 u2, begins
at c-addr1 plus n characters and is u1 minus n characters long.
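The clipping step can be tried on its own. The following is a minimal sketch in standard Forth, assuming a system with /STRING and NIP available. CLIP-STREAM is a hypothetical name for the last line of 'STREAM , taking the offset from the stack instead of fetching it from >IN , and UMIN is defined in case the host lacks it.

```forth
\ UMIN as described above; many systems already provide it.
: UMIN ( u1 u2 -- u )  2DUP U< IF DROP ELSE NIP THEN ;

\ CLIP-STREAM is a hypothetical name, used only for this sketch.
\ The offset is limited to the string length before /STRING is applied.
: CLIP-STREAM ( adr u offset -- adr2 u2 )  OVER UMIN /STRING ;

S" HELLO"  2 CLIP-STREAM TYPE CR   \ LLO
S" HELLO" -1 CLIP-STREAM NIP .     \ 0
```

The second line shows why the comparison is unsigned: -1 is treated as the largest possible offset, so it is clipped to the length of the string and the remaining length is zero rather than a nonsense value.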
There are a few places in the source for Fleet Forth where I use EVALUATE .
The following fills in the table holding the default vector for all the deferred words defined in Fleet Forth's kernel. This is done after the vectors for the deferred words have been set.
START.I&F
" TARGET DUP @ @ OVER 2+ ! 4 +
END.FORGET OVER 2+ U<
HOST >IN !"
COUNT EVALUATE TARGET
DROP
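The string above appears to loop by storing a flag into >IN : zero restarts the string from the beginning, while a true flag, read as a very large unsigned number, ends it. Here is a minimal sketch of the same idea in standard Forth. Since not every system tolerates an out-of-range >IN , this version stores the length of the string to finish. The variable N and the count of 5 are only for illustration.

```forth
VARIABLE N   0 N !
\ Each pass adds 1 to N. While N < 5 the string stores 0 in >IN
\ and is interpreted again; otherwise it stores its own length,
\ which leaves nothing to parse and ends the evaluation.
S" 1 N +!  N @ 5 < 0=  SOURCE NIP AND  >IN !" EVALUATE
N @ .   \ 5
```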
I also use EVALUATEd strings to automate defining parameters to use while building the kernel. For example, in the source for the kernel there is the following:
That number is the largest number of blocks a dual disk version of the 1581 drive could hold, if such a device existed. The EVALUATEd strings which follow round that number up to the closest power of 2 and set other parameters used to define words such as R/W , RAM and DR+ .
The new metacompiler's first big test was metacompiling an existing version of Fleet Forth. Since I'm using VICE instead of a real Commodore 64, the disks are actually disk images and the disk image with the new kernel was identical to the one created with the original metacompiler. I also have SEEALL to test kernels I build. Successfully loading the system loader and the utilities is a test. SEEALL showing the expected disassembly/decompilation is another.
: >META ( -- )
ORDER@
META DEFINITIONS
CO ORDER! ;
: >SHADOW ( -- )
ORDER@
[SHADOW] SHADOW DEFINITIONS
CO ORDER! ;
The word >META saves the values of CONTEXT and CURRENT to the auxiliary stack then changes the search and compile vocabularies to META . CO causes the rest of >META to run after >META's caller exits. That last part of >META restores the CONTEXT and CURRENT vocabularies. >SHADOW does the same thing for the SHADOW vocabulary.
This source
I mentioned that the metacompiler supports LABELs because the target is built in virtual memory. The version of LABEL which is used to test compile target source on the host uses VALUEs which are defined before the test code. The HOST version of LABEL was defined like this:
I was testing a modification to Fleet Forth. Some of the words branch to NEXT . This version of LABEL wasn't going to work since I needed some of the test words to branch to the address for NEXT in one of the test words. I needed a VALUE defined in the host assembler to successfully test the new code.
This is the new definition of the HOST version of LABEL .
: LABEL ( -- )
HERE ' DUP @
[ ' #BUF @ ] LITERAL <>
ABORT" NOT A LABEL"
>BODY ! ; IMMEDIATE
To test the modification to Fleet Forth on the HOST , I needed to be able to alter the address of NEXT in the HOSTASSEMBLER yet be able to EMPTY the dictionary back to its pre test state and have NEXT assume its former contents.
ASSEMBLER DEFINITIONS
NEXT VALUE NEXT
FORTH DEFINITIONS
Once a VALUE is defined for each label in the test code, the test code (including NEXT and the word which passes through it) is loaded.
Testing then commences; although, the first test was loading the source to see if all the intended branches to NEXT were within range. Fleet Forth's assembler and the metacompiler's assembler both abort with an error message if the intended branch is out of range. BO for branch offset.
Another test run called for another modification to one of the metacompiler words.
Fleet Forth has SUBR to create a word which returns its own PFA like a VARIABLE . Unlike a variable, no storage is allotted; instead, the assembler is invoked. For example, (>FORTH) returns the address of the code used to transition from a code word to high level Forth.
SUBR ROUTINE1
<SOME ASSEMBLY>
RTS END-CODE
CODE TESTWORD
ROUTINE1 JSR
<SOME MORE ASSEMBLY>
NEXT JMP END-CODE
The metacompiler's SUBR works as expected. The metacompiler also supports HSUBRs, subroutines without headers. Since there is no header, there is no need for a code field. HSUBR creates a CONSTANT on the host containing the address of the routine in virtual memory.
The problem arose when testing target source on the host. When testing code on the host, HSUBR would just call the host assembler's SUBR . This normally works. It failed when a branch was in range on the target, where no header or code field got in the way, but out of range on the host, where the header and code field did get in the way and the test didn't assemble. The new version of HSUBR works the same as the old when metacompiling. When testing on the host, it requires the name which follows in the text stream to be a predefined VALUE , just like the host version of LABEL . Now there is no header or code field to get in the way of code during testing on the host.
: HSUBR ( -- )
META?
IF
THERE CONSTANT ASSEMBLE EXIT
THEN
[COMPILE] LABEL [ ASSEMBLER ]
W TRUE ASSEMBLER MEM ;
I found out about this problem when testing a modification for source close to NEXT . The modification should have worked. I tested it on the host before committing to building a new kernel with the metacompiler. When I did, I got the dreaded "BRANCH RANGE EXCEEDED" error.
I did not get that error when testing with the new version of HSUBR .
As I've already mentioned, for each type of CREATE DOES> or CREATE ;CODE word supported by the metacompiler, all the code fields for a given class of word are linked in a chain until the parent word of that type is defined for the target. CODE words are the only exception. I also mentioned the problem I ran into with trying to use CREATE rather than CODE for code words with no body. There is another problem which only occurs when trying to minimize the amount of virtual memory needed to build the Forth kernel. Some Commodore 64s may not have a Ram Expansion Unit. In this case, it is possible to write a version of (RR/W) to access the ram underneath the kernal ROM and I/O.
A reminder, this is how the current build of Fleet Forth maps blocks on the computer to blocks on devices.
Each disk drive "sees" the blocks in the range 0 to 2047 and none of them have the capacity for 2048 blocks. Most have considerably less.
The target built by the metacompiler loads at location $801 so the first $801 bytes of virtual memory are not used to build it. On a Commodore 64 without a Ram Expansion Unit, virtual memory could be set to start two blocks below the start of blocks used from ram by setting VOFFSET like this:
This will reduce by two the number of blocks needed for virtual memory; however, the first $800 bytes of virtual memory are mapped to blocks $3FFE - $3FFF and are no longer accessible.
There is a side effect with the version of chains which uses two cells, one for the latest link and one for the previous link. Initialization causes a read from memory location zero in virtual memory. I could have written a test to avoid this, but it was easier to just remove the unnecessary feature from the metacompiler. MEND-CHAIN and ADD-CHAIN were removed. The following words were rewritten.
: CF, ( ADR -- )
DUP @
IF @ V, EXIT THEN
2+ VADD ;
: DEFINER ( CFA -- )
>META
CREATE
, 0 , 0 ,
// CFA OF META WORD TO EXECUTE
// FLAG -- ZERO OR TARGET CFA
// CHAIN OF WORDS
[ HERE 2+ >A ]
DOES>
DUP 2+ SWAP @ EXECUTE ;
A> CONSTANT DEFINER-WORD
: PATCH-CF ( -- )
LATEST COUNT $1F AND >HERE CLIP
['] META >BODY VFIND
0= ABORT" DEFINER MISSING"
DUP @ DEFINER-WORD <>
ABORT" NOT A DEFINER WORD"
>BODY 2+ THERE OVER !
2+ @
BEGIN
?DUP 0EXIT
DUP V@ THERE ROT V!
AGAIN -;
: DEF-RESET ( -- )
>META WITH-WORDS
NAME> DUP @ DEFINER-WORD <>
IF DROP EXIT THEN
>BODY 2+ 4 ERASE ;
The code field patching portion of PATCH-CF is slightly smaller. I also improved VTRAVERSE .
The DO LOOP run-time words are not specifically mentioned in the Forth-83 Standard and could have different names. A DO LOOP run-time word could even have the same name as the word which compiles it; however, the metacompiler does not allow redefinitions in the TARGET vocabulary.
To summarize the previous posts, while metacompiling is on:
When a target word is found which is to be interpreted, because STATE is off or the word is immediate, the search continues with the parent vocabulary.
When a non-target word is found which is to be compiled, the search continues with the parent vocabulary.
Only non-immediate target words can be compiled without the use of [COMPILE] and only non-target words can be interpreted.
There could be DO LOOPs in the source before the compiling word DO is defined. Indeed, DO and the other DO LOOP compiling words might not even be defined in the source for the Forth kernel. They could just as easily be defined in the system loader which is loaded by the new kernel. Rather than encountering the immediate word DO and continuing the search in a parent vocabulary, the non-immediate run-time DO would be compiled. The DO LOOPs would not be compiled correctly.
In spite of this, there is a way to give the run-time for DO the same name. A helper word needs to be added to the metacompiler.
ALIAS DO
CODE (DO) ( LIMIT START -- )
...
...
END-CODE
The TARGET FORTH vocabulary on the host will now have the name (DO) but the target in virtual memory will have the name DO .
This will work; however, there is one gotcha. When the system is built on this new kernel and the metacompiler lexicon built on top of that, the metacompiler on the new system will not be able to compile DO LOOPs.
In solving this problem I came up with two different solutions which resulted in two versions of the metacompiler. They were identical except for the feature that was added.
Both versions of the metacompiler worked. I tested each one by metacompiling the new kernel then building the system and metacompiler on top of that. The new metacompiler on the new system was then used to metacompile a kernel again. The kernels were identical. All four of them.
Access to the DO LOOP run-time words was still needed to build the high level portion of SEE . The run-time words are defined before the word FORTH-83 and the DO LOOP compiling words are defined after FORTH-83 . Fleet Forth's FIND searches the CONTEXT vocabulary (and parents) then, if necessary, searches the CURRENT vocabulary.