and the reason has nothing to do with the chaining of the code fields together until they are resolved. That part works as expected.
So that I can use variables like this:
CODE ] ( -- )
DEY
STATE STY
STATE 1+ STY
NEXT JMP END-CODE
there has to be a word with the same name as the variable which returns the address of the variable. This is why there is a SHADOW vocabulary and why the DEFINER word CREATE creates a shadow constant for each CREATE word. This was also the case before redesigning the metacompiler.
This normally works quite well.
The word 1 is a code word with no body in Fleet Forth. Its code field points to a place in (FIND) which pushes a one on the data stack. This is smaller than a CONSTANT with the name 1 .
At this point there is a VARIABLE named 1 and a SHADOWCONSTANT of the same name. When 1 is found while interpreting, such as assembling a code word with the meta assembler, the address of this VARIABLE will be placed on the stack by the SHADOWCONSTANT1.
The following will change 1 to a code word with no body.
The shadow definition of 1 will still exist and will place the address of what was the VARIABLE1 on the data stack whenever the number 1 is encountered.
no shadow constant will be created since they are not needed for code words. In the following example the host system's 1 will be found or the search string will be converted to a number. Either way a one will still be placed on the stack as intended.
The host system's version of -; would work with the metacompiler; however, there would be no check for a change in the size of the host dictionary.
Metacompiling colon definitions which end with ; would not work without the metacompiler's version. It is needed to compile the target version of EXIT .
The only difference between the metacompiler's version of hyphen semicolon and the host version of hyphen semicolon is the check to see if the host dictionary changed. The metacompiler's hyphen semicolon can be streamlined.
The host version of hyphen semicolon is used in the metacompiler's version of hyphen semicolon.
The metacompiler's version of hyphen semicolon is used in the metacompiler's version of semicolon.
Only a few more words are needed to make the metacompiler functional. SMUDGE-TEST checks the TARGETVOCABULARY for any smudged words. If any are smudged, something went wrong while metacompiling. Smudging a word by setting its smudge-bit cannot hide it from WITH-WORDS .
'THERE (tick there) points to the address in virtual memory the target will use for the coldstart value of DP , Forth's dictionary pointer.
The address in WLINK is stored in the body of the vocabulary FORTH in the target. The target's CURRENT and CONTEXT are also set to point to the target's FORTH . Setting the target's CONTEXT and CURRENT isn't strictly necessary since the target's cold start routine executes FORTH DEFINITIONS as part of the high level startup. I left this in so the new kernel image will be identical to the one created with the older metacompiler.
Finally, the target's VOC-LINK is set to point to the third cell in the target's FORTH .
FINISH-FORTH executes PATCH-FORTH , runs a few checks and switches off metacompiling.
: FINISH-FORTH ( -- )
PATCH-FORTH VERIFY
[TARGET] FORTH [HOST]
SMUDGE-TEST
[COMPILE] FINISH
[COMPILE] META-STATUS
'THERE 0= ABORT" 'THERE IS ZERO"
USER.AREA 0= ABORT" NO USER AREA"
; IMMEDIATE
Unless I overlooked mentioning something in the source, that will be everything needed for metacompilation. I've tested this metacompiler with the source for Fleet Forth to build a new kernel. The new kernel is identical to the original so by that measure the new metacompiler is a success.
Now to start metacompiling by inserting the first source disk.
FH is 'from here' and adds the current blocks number to the number on the stack. Since the load block is block 1, this will load blocks 2 thru 165 inclusive.
When FINISH-FORTH executes, this is displayed:
That's about it. There are two more words for the metacompiler. GLOSSARY displays all the words in a given vocabulary, each on its own line with some statistical data. TLOCATE is used to locate the source for a word in the TARGET vocabulary.
: GLOSSARY ( -- )
SETWIDTH WITH-WORDS
CR DUP ID. TAB NAME>
DUP @ MACRO-CF =
IF >BODY COUNT QTYPE EXIT THEN
DUP @ TARGET-WORD-CF =
IF
>BODY DUP @ 16BITS U.R
." :" 2+ @ U.W EXIT
THEN
DUP @ MUSER-CF =
IF >BODY C@ 8BITS .R EXIT
THEN
>BODY @ U.W ;
GLOSSARY will display a MACRO's internal string.
It will display the CFA of a target word and the count of how many times that word was compiled.
If a word is the shadow of a USER VARIABLE , GLOSSARY will display its offset into the user area.
For all other words, GLOSSARY displays the contents of the first cell of the word's PFA.
This switches to the EDITOR vocabulary and displays the screen with the source for ?STACK .
Those are all the words for the metacompiler. I recently made changes to Fleet Forth's kernel and built the new kernel with the new metacompiler. Everything seems to work fine.
Any questions?
A few more words I glossed over.
I used ORIGIN in some of the examples. It takes a number from the data stack and makes that the value for THERE by setting TDP , target DP. The lowest value used for ORIGIN is retained in (ORIGIN) and used as the start address when saving the target with TSAVE .
I mentioned that the metacompiler has the following vocabulary structure:
FORTH
META
SHADOW
FORTH
ASSEMBLER
EDITOR
ASSEMBLER
I already showed the source for some words to navigate this vocabulary tree. The following seven words are immediate. HOST makes the host FORTH vocabulary the CONTEXT and CURRENT vocabulary. TARGET makes the target FORTH vocabulary the CONTEXT and CURRENT vocabulary.
The following words only set the CONTEXT vocabulary. [HOST] sets it to the host FORTH vocabulary. Note the brackets. [TARGET] sets it to the target FORTH vocabulary. [META] sets it to the META vocabulary. [SHADOW] sets it to the SHADOW vocabulary. [ASSEMBLER] sets it to the metacompiler's ASSEMBLER vocabulary.
These seven words are defined in the host FORTH vocabulary so they can be found from any other vocabulary.
The variable HEADLESS is used to count the number of headerless words. PNAMES is used to count the number of padded words.
The constant DEFINER-WORD is used by the word DEF-RESET to reset the definer words. It is also used by VERIFY to show which definer words were not defined in the target source. It is also used by PATCH-CF to check if the latest word defined in the target is actually a definer word.
The constant TARGET-WORD-CF is used to determine if a word is a target word. TARGET-WORD-CF and the constants MACRO-CF and MUSER-CF are used by GLOSSARY to display appropriate data for target words, macros and the shadows of user variables.
: HSUBR ( -- )
META?
IF
THERE CONSTANT ASSEMBLE EXIT
THEN
SUBR ;
If not metacompiling, it acts as an alias for SUBR , a word in the host system. This is so that even subroutines defined with HSUBR can be tested on the host system.
If metacompiling, it will create a constant in the TARGET vocabulary which holds the current address of THERE , target HERE . It then switches to assembling.
No header or code field is created.
The subroutine DOUBLE.COMPARE is used by four code words. This example just shows one.
DOUBLE.COMPARE has no header and no code field. Its address is found by using the constant DOUBLE.COMPARE in the TARGET vocabulary which is on the host.
If either HEAD or HEADS are ON then the word being defined in virtual memory will have a header. HEAD is normally used to switch headers off for one word at a time. HEADS is normally zero to allow some headerless words and true to forbid headerless words. NH is used just before a word is defined and is a shortcut for the phrase HEAD OFF .
This behavior is not necessary.
Headerless words can be prevented by redefining NH as a no-op just before metacompiling.
Once this is done and the new target is saved, the dictionary can be pruned back with EMPTY . This will expose the original NH and make the system ready for metacompilation again by removing all the TARGET and SHADOW words except the vocabularies defined in those vocabularies.
The same kernel can then be rebuilt with headerless words enabled to compare the size difference.
Given this realization, HEADS would appear to be unnecessary; however, I have a better use for it.
First, I redefine VHEADER and HEADER .
The line just before IF in the source for VHEADER is changed from this:
HEADS is now the default flag for header creation. NH switches HEAD off to prevent a word from having a header. It is then set to the value of HEADS .
A few more words will offer some interesting possibilities. NH and friends.
: CH ( -- ) // CREATE HEADER
HEAD ON ;
: NH ( -- ) // NO HEADER
HEAD OFF ;
: HEADERS ( -- )
HEADS ON CH ;
: NOHEADERS ( -- )
HEADS OFF NH ;
CH will force the creation of a header for the following word in the source just as NH will force the absence of a header for the following word in the source. HEADERS will make creating headers the default. This is overridden for individual words with NH . NOHEADERS will make not creating headers the default. This is overridden for individual words with CH .
By placing NOHEADERS at the beginning of the target source for each disk of source, an entire Forth kernel can be made headerless. CH can be used to cause a few select words to have headers.
Does this bring to mind any interesting possibilities?
[Edit: fixed a typo]
Last edited by JimBoyd on Tue Feb 28, 2023 2:01 am, edited 1 time in total.
The following is not specifically regarding a metacompiler, rather a comment on your words to direct whether or not headers are compiled:
In my '816 Forth kernel assembly-language source, I used the assembler variables HEADERS? and OMIT_HEADERS. HEADERS? gets turned on and off for local use, whereas OMIT_HEADERS is to be turned on only if you want to do the whole thing without headers (which, besides saving memory, would mean that the target computer cannot do its own compilation or assembly), or at least large sections that may have multiple places saying NO_HEADERS...<some code here>...HEADERS. This was mainly so if you want a totally headerless version, you don't have to comment-out all the invocations of the HEADERS macro which turns on the HEADERS? assembler variable.
Later I've thought I should change this to a stack-based thing (again, in the assembler, not the target computer) so you could nest more levels and still have it remember, when it gets to the end of a section, whether it was supposed to go back to laying down headers or not, rather than just automatically turning headers generation back on. The variables are used by the HEADER (no 'S' or '?' on the end) macro which normally automates the creation of each header but does nothing if it's not supposed to be creating a header at the time. I don't remember at the moment what the situation was that told me I should do this; but a problem situation I'm imagining now is where an INCLude file turns the creation of headers off and on but has no memory of what the situation should be when it finishes and returns assembly to the file that called it. Another is where you might have one or more words you want in a NO_HEADERS...HEADERS section, and if you decide to move it to a different part of the file, or even to an INCLude file, you won't have to see if you'll need to modify it.
I'm slowly laying out a board for the 65816 computer that will use this Forth. Up to now I've only run this Forth on my 65802 which is an '816 which drops into an '02 socket, meaning it gives almost all the advantages of the '816 minus the ability to address anything outside bank 0 (ie, the first 64KB). When the true '816 is up, I'll get back on the '816 Forth and possibly implement this headers-on/off stack to keep track of whether or not headers should be created at any given point in the code. Most of what I need to do for it is just finish and test the material to address data in other banks.
: HEADER@ ( -- F )
HEADS @ ;
: HEADER! ( F -- )
DUP HEADS ! HEAD ! ;
Since these words are not intended to be used in the middle of a word's definition, there shouldn't be anything getting in the way on the metacompiler's data stack. It's far more likely that things getting in the way would be on the auxiliary data stack, at least the way I tend to write code like Fleet Forth's kernel.
CODE ?EXIT ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= IF CS>A
0= NOT IF CS>A END-CODE
CODE 0EXIT ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= NOT IF CS>A
0= IF CS>A END-CODE
CODE ?LEAVE ( F -- )
INX INX
$FE ,X LDA $FF ,X ORA
0= IF CS>A
0= NOT IF CS>A END-CODE
CODE (LOOP)
PLA TAY INY
0= IF // BRANCHING OUT OF WORD
SEC PLA 0 # ADC
VS IF // BRANCHING OUT OF WORD
VS NOT IF
A>CS A>CS THEN CS-SWAP
LABEL LEAVE.BODY
PLA PLA
THEN
PLA PLA
A>CS A>CS THEN
A>CS A>CS THEN
LABEL EXIT.BODY
PLA IP STA PLA IP 1+ STA
THEN THEN THEN CS-SWAP CS>A CS>A
LABEL NEXT
1 # LDY
IP )Y LDA W 1+ STA DEY
IP )Y LDA W STA CLC
IP LDA 2 # ADC IP STA
CS NOT IF
W 1- JMP
THEN
IP 1+ INC
W 1- JMP END-CODE
Here is an example of using HEADER@ and HEADER! . Fleet Forth has a word, SR/W , to read and write individual disk sectors. Here are the headerless words used by SR/W .
By placing NOHEADERS at the beginning of the target source for each disk of source, an entire Forth kernel can be made headerless. CH can be used to cause a few select words to have headers.
Although that will work, it's not necessary. Since RESUME does not affect the building of headers, NOHEADERS would only need to be at the beginning of the target source after START on the first source disk. START makes header building the default. RESUME does not affect heading building.
Likewise, if the target source were in an ordinary text file and there were include files, just including the include files would not affect header building. Each include file could change whether headers are built with the words I've mentioned.
Redefining HEADERS and NOHEADERS to reclaim four bytes.
When interpreting, " returns the address of a counted string. The Ansi Forth word S" returns the address of a string and its count.
Even with this taken into consideration, this example doesn't work with one of the versions of Gforth on my computers without further modification.
My Forth treats the value of >IN as unsigned. Any value of >IN greater or equal to the size of the text stream causes WORD to return a size of zero without causing an error.
My Forth's WORD uses a the word 'STREAM to obtain the address of the current location in the text stream and its remaining length.
: 'STREAM ( -- ADR N )
BLK @ ?DUP
IF
BLOCK B/BUF
ELSE
TIB #TIB @
THEN
>IN @
OVER UMIN /STRING ;
OVER UMIN clips the value fetched from >IN so it doesn't exceed the size of the text stream. UMIN returns the unsigned minimum of two numbers. /STRING is a word from Ansi Forth which I found to be quite useful.
/STRING “slash-string” STRING
( c-addr1 u1 n – – c-addr2 u2 )
Adjust the character string at c-addr1 by n characters. The
resulting character string, specified by c-addr2 u2, begins
at c-addr1 plus n characters and is u1 minus n characters long.
There are a few places in the source for Fleet Forth where I use EVALUATE .
The following fills in the table holding the default vector for all the deferred words defined in Fleet Forth's kernel. This is done after the vectors for the deferred words have been set.
START.I&F
" TARGET DUP @ @ OVER 2+ ! 4 +
END.FORGET OVER 2+ U<
HOST >IN !"
COUNT EVALUATE TARGET
DROP
I also use EVALUATEd strings to automate defining parameters to use while building the kernel. For example, in the source for the kernel there is the following:
That number is the largest number of blocks a dual disk version of the 1581 drive could hold, if such a device existed. The EVALUATEd strings which follow round that number up to the closest power of 2 and set other parameters used to define words such as R/W , RAM and DR+ .
The new metacompiler's first big test was metacompiling an existing version of Fleet Forth. Since I'm using VICE instead of a real Commodore 64, the disks are actually disk images and the disk image with the new kernel was identical to the one created with the original metacompiler. I also have SEEALL to test kernels I build. Successfully loading the system loader and the utilities is a test. SEEALL showing the expected disassembly/decompilation is another.
: >META ( -- )
ORDER@
META DEFINITIONS
CO ORDER! ;
: >SHADOW ( -- )
ORDER@
[SHADOW] SHADOW DEFINITIONS
CO ORDER! ;
The word >META saves the values of CONTEXT and CURRENT to the auxiliary stack then changes the search and compile vocabularies to META . CO causes the rest of >META to run after >META's caller exits. That last part of >META restores the CONTEXT and CURRENT vocabularies. >SHADOW does the same thing for the SHADOW vocabulary.
This source