Does EXPECT store a zero at the end of the text?

Post by **IamRob** » Sun Dec 10, 2023 6:52 am

On some Forths I see that no zero is stored at the end of the input in the TIB, and that the TIB is not zeroed out before any input is entered.

During INTERPRET, which calls WORD, I also don't see where >IN is compared with SPAN to end the interpretation.

On another Forth I am working on, Interpret will give strange results on the input line without the ending zero. On the other hand, the ending zero is undesirable when transferring text to a screen or to a text file from the input line.

I am wondering then, what am I not seeing?

If a shorter input is entered into TIB, and TIB has not been zeroed out, then it stands to reason that some existing text from a previous entry still exists in TIB. What is stopping Interpret from going beyond SPAN and trying to interpret old text?

Post by **GARTHWILSON** » Sun Dec 10, 2023 7:37 am

Do the Forths in question copy SPAN to #TIB?

Post by **IamRob** » Sun Dec 10, 2023 10:19 am

No, neither of the other Forths I am looking into uses #TIB.

I have kind of isolated it down to ENCLOSE, which is part of WORD. It apparently copies each word from TIB to HERE, one at a time, then interprets from there. So to end the INTERPRET phase, ENCLOSE leaves a zero at HERE to signify the end of interpreting TIB.

ENCLOSE is obviously a primitive, so I am tracing through it to see why it leaves that zero. It is not obvious at first glance.

Post by **JimBoyd** » Sun Dec 10, 2023 8:21 pm

It looks like the Forths you are referencing are based on the FIG Forth implementation model. As far as I know there was not an actual FIG Forth standard. The implementation model served as a de facto standard so I would not be surprised to see different FIG Forth implementations doing things differently.
To answer the question in the title of this thread, in the Forth-83 Standard EXPECT does not store a zero at the end of the text. The interpreter uses QUERY , which uses EXPECT to fill the text input buffer.

Quote:

QUERY -- M,83
Characters are received and transferred into the memory area
addressed by TIB . The transfer terminates when either a
"return" is received or the number of characters transferred
reaches the size of the area addressed by TIB . The values
of >IN and BLK are set to zero and the value of #TIB is set
to the value of SPAN . WORD may be used to accept text from
this buffer. See: EXPECT "input stream"
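
A minimal sketch of the bookkeeping this definition describes, written in ANS-style Forth so it runs without keyboard input. The names LINE-BUF , LINE-LEN , LINE-OFFSET and FILL-LINE are hypothetical stand-ins for TIB , #TIB , >IN and QUERY ; FILL-LINE copies from a string instead of calling EXPECT , so this is a model, not any system's actual QUERY .

```forth
\ Hypothetical model of a Forth-83 style QUERY (not taken from any real system).
80 constant line-size
create line-buf line-size allot      \ stands in for TIB
variable line-len                    \ stands in for #TIB
variable line-offset                 \ stands in for >IN

: fill-line  ( c-addr u -- )
   line-size min                     \ never accept more than the buffer holds
   dup line-len !                    \ record the length, as QUERY copies SPAN to #TIB
   line-buf swap move                \ copy the text; no terminator is stored
   0 line-offset ! ;                 \ parsing restarts at the beginning

s" 1 2 3 4 5 + + + + ." fill-line    \ a long line
s" 42 ."                fill-line    \ a shorter line laid over it

line-len @ .                         \ length of the current line only
line-buf 4 + c@ emit                 \ the stale "3" from the first line is still there
```

The stale character survives in the buffer, but a parser that stops at LINE-LEN never reaches it, so neither a terminating zero nor a cleared buffer is needed.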

Post by **JimBoyd** » Sun Dec 10, 2023 8:32 pm

I've previously mentioned how my Forth handles the text stream.
There have since been a few changes to improve efficiency, but it works the same.

Post by **JimBoyd** » Thu Jan 04, 2024 12:14 am

JimBoyd wrote:

I've previously mentioned how my Forth handles the text stream.
There have since been a few changes to improve efficiency, but it works the same.

Here is an up to date description of how my Forth handles the text stream. There is no terminating zero.

Post by **BruceRMcF** » Thu Jan 04, 2024 3:30 pm

IamRob wrote:

No, neither of the other Forths I am looking into uses #TIB.

I have kind of isolated it down to ENCLOSE, which is part of WORD. It apparently copies each word from TIB to HERE, one at a time, then interprets from there. So to end the INTERPRET phase, ENCLOSE leaves a zero at HERE to signify the end of interpreting TIB.

ENCLOSE is obviously a primitive, so I am tracing through it to see why it leaves that zero. It is not obvious at first glance.

Yes, this was a common WORD-based approach to the outer interpreter. Since WORD parsed a delimited string into a character-counted string, it needed a transient buffer to receive the parsed string. In fig-Forth the transient buffer was placed where the name would need to be for a new dictionary entry, so no second copy was needed when ":" used WORD to get the name of a new word. Since that style of outer interpreter hands BL to WORD as the delimiter, it would not be unusual for a TIB to be blank terminated rather than NUL terminated.

More modern Forths tend to do the outer-interpreter parsing returning a pointer into the original source, with the count in either a variable or on the stack. They don't use WORD in their outer-interpreter, though they might include it for compatibility with user code that relies on it. In many cases, the blank-delimited parser word accepts any character code $00-$1F as white space, in addition to the space character, which avoids problems with interspersed tabs and simplifies working with systems that deliver nul-delimited input strings.
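
A minimal sketch of that style of parser in ANS Forth. The words WHITE? , SKIP-WHITE , SCAN-WHITE , NEXT-TOKEN and COUNT-TOKENS are hypothetical names made up for illustration; real systems spell and factor this differently.

```forth
\ Hypothetical helpers: parse by address and count, with no terminator byte.
: white?  ( char -- flag )  bl 1+ < ;        \ true for codes $00 through $20

: skip-white  ( c-addr u -- c-addr' u' )     \ step over leading white space
   begin dup while over c@ white? while 1 /string repeat then ;

: scan-white  ( c-addr u -- c-addr' u' )     \ step to the next white space
   begin dup while over c@ white? 0= while 1 /string repeat then ;

: next-token  ( c-addr u -- c-addr' u' tok-addr tok-len )
   skip-white
   over >r                                   \ remember where the token starts
   scan-white
   over r@ -                                 \ token length = end - start
   r> swap ;

: count-tokens  ( c-addr u -- n )
   0 >r
   begin next-token nip while r> 1+ >r repeat
   2drop r> ;

\ Sample text "a<TAB>b <NUL>c": a tab, a space and a NUL all act as separators.
create mixed-text  char a c,  9 c,  char b c,  bl c,  0 c,  char c c,
mixed-text 6 count-tokens .
```

The token is returned as an address and length pointing into the original text, and a token length of zero signals that the source is exhausted.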

Post by **JimBoyd** » Sun Jan 07, 2024 8:16 pm

IamRob wrote:

No, neither of the other Forths I am looking into uses #TIB.

I have kind of isolated it down to ENCLOSE, which is part of WORD. It apparently copies each word from TIB to HERE, one at a time, then interprets from there. So to end the INTERPRET phase, ENCLOSE leaves a zero at HERE to signify the end of interpreting TIB.

ENCLOSE is obviously a primitive, so I am tracing through it to see why it leaves that zero. It is not obvious at first glance.

It would be helpful if you told us the names of these Forths. I presented how my own Fleet Forth handles the text stream and in a recent post I presented how 64Forth, a FIG Forth for the Commodore 64, handles the text stream.

Post by **IamRob** » Mon Jan 08, 2024 8:33 am

So far I have collected 14 Forths written for the Apple II computer, most of which are early varieties based on '78 and '79 Forths. Four would be considered more advanced: three follow FigForth, and one I just came across, called UniForth, follows, or tries to follow, the '83 Forth standard.

The one I had been working with up till now is called ProForth. It has a little more flexibility than the other two FigForth types, Mad Apple Forth and QForth.

UniForth has a lot of potential, but too many of the words that handle compiling are written in Forth rather than as primitives. EXPECT is one of those words, and my attempts at making it into a primitive seem to break it. There are also many primitives that seem to serve no useful purpose. UniForth does have some good features, though: a fairly full-featured assembler, and a not-too-bad screen editor written in Forth, albeit with poor Control-key choices.

Back to the zero at the end of words, after EXPECT has input a word definition, then when following INTERPRET, it isn't clear how INTERPRET ends. I thought at first the semi-colon was ending a line, but variables and constants don't have semi-colons. There also doesn't seem to be an end with a RETURN char. And EXPECT doesn't store a zero at the end of its input, nor is the buffer cleared with zeroes before EXPECT is called.

I thought I had uploaded a disk image with UniForth on this forum as I think someone might be able to make use of either some of the Forth words or a pretty good Assembler written in 6502 assembly.

Post by **BruceRMcF** » Tue Jan 09, 2024 2:48 am

IamRob wrote:

... Back to the zero at the end of words, after EXPECT has input a word definition, then when following INTERPRET, it isn't clear how INTERPRET ends.

According to the fig Forth glossary I found online, INTERPRET is the fig Forth outer interpreter, so it follows that it would be an endless loop, like any normal Forth outer interpreter: barring a word that cannot be interpreted, an abort, or a crash, it would never end. You parse a word; depending on state and whether it is immediate, you compile it or interpret it; if it is not in the dictionary you convert it into a number; then you get the next one. If along the way the text input buffer is emptied, you fill it again, including a prompt for more input if in interpret state.

";" finishes up the definition, compiling EXIT and clearing the smudge bit, and setting state to interpret, but then INTERPRET would continue with the next token encountered.

Post by **IamRob** » Tue Jan 09, 2024 6:37 pm

BruceRMcF wrote:

IamRob wrote:

... Back to the zero at the end of words, after EXPECT has input a word definition, then when following INTERPRET, it isn't clear how INTERPRET ends.

";" finishes up the definition, compiling EXIT and clearing the smudge bit, and setting state to interpret, but then INTERPRET would continue with the next token encountered.

This is what doesn't make sense though. Variables and constants don't end with a ";", so when interpreting TIB, there needs to be something to indicate end of input.

All the Fig Forths have been programmed with a hidden NULL word to end the input, which is why a zero is put at the end of the TIB. None of the Fig Forths have the word "SPAN" in them, nor #TIB. On Fig Forths, EXPECT actually copies the ending zero into the TIB. The problem is that I don't know of a way to program EXPECT to know whether it is supposed to include the zero or not. EXPECT actually uses a separate buffer and then uses CMOVE to copy the input to the TIB or to a memory address (as when storing strings). Normally, screens are edited with a screen editor external to Forth, which doesn't have this "zero" problem. I am hoping to use EXPECT to edit my screens, but a zero in the middle of a screen will end INTERPRET and disregard the rest of the screen.

UniForth, which is based on '83 Forth, is the first Forth that I have come across that has a Forth editor and has both SPAN and #TIB, but I am not seeing where INTERPRET comes across them to end interpreting in the TIB, nor is there a zero to end interpreting. I was hoping to understand how INTERPRET uses either of these words, so I can then hopefully eliminate the zero in my Fig Forths.

With all this being said, say a fairly long line has been entered into the TIB and interpreted, and the next line is shorter. Some remnants of the previous line would be left in the TIB. Unless the TIB is first cleared, a zero is stored at the end of the input, or the input length is compared directly against SPAN or #TIB, I am not seeing where INTERPRET ends.

I was hoping to understand how UniForth does its INTERPRET so that its TIB does not require the zero.

Post by **BruceRMcF** » Wed Jan 10, 2024 12:17 pm

IamRob wrote:

BruceRMcF wrote:

IamRob wrote:

... Back to the zero at the end of words, after EXPECT has input a word definition, then when following INTERPRET, it isn't clear how INTERPRET ends.

";" finishes up the definition, compiling EXIT and clearing the smudge bit, and setting state to interpret, but then INTERPRET would continue with the next token encountered.

This is what doesn't make sense though. Variables and constants don't end with a ";", so when interpreting TIB, there needs to be something to indicate end of input.

But the end of a definition has nothing to do with the end of a line.

Code:

: this. 1 . ;  : that. 2 . ;  : theother. 3 . ;

... is perfectly well-formed Forth. Not including anything after the ";" on a line is just a convention; it is not syntax.

";" ends the definition and changes the state back to interpret mode. Variables, constants, ":" and other defining words need more input because they take the name they are defining from the input stream, but regular compiling words like ";" or "IF" don't care if anything is following them or not.

IamRob wrote:

... UniForth, which is based on '83 Forth, is the first Forth that I have come across that has a Forth editor and has both SPAN and #TIB, but I am not seeing where INTERPRET comes across them to end interpreting in the TIB, nor is there a zero to end interpreting. I was hoping to understand how INTERPRET uses either of these words, so I can then hopefully eliminate the zero in my Fig Forths.

I'm more familiar with CamelForth than UniForth, so I'll say how it does it. I was wrong when I called INTERPRET an endless loop: it is QUIT that is an endless loop, and INTERPRET is the common factor for interpreting a single chunk of text, whether that is an input line in TIB, a string passed to EVALUATE, or a BLOCK.

CamelForth INTERPRET works with a BEGIN / WHILE / REPEAT loop, where the loop control is:
... BEGIN BL WORD DUP C@ WHILE ... REPEAT ...

... which uses a Forth79/Forth83/ANSForth lineage "WORD" that returns the address of the transient buffer being used by WORD. It is WORD that detects that the TIB is exhausted when it returns a zero-length string.

Now, fig-Forth WORD does not return the address of the transient WORD buffer, but it does work by copying the parsed text into a transient buffer, so if you rewrite WORD so that it first checks whether there is nothing left to parse in the buffer, and in that case rather than parsing, you store a 16-bit zero into the WORD buffer -- one zero byte for the zero count, and one zero byte for the trailing NUL -- that might be enough for the fig-Forth INTERPRET, so that there doesn't have to be a NUL in the TIB itself.
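
As a small illustration of the BEGIN BL WORD DUP C@ WHILE loop control described above, here is a sketch in ANS Forth. COUNT-REST and DEMO are hypothetical names; EVALUATE is used only to supply an input source of known length, and the loop body just counts words where an interpreter would look them up.

```forth
\ Hypothetical demo: count the blank-delimited words left in the input source.
: count-rest  ( "words..." -- n )
   0
   begin
      bl word c@          \ length byte of the next parsed word
   while                  \ a zero length means the source is exhausted
      1+
   repeat ;

: demo  ( -- n )  s" count-rest alpha beta gamma" evaluate ;
demo .
```

The loop ends because WORD returns a zero-length counted string once the input source is used up; nothing in the source text itself marks the end.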

Post by **JimBoyd** » Thu Jan 11, 2024 1:03 am

BruceRMcF wrote:

According to the fig Forth glossary I found online, INTERPRET is the fig Forth outer interpreter, so it follows that it would be an endless loop, like any normal Forth outer interpreter: barring a word that cannot be interpreted, an abort, or a crash, it would never end. You parse a word; depending on state and whether it is immediate, you compile it or interpret it; if it is not in the dictionary you convert it into a number; then you get the next one. If along the way the text input buffer is emptied, you fill it again, including a prompt for more input if in interpret state.

FIG Forth's INTERPRET does look like an endless loop, but it isn't. When the text stream is exhausted WORD leaves a counted string at HERE . The string has a count of 1 and a single null for the text. There is such a word in the FIG Forth dictionary.

Code:

 : X
   BLK   @
   IF  ?EXEC  ENDIF
   R>   DROP   ; IMMEDIATE

The name is not really X , it is the null name and when found will force INTERPRET to exit. This is how INTERPRET 'knows' when to stop interpreting.
As with the Forth-83 Standard, FIG Forth's endless loop is QUIT .

Code:

: QUIT
   0 BLK !
   [COMPILE] [
   BEGIN
      RP!
      CR QUERY INTERPRET
      STATE @ 0=
      IF   ." OK"   ENDIF
   AGAIN ;

QUIT uses QUERY to refill the text input buffer.
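
The null-name mechanism described above can be modelled in ANS Forth. This is a rough sketch, not the fig-Forth source: SRC , FIG-TOKEN , NULL-NAME? and COUNT-WORDS are hypothetical names, FIG-TOKEN only imitates the way ENCLOSE treats a null as an unconditional delimiter, and the dictionary search is replaced by a simple test for the null name.

```forth
\ Hypothetical model of fig-Forth style parsing over a NUL-terminated buffer.
variable src                     \ address of the next unparsed character

: fig-token  ( -- c-addr u )
   begin src @ c@ bl = while 1 src +! repeat      \ skip leading blanks
   src @ dup c@ 0= if 1 exit then                 \ at the NUL: a 1-char token
   begin src @ c@ dup bl <> swap 0<> and while 1 src +! repeat
   src @ over - ;                                 \ stop before a blank or NUL

: null-name?  ( c-addr u -- flag )  1 = swap c@ 0= and ;

: count-words  ( c-addr -- n )   \ stands in for INTERPRET
   src !  0
   begin fig-token null-name? 0= while 1+ repeat ;

\ "ab cd<NUL>ef<NUL>": a NUL in the middle hides everything after it.
create screen-text
   char a c, char b c, bl c, char c c, char d c, 0 c,
   char e c, char f c, 0 c,
screen-text count-words .
```

Only the two words before the first NUL are counted, which mirrors the problem of a stray zero in the middle of a screen cutting interpretation short.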

Post by **JimBoyd** » Thu Jan 11, 2024 1:30 am

IamRob wrote:

UniForth, which is based on '83 Forth, is the first Forth that I have come across that has a Forth editor and has both SPAN and #TIB, but I am not seeing where INTERPRET comes across them to end interpreting in the TIB, nor is there a zero to end interpreting. I was hoping to understand how INTERPRET uses either of these words, so I can then hopefully eliminate the zero in my Fig Forths.

With all this being said, say a fairly long line has been entered into the TIB and interpreted, and the next line is shorter. Some remnants of the previous line would be left in the TIB. Unless the TIB is first cleared, a zero is stored at the end of the input, or the input length is compared directly against SPAN or #TIB, I am not seeing where INTERPRET ends.

I was hoping to understand how UniForth does its INTERPRET so that its TIB does not require the zero.

Since UniForth is based on the Forth-83 Standard, WORD will have access to the length of the input stream.

Code:

      input stream
               A sequence of characters available to the system, for
               processing by the text interpreter.  The input stream
               conventionally may be taken from the current input device
               (via the text input buffer) and mass storage (via a block
               buffer).  BLK , >IN , TIB and #TIB specify the input stream.
               Words using or altering BLK , >IN , TIB and #TIB are
               responsible for maintaining and restoring control of the
               input stream.
               The input stream extends from the offset value of >IN to the
               size of the input stream.  If BLK is zero the input stream
               is contained within the area addressed by TIB and is #TIB
               bytes long.  If BLK is non-zero the input stream is
               contained within the block buffer specified by BLK and is
               1024 bytes long.   See:  "11.8 Input Text"

For reference, here is the source for my Forth's WORD

Code:

2VARIABLE HISTORY
: WORD  ( C -- HERE )
   'STREAM                  \ Return address and count of remaining input stream.
   BLK 2@ HISTORY 2!        \ Save the contents of BLK and >IN to HISTORY.
   DUP >IN +!               \ Use count from 'STREAM to push >IN past input stream.
   2PICK SKIP               ( delimiter address count )
   ROT 2PICK -ROT SCAN      ( address address2 count2 )
   1- 0 MAX NEGATE >IN +!   \ Use count from SCAN to pull >IN to correct offset.
   OVER - >HERE ;

and the source for 'STREAM , a nonstandard word that returns the unprocessed portion of the input stream (the portion from >IN to the end of the input stream).

Code:

: 'STREAM  ( -- ADR N )
   BLK @ ?DUP
   IF
      BLOCK B/BUF
   ELSE
      TIB #TIB @
   THEN
   >IN @
   OVER UMIN /STRING ;

In 'STREAM the word /STRING adds >IN to the address of the input stream and subtracts >IN from the length of the input stream. This is how WORD can know when the input stream is exhausted without a trailing zero byte.
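
To make the arithmetic concrete, here is a small ANS Forth sketch of the same idea. REMAINING and OLD-LINE are hypothetical names, and MIN is used in place of UMIN (which is not in the ANS core word set); this is an illustration, not Fleet Forth's code.

```forth
\ Hypothetical ANS-style rendering of the idea behind 'STREAM .
: remaining  ( c-addr u offset -- c-addr' u' )
   over min          \ clamp the offset so it cannot pass the end
   /string ;         \ advance the address, shrink the count

create old-line 16 allot
s" DUP * . CR" old-line swap move     \ text left over from a longer line
s" 7"          old-line swap move     \ new, shorter line: only 1 byte is valid

old-line 1 0 remaining nip .          \ count before anything is parsed
old-line 1 1 remaining nip .          \ count once the "7" has been consumed
```

Once the offset reaches the recorded length the remaining count is zero, so the leftover "UP * . CR" in the buffer is never handed to the parser.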

Post by **JimBoyd** » Thu Jan 11, 2024 1:53 am

BruceRMcF wrote:

I'm more familiar with CamelForth than UniForth, so I'll say how it does it. I was wrong when I called INTERPRET an endless loop: it is QUIT that is an endless loop, and INTERPRET is the common factor for interpreting a single chunk of text, whether that is an input line in TIB, a string passed to EVALUATE, or a BLOCK.

I missed that when I posted my reply. Sorry.

BruceRMcF wrote:

CamelForth INTERPRET works with a BEGIN / WHILE / REPEAT loop, where the loop control is:
... BEGIN BL WORD DUP C@ WHILE ... REPEAT ...

My Forth does something similar.

Code:

: INTERPRET
   BEGIN
      PAUSE NAME C@ 0EXIT
      HERE I/C
   AGAIN -;

BruceRMcF wrote:

... which uses a Forth79/Forth83/ANSForth lineage "WORD" that returns the address of the transient buffer being used by WORD. It is WORD that detects that the TIB is exhausted when it returns a zero-length string.

Now, fig-Forth WORD does not return the address of the transient WORD buffer, but it does work by copying the parsed text into a transient buffer, so if you rewrite WORD so that it first checks whether there is nothing left to parse in the buffer, and in that case rather than parsing, you store a 16-bit zero into the WORD buffer -- one zero byte for the zero count, and one zero byte for the trailing NUL -- that might be enough for the fig-Forth INTERPRET, so that there doesn't have to be a NUL in the TIB itself.

FIG Forth's EXPECT doesn't return the count of text it receives. It seems like IamRob wants to rewrite EXPECT so it does not store a zero at the end of the text it stores in its buffer. It could also be rewritten to return the count of received characters or a workaround could be used.

Code:

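DECIMAL
: QUERY
   TIB @ DUP 80 BLANKS     ( blank-fill the whole buffer first )
   80 EXPECT               ( then read into that same address )
   0 IN ! ;                ( restart parsing, as the fig-Forth model's QUERY does )

: WORD
   BLK @
   IF
      BLK @ BLOCK B/BUF
   ELSE
      TIB @ 80             ( fixed length; unused space is blank )
   ENDIF
   ...    ( The rest of WORD )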
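
The effect of the blank-fill workaround can be checked with a small ANS Forth model. FIXED-BUF , FIXED-SIZE and FRESH-LINE are hypothetical names, and FRESH-LINE copies from a string rather than calling EXPECT ; it is a sketch of the idea only.

```forth
\ Hypothetical model: FRESH-LINE stands in for a blank-filling QUERY .
80 constant fixed-size
create fixed-buf fixed-size allot

: fresh-line  ( c-addr u -- )
   fixed-buf fixed-size blank           \ ANS BLANK ; fig-Forth spells it BLANKS
   fixed-size min fixed-buf swap move ; \ lay the new text over the blanks

s" : SQUARE DUP * ;" fresh-line         \ a longer line first
s" 5 ."              fresh-line         \ then a shorter one

fixed-buf fixed-size -trailing nip .    \ significant length of the current line
```

Because everything after the new text is blank, a parser that always scans the full 80 bytes finds only delimiters there and never sees remnants of the earlier line.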
