6502.org

Posted: **Sun Jun 17, 2018 12:10 pm**

SamCo has kindly ported the ANSI Forth test suite to Tali Forth 2, and apart from my "duh" stupid errors, there are also some really strange edge cases that are now covered. For instance, this test of DOES> is currently failing, as the log shows:

Code: Select all

{ : weird: create does> 1 + does> 2 + ; -> }  ok
{ weird: w1 -> }  ok
{ ' w1 >body -> here } INCORRECT RESULT: { ' w1 >body -> here } ACTUAL RESULT: { 3938 } ok
{ w1 -> here 1 + }  ok
{ w1 -> here 2 + }  ok

(Yes, the "weird" part is from the official test code https://forth-standard.org/standard/core/DOES.) I did a double take when I first saw this: Yes, there are two DOES> after one CREATE. I didn't even know that was legal. As for what it, uh, does, the first call to w1 will run the first DOES>, and then all others will call the second one. You can do the same thing with three DOES> (here Gforth to be sure):

Code: Select all

: weird3 create does> 1 + does> 2 + does> 3 + ;  ok
weird3 w3  ok
w3 . 139906661999569  ok
w3 . 139906661999570  ok
w3 . 139906661999571  ok
w3 . 139906661999571  ok

I think I understand why this happens, but it does hurt my brain. More practically minded, what can it be used for? Well, one code sequence for the first call of a created word, and a different one for every call after that:

Code: Select all

: hi_there create does> ." Nice to meet you!" does> ." We've already met." ;  ok
hi_there hi  ok
hi Nice to meet you! ok
hi We've already met. ok
hi We've already met. ok

(This leaves stuff on the stack, but gets the point across). It's tempting to use the word "constructor" here, but that would probably be a false analogy.
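A stack-clean variant is easy enough to sketch. The names MEET and HI are made up for illustration, and this assumes a system where the double DOES> works as shown above (Gforth does). Note that HI must still be the most recent definition when it is called for the first time:

```forth
\ Hypothetical defining word: children answer 1 on the first call, 2 afterwards
: meet ( "name" -- )
  create
  does> drop 1     \ first call: discard the data-field address, leave 1
  does> drop 2 ;   \ later calls: discard the address, leave 2

meet hi
hi .   \ 1
hi .   \ 2
hi .   \ 2
```

The DROP is what keeps the stack tidy, since every call of a created word starts with its data-field address on the stack.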

The really weird thing? This bizarre double DOES> code actually worked immediately with Tali Forth. The test is failing because of >BODY. That's a word that is the bane of my existence anyway, because with an STC Forth, there is no real difference between PFA and CFA. Tali gets around this with long, ugly code that checks whether there is a subroutine jump to DOVAR, DOCONST, etc. at the beginning of the code area, and if so, skips it. But >BODY has no way of knowing that a word was created by a CREATE/DOES> construct, and so we're off by three bytes (the length of the subroutine jump instruction), which is why the ANSI test is failing while trying to match that address with HERE.
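For reference, this is roughly what the test wants from >BODY, boiled down to a minimal sketch. PROBE is a made-up name, and the results in the comments are what a standard system such as Gforth gives, not what Tali currently does:

```forth
create probe                 \ hypothetical name; nothing has been allotted yet
' probe >body  here =  .     \ -1: >BODY yields the data-field address, which is HERE
probe          here =  .     \ -1: executing PROBE pushes that same address
```

In other words, >BODY applied to the xt has to give the same address that the created word pushes by default.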

I currently have no idea how to fix this issue (formally https://github.com/scotws/TaliForth2/issues/61) -- ideally, the fix would slay the monster that is >BODY at the same time. We have three unused status bits, one of which would do the trick ("has CFA") -- clear by default, set for DODOES and friends, and words created by CREATE. But that's still ugly. At the moment, we'll document it and live with it.

Posted: **Fri Jun 22, 2018 9:54 am**

But wait, there's more! We had a weird crash during testing when the test files were called in a certain order (see https://github.com/scotws/TaliForth2/issues/76) which SamCo traced down to a problem in my PARSE-NAME code. The weird part, though, is the line in the test code where it was crashing:

Code: Select all

: iffloored [ -3 2 / -2 = invert ] literal if postpone \ then ;

Look at the POSTPONE \ part -- what is happening here is that the comment is being postponed. You can actually do this in Forth:

Code: Select all

: madness postpone \ ; immediate  ok
: madness1 madness ." hi!" ;  compiled
;  ok
madness1  ok

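This is Gforth, just to be sure. Note that the definition of madness1 is not finished on the first line: the immediate word madness runs the comment while madness1 is being compiled, so the rest of that line, including the semicolon, is skipped. Once the semicolon arrives on the next line, the definition is completed without the string printing part. Ye gods.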

Posted: **Fri Jun 22, 2018 1:37 pm**

I have never heard of multiple does> after a create. It might come in handy for object oriented style programming where the first invocation is a constructor, while subsequent invocations return either a pointer or execution token. But I've never had a need for it before now.

Posted: **Fri Jun 22, 2018 8:43 pm**

Martin_H wrote:

But I've never had a need for it before now.

That seems to be the case frequently in Forth, where we can do what we need to do but then someone comes in with off-the-wall stuff like this, and gradually we begin to figure out ways we can implement it to further improve our programming. I don't envision any uses for it yet, but I'm sure the time will come. Cool.

Posted: **Sun Jun 24, 2018 8:32 am**

The conditional word definition trick with POSTPONE \ is the one I'm really trying to remember. Isolating some of the code, we have

Code: Select all

: iffloored [ -3 2 / -2 = invert ] literal if postpone \ then ;
: ifsym     [ -3 2 / -1 = invert ] literal if postpone \ then ;

What this does is define words that comment out the rest of the line unless the condition, calculated at compile time, holds (the INVERT flips the flag, so the line is skipped when the division result does not match). Then in the next step, we use this to decide which words to define (skipping a bunch of code):

Code: Select all

iffloored : t/mod  >r s>d r> fm/mod ;
ifsym     : t/mod  >r s>d r> sm/rem ;

As a last step, we have the actual tests:

Code: Select all

{ 0 1 /mod -> 0 1 t/mod }

So, as an attempt to transfer this, let's have the user give us a number and convert it:

Code: Select all

: get_number ( -- addr u ) ." Number (1 or 2): " pad 10 accept ( u ) pad swap ;
: string>number ( addr u -- u ) 2>r 0. 2r> >number 2drop d>s ;

We set up our magic POSTPONE \ stuff:

Code: Select all

: ifone ( u -- ) 1 = invert if postpone \ then ;  ok
: iftwo ( u -- ) 2 = invert if postpone \ then ;

We use INVERT to make it easier to understand, otherwise iftwo would have the string for one, and that's harder to read. So now we use this to make a printing word:

Code: Select all

get_number string>number dup ( --> user types "1" )
ifone : .number ." It's a one!" ;
iftwo : .number ." It's a two!" ;

In the end, we just use one word:

Code: Select all

.number

Okay, this is a silly example, as a CASE statement would make more sense:

Code: Select all

: .number ( u -- ) case
    1 of ." It's a one!" endof
    2 of ." It's a two!" endof
    ." Wrong number!"
endcase ;

though I think under the hood the conditional commenting uses far fewer resources than the CASE, which is really a bunch of IF statements. One way or another, this is a seriously cool trick, and I can't think of another language where you can do something like this. Lisp, maybe?
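A less silly use might be picking a definition based on something the system already knows when the file loads. This is only a sketch: KEEP-IF and CELL-BITS are made-up names, and it assumes 8-bit address units and a cell size of 2, 4 or 8:

```forth
\ Hypothetical helper: skip the rest of the line unless the flag is true
: keep-if ( flag -- ) 0= if postpone \ then ;

\ Only the line whose test is true gets compiled
1 cells 8 = keep-if : cell-bits ( -- n ) 64 ;
1 cells 4 = keep-if : cell-bits ( -- n ) 32 ;
1 cells 2 = keep-if : cell-bits ( -- n ) 16 ;
```

Here the helper takes the flag directly and does the inverting itself with 0=, so the condition in front of each line reads the same way as the definition it guards.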

Posted: **Sun Jun 24, 2018 10:09 am**

There's some good stuff to think about there. However, I must correct this:

scotws wrote:

though I think under the hood the conditional commenting uses far fewer resources than the CASE, which is really a bunch of IF statements. One way or another, this is a seriously cool trick, and I can't think of another language where you can do something like this. Lisp, maybe?

Here's what my CASE compiles. It's much more efficient than a bunch of IF statements, because OF does the comparison and the parameter stack work too, not just the conditional branching.

of, compiled by OF, is a primitive, and the compiled CFA is followed by the address where to go if the test fails. ENDOF compiles branch's CFA, also two bytes, followed by the address to branch to.

I also have RANGE_OF and SET_OF. An example of RANGE_OF would be like "If TOS is in the range of 23 to 51, then do this..." An example of SET_OF would be like "If TOS is in the set of -4, 19, 44, 45, 171, 189, 392, then do..." All these forms can be mixed in any combinations in the same CASE structure.

Posted: **Sun Jun 24, 2018 6:02 pm**

Martin_H wrote:

I have never heard of multiple does> after a create. It might come in handy for object oriented style programming where the first invocation is a constructor, while subsequent invocations return either a pointer or execution token. But I've never had a need for it before now.

Not sure how that would work since DOES> alters the CFA of the latest word in the dictionary.

Posted: **Sun Jul 01, 2018 11:02 pm**

Okay, I think I understand how the multiple DOES> is working. Possibly. Maybe. At least with Tali Forth.

A brief review (based on Brad's http://www.bradrodriguez.com/papers/moving3.htm): CREATE adds a word to the dictionary. By default, it builds a header and associated code that looks something like this:

Code: Select all

jsr DOVAR
<address>
rts

In classical Forth terms, the JSR instruction lives at the Code Field Address (CFA) and the <address> part is the Parameter Field Address (PFA). Except that this is a Subroutine Threaded Forth (STC), and so there isn't that distinction. Instead, the Execution Token (xt) simply points at the JSR instruction, and DOVAR knows to jump over the <address> part before it returns. By default, the whole thing just pushes that address to the stack.

Now, a simple DOES> after a CREATE is an immediate word that installs two parts in our new word: A component used to define the new words (traditionally named (DOES), but not with Tali) and a JSR to DODOES, a runtime component for the newly defined words that they will later jump to. After that, the actual code after the DOES> is compiled.

So when we use a CREATE/DOES> construct to define a new word, that word is basically just a jump to the DODOES routine of its creator. This may sound a bit complicated, but it is required to move stuff around on the Return Stack to get back where we started.

Now for the double DOES>. Let's start with a simple example:

Code: Select all

: aaa  create  does> drop 0  does> drop 1 ;

The DROP is required because by default, remember, we get the address back of where the code starts. If we disassemble this and strip out the CREATE part and the underflow checks, we get:

Code: Select all

20 A6 88        jsr 88A6        ; (DOES)
20 03 B7        jsr B703        ; DODOES       ; <-- BBB #1

E8              inx             ; DROP
E8              inx     

CA              dex             ; PUSH
CA              dex
74 00           stz 0,x       ; 0
74 01           stz 1,x

20 A6 88        jsr 88A6        ; (DOES)
20 03 B7        jsr B703        ; DODOES        ; <-- BBB #2

E8              inx             ; DROP
E8              inx

CA              dex             ; PUSH
CA              dex
A9 01           lda #01         ; 1
95 00           sta 0,x
74 01           stz 1,x

60              rts

This is what we would expect: Subroutine jump to (DOES), subroutine jump to DODOES, and the actual code ... which contains, of course, the same thing again, but with 1 instead of 0 (both are hard-coded words in Tali, inlined here automatically because they are small). As an aside, note the stack thrashing INX INX DEX DEX combination. Twice. Sigh.

Anyway. What happens when we define a new word, say BBB?

Code: Select all

aaa bbb

We start at the top, and create a new word, adding a JSR to the first (!) DODOES jump. Using SEE, we can confirm this:

Code: Select all

see bbb
nt: 19D5  xt: 19E0  NN
size (decimal): 3
19E0  20 97 19
19E0  jsr 1997

The JSR to $1997 is in fact our first DODOES, the one marked with "BBB #1". The important thing is that we have just defined BBB, and then we stop.

The real magic happens when we actually run BBB. What this does (no pun intended) is jump to the first DODOES, just where the arrow is at "BBB #1". This runs the payload, putting 0 on the stack -- but then continues to where it hits another JSR to the runtime (DOES) word. This does what it is supposed to: It figures out which CREATE it belongs to, and changes that JSR to its own DODOES one instruction down (marked with "BBB #2"). And then it stops. We can confirm this with another SEE:

Code: Select all

see bbb
nt: 19D5  xt: 19E0  NN
size (decimal): 3
19E0  20 AC 19
19E0  jsr 19AC

Notice what has changed: Only the payload address. BBB still lives at the same spot, but when it runs, it now jumps to the second DODOES (marked "BBB #2"). Since there is no third DOES> in this case, that is now the action it takes for now and evermore.

To sum up: This is self-modifying code.

Which is sort of why Forth is the Deadpool of computer languages ...
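For comparison, the same "one action first, another forever after" behaviour can be sketched without patching a JSR, using a deferred word. This is a different technique from what DOES> does in Tali, the names are made up, and it assumes a Forth that provides DEFER and IS (Gforth does):

```forth
defer stage                           \ hypothetical deferred word

: later-calls ( -- n ) 1 ;
: first-call  ( -- n ) 0  ['] later-calls is stage ;   \ re-point STAGE after the first run

' first-call is stage

stage .   \ 0
stage .   \ 1
stage .   \ 1
```

The rewiring is explicit here, and it does not depend on which word happens to be the most recent definition.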

Posted: **Sun Jul 01, 2018 11:55 pm**

That's a very good description of what's going on. I'll have to take another look at it.

In the post at the beginning of this thread, why is the test

Code: Select all

{ ' w1 >body -> here }

expecting here when >body is used on the xt for w1? Is that because there is only a jsr to dodoes compiled into w1, whose actions actually live in the defining word "weird:"? I believe >body skips over the dodoes if it recognizes it, so shouldn't that test work? It shouldn't care if the address has been changed.

I'll have to have a sit-down session with the emulator and see if I can't figure out what this test is actually doing vs what it wants.

Posted: **Mon Jul 02, 2018 8:42 pm**

For the record, the problem was solved by adding a header status flag called "has CFA" (HC) that >BODY uses for its calculations.

Posted: **Tue Jul 03, 2018 7:49 pm**

scotws wrote:

The real magic happens when we actually run BBB. What this does (no pun intended) is jump to the first DODOES, just where the arrow is at "BBB #1". This runs the payload, putting 0 on the stack -- but then continues to where it hits another JSR to the runtime (DOES) word. This does what it is supposed to: It figures out which CREATE it belongs to, and changes that JSR to its own DODOES one instruction down (marked with "BBB #2"). And then it stops. We can confirm this with another SEE
To sum up: This is self-modifying code.

Which is sort of why Forth is the Deadpool of computer languages ...

This isn't quite what happens with the Forth I wrote for the Commodore 64 and it isn't quite what happens in Gforth. Here is a test run from Gforth.

Code: Select all

: weird3 create does> 1 + does> 2 + does> 3 + ;  ok
weird3 w3  ok

see w3 
create w3  
DOES> 1 + (does>) <538976288>  <-858993460>  <-871615693>  2 + (does>) <538976288>  <818688012>  <-1020261181>  3 + ; ok
w3 . 8063465  ok
w3 . 8063466  ok
w3 . 8063467  ok
w3 . 8063467  ok

see w3 
create w3  
DOES> 3 + ; ok

It looks as though w3 is self-modifying, but watch what happens if the test is rerun and another word is defined before w3 is run:

Code: Select all

: weird3 create does> 1 + does> 2 + does> 3 + ;  ok
weird3 w3  ok
: snafu cr ." colon definition." ;  ok
snafu 
colon definition. ok
w3 . 2259433  ok
w3 . 2259433  ok
w3 . 2259433  ok
snafu  ok
. 2259458  ok

see snafu 
create snafu  
DOES> 3 + ; ok

see w3  
create w3  
DOES> 1 + (does>) <538976288>  <-858993460>  <-871615693>  2 + (does>) <538976288>  <818688012>  <-1020261181>  3 + ; ok

weird3 w3b  ok
w3b . 2259545  ok
w3b . 2259546  ok
w3b . 2259547  ok
w3b . 2259547  ok

see w3b 
create w3b  
DOES> 3 + ; ok

w3 changes the CFA of SNAFU so it is no longer a colon definition.
Here is the test run from the Forth I wrote for the C64:

Code: Select all

 OK
 
: WEIRD3 CREATE DOES> 1 + DOES> 2 + DOES> 3 + ;  OK
WEIRD3 W3  OK

SEE W3 
W3
23110 DOES> ' WEIRD3 >BODY  7 +
 23077 1
 23079 +
 23081 (;CODE)
 23083  9206    JMP ' DOES> >BODY 17 +
 23086 2
 23088 +
 23090 (;CODE)
 23092  9206    JMP ' DOES> >BODY 17 +
 23095 3
 23097 +
 23099 EXIT OK

W3 . 23111  OK
W3 . 23112  OK
W3 . 23113  OK
W3 . 23113  OK

SEE W3 
W3
23110 DOES> ' WEIRD3 >BODY 25 +
 23095 3
 23097 +
 23099 EXIT OK

EMPTY  OK
: WEIRD3 CREATE DOES> 1 + DOES> 2 + DOES> 3 + ;  OK
WEIRD3 W3  OK
: SNAFU CR ." COLON DEFINITION." ;  OK
SNAFU 
COLON DEFINITION. OK
W3 . 23111  OK
W3 . 23111  OK
W3 . 23111  OK
SNAFU  OK
. 23124  OK

SEE SNAFU 
SNAFU
23122 DOES> ' WEIRD3 >BODY 25 +
 23095 3
 23097 +
 23099 EXIT OK

// disassemble the body of SNAFU
23122 :DIS 
 23122 CR
 23124 (.") COLON DEFINITION.
 23144 EXIT OK

SEE W3 
W3
23110 DOES> ' WEIRD3 >BODY  7 +
 23077 1
 23079 +
 23081 (;CODE)
 23083  9206    JMP ' DOES> >BODY 17 +
 23086 2
 23088 +
 23090 (;CODE)
 23092  9206    JMP ' DOES> >BODY 17 +
 23095 3
 23097 +
 23099 EXIT OK

WEIRD3 W3B  OK
W3B . 23157  OK
W3B . 23158  OK
W3B . 23159  OK
W3B . 23159  OK

SEE W3B 
W3B
23156 DOES> ' WEIRD3 >BODY 25 +
 23095 3
 23097 +
 23099 EXIT OK

CONSOLE

It has the same behaviour in this case as Gforth.

Posted: **Thu Jul 05, 2018 5:30 pm**

That does appear to be the correct behavior. According to https://forth-standard.org/standard/core/DOES it is supposed to replace the execution semantics of the most recent definition. The testing section on that same page even shows using some words with DOES> in them to modify a more recently defined word.

Forth is officially weird (multiple DOES> with CREATE)