6502.org • various PETTIL design considerations - Page 4

Page 4 of 8

Re: screen editor design

Posted: Tue Jun 24, 2014 4:00 pm

by chitselb

Code: Select all

Brad, that ROBODoc link is a Tiddlywiki

. This looks like a very well-organized and versatile documentation tool. I went with dumping raw wiki markup in between #ifdef 0 and #endif in my assembler source, which doesn't give me as many options on the output side. The ruby script generates the binary symbol table (contains a code field address, length/flags byte and the name of each word), a textfile with just the list of word names (I feed that to the pearson hash cruncher tool) and a json file for the tiddywiki itself

Here's the code for building the docs for PETTIL from a json file. The json file has title, tags, and text for each tiddler (say it three times fast). That's it. I had to remove one line from the tiddlywiki.info file (the tiddlyweb plugin) to get it hostable from a file section.

echo . . . . Building docs/tiddlypettil.html
mkdir -p ./build/tiddlypettil/tiddlers
cp ./docs/statictiddlers/tiddlywiki.info ./build/tiddlypettil/
cp ./docs/statictiddlers/*.tid ./build/tiddlypettil/tiddlers/
cd ./build/tiddlypettil/
tiddlywiki --load ../pettil.json >/dev/null
tiddlywiki --rendertiddler $:/core/save/all tiddlypettil.html text/plain >/dev/null

I just rewrote the outer interpreter/compiler shell and am debugging that. I had been using Scott Ballantyne's outer interpret from Blazin' Forth, but there was a lot of weird stuff. His has a word RUN in it? His code for ']' had the compiler while the code for 'INTERPRET' didn't, which seemed like a good idea until I tried compiling a multi-line colon definition and it doesn't compile anything after the first line of input.

I am also pondering <BUILDS DOES> , specifically
http://www.vintagecomputer.net/fjkraan/ ... _00-09.pdf and this http://www.vintagecomputer.net/fjkraan/ ... -index.pdf . The second paper is all about <BUILDS DOES>

a) Why was <BUILDS renamed into CREATE ?
b) Given a two-dictionary model (compiler/interpreter/editor/assembler/symbols all in a transient dictionary, headless core dictionary at bottom of memory) what's a good way to set up <BUILDS ... DOES> such that compiler time code disappears along with the transient dictionary, while child-word runtime behavior goes in the core?
c) Who ever uses <BUILDS DOES> for anything?
d) Does <BUILDS have to be the very first thing in the definition of the builder? It was in every example I could find. Then it could be coded as (rewinding the dictionary pointer and starting the definition's CFA over)

Code: Select all

LATEST NAME> DP ! 'dodoes CFA,

otherwise it becomes verbose and clunky
e) If I'm calling it <BUILDS (not CREATE), and it's no longer in the standard, can I play fast and loose with how it operates?

more on e)
I wanted a word ?: to work like the ternary operator in Ruby/C/Java.

Code: Select all

a = true  ? 'a' : 'b' #=> "a"
b = false ? 'a' : 'b' #=> "b"

The advantage of ?: is it compiles in just six bytes ( ?: affirmative noway )

Usage like

Code: Select all

( args flag ) ?: AFFIRMATIVE NOWAY MORECODE

performs identically to

Code: Select all

( args flag ) IF AFFIRMATIVE ELSE NOWAY THEN MORECODE

which has a bigger compiled footprint because of branching

Code: Select all

: ?:   ( == ) ( flag -- )
    <BUILDS ['] , ['] , DOES> 0= -2 * + @ EXECUTE ( somehow move IP beyond both words ) ; IMMEDIATE

does> would see the address in the dictionary immediately following ?: which is where the trueword is stored.
I want

Code: Select all

['] , ['] ,

to go in the transient (compile time) dictionary and

Code: Select all

0= -2 * + @ EXECUTE ;

to go in the core (runtime)

This would not work if <BUILDS performs a CREATE operation. There would be a JSR DOCREATE immediately following ?: in the core, and a new symbol table entry for TRUEWORD. The corollary to this statement is that words using <BUILDS need to perform their own CREATE. That's where I play with the conventional meaning of <BUILDS

Code: Select all

: CONSTANT   <BUILDS  CREATE ,   DOES> @ ;

And this is where I get all confused, between code fields (the JSR at the beginning of a definition) and execution tokens (the two-byte address of a code field). To distill it all down to a simple question, I suppose that would be "What if CREATE and <BUILDS were different, and <BUILD (in the typical use case) had to invoke CREATE ?"

Re: screen editor design

Posted: Thu Jun 26, 2014 1:26 pm

by Brad R

I have not read your linked PDFs, but will comment based on my own knowledge/opinion.

chitselb wrote:

a) Why was <BUILDS renamed into CREATE ?

This is half history and half speculation on my part: the original Fig-Forth <BUILDS DOES> was rather inefficient; it compiled an additional cell into the parameter field of each "child" definition. Some clever programmer figured out how to dispense with that. Having done so, someone noticed that <BUILDS and CREATE were identical except in what they compiled into the code field, and since the new DOES> changes the code field, CREATE could be used wherever <BUILDS was used. So for economy, they got rid of the redundant <BUILDS. (If you haven't read it already, my writeup on DOES> is at http://www.bradrodriguez.com/papers/moving3.htm .)

I should add that I have since found a situation -- compiling directly to Flash memory -- where CREATE cannot be used with DOES>, and I have begun putting <BUILDS (new version, not the old version) back into my own Forth implementations.

Quote:

b) Given a two-dictionary model (compiler/interpreter/editor/assembler/symbols all in a transient dictionary, headless core dictionary at bottom of memory) what's a good way to set up <BUILDS ... DOES> such that compiler time code disappears along with the transient dictionary, while child-word runtime behavior goes in the core?

The dividing line is DOES>, an IMMEDIATE word that runs at compile time. Right now it commonly compiles the execution token of (;CODE) , complies a machine code fragment, and then the compiler continues through the remaining code:

... foo foo DOES> bar bar ... is compiled to
|foo xt|foo xt|(;CODE) xt|JSR DODOES|bar xt|bar xt|

...foo foo ;CODE assembly-code ... is compiled to
|foo xt|foo xt|(;CODE) xt|assembly code|

(;CODE) is performed while executing the parent word; its action is to stuff the following address into the code field of the newly created child word, and then exit the parent word. So the child word ends with a code field pointing to the JSR DODOES, and the action of JSR DODOES is to stack the child word's parameter field address, and start the thread that follows the JSR DODOES.

What you'd need to do is have DOES> lay down the xt of (;CODE) and a pointer to the permanent dictionary, and then switch to the permanent dictionary to continue compilation:

... foo foo DOES> bar bar ... is compiled to
(transient) |foo xt|foo xt|(;CODE) xt|adrs of X|
(permanent) X: |JSR DODOES|bar xt|bar xt|

So now (;CODE), instead of stuffing the following address into the child's code field, fetches the contents of the following address, and stuffs that into the child's code field.

I hope that's clear. The revised ;CODE is left as an exercise for the student.

Quote:

c) Who ever uses <BUILDS DOES> for anything?

Lots of Forth programmers. Any time you have a common action that needs to be applied to different data, <BUILDS DOES> can provide economy and simplification. I suspect most people first encounter the use of <BUILDS DOES> when looking at the code for Forth assemblers. (See, for example, my MSP430 assembler.) I sometimes tell people that Forth is an object-oriented language, with the restriction that each object can have only one method (the DOES> code).

Quote:

d) Does <BUILDS have to be the very first thing in the definition of the builder? It was in every example I could find. Then it could be coded as (rewinding the dictionary pointer and starting the definition's CFA over)

Code: Select all

LATEST NAME> DP ! 'dodoes CFA,

otherwise it becomes verbose and clunky

Neither <BUILDS nor CREATE must be the first thing in the parent. It may, for example, be necessary to perform some action on the dictionary before creating the new dictionary header. When the logic of the code allows it, it's customary to put <BUILDS or CREATE first, simply to let the reader know that it's a defining word, but that's not a requirement.

Quote:

e) If I'm calling it <BUILDS (not CREATE), and it's no longer in the standard, can I play fast and loose with how it operates?

Yes you can. Bear in mind that I've resurrected <BUILDS for my own purposes, so there will be at least one competing alternative, but as far as the Standard is concerned, you can do whatever you want. (I would love to see <BUILDS returned to the Standard, but I'd need 80-bit floating point to express the odds of that happening.)

More to follow...

Re: screen editor design

Posted: Thu Jun 26, 2014 1:51 pm

by Brad R

chitselb wrote:

Usage like

Code: Select all

( args flag ) ?: AFFIRMATIVE NOWAY MORECODE

performs identically to

Code: Select all

( args flag ) IF AFFIRMATIVE ELSE NOWAY THEN MORECODE

which has a bigger compiled footprint because of branching

Code: Select all

: ?:   ( == ) ( flag -- )
    <BUILDS ['] , ['] , DOES> 0= -2 * + @ EXECUTE ( somehow move IP beyond both words ) ; IMMEDIATE

does> would see the address in the dictionary immediately following ?: which is where the trueword is stored.
I want

Code: Select all

['] , ['] ,

to go in the transient (compile time) dictionary and

Code: Select all

0= -2 * + @ EXECUTE ;

to go in the core (runtime)

First, you want to use ' instead of ['] inside your ?: word. What you have written will put the xt of , on the parameter stack, twice, and won't compile anything between <BUILDS and DOES>.

Second, you only use <BUILDS or CREATE when you are defining a new Forth word. What you are doing is essentially a new control structure; look at the source code for IF THEN or BEGIN WHILE REPEAT for inspiration. Following the custom of naming the run-time action with parentheses, you want something like this:

Code: Select all

: ?:  ['] (?:) ,   ' ,   ' ,  ; IMMEDIATE

First ['] gets the xt of the following word given at compile time, namely (?:) , and that is compiled into the dictionary. Then ' reads a word name from the input stream and gets its xt, and that is compiled into the dictionary. And again for the second word name in the input stream. (Note that in ANS Standard Forth you'd do the first step with POSTPONE (?:) ) This word can reside in your transient dictionary.

The run-time action is in the word (?:) which must reside in your permanent dictionary. It will look something like this:

Code: Select all

: (?:)  ( f -- )   0= NEGATE 2*  R@ + @ EXECUTE  R> 4 + >R ;

This assumes conventional use of the return stack in a direct- or indirect-threaded Forth. 0= NEGATE 2* (or 0= NEGATE CELLS in a Standard Forth, with a two's-complement environmental dependency) gives you 0 or 2. R@ + @ EXECUTE will perform AFFIRMATIVE or NOWAY while leaving nothing of its own on the stack -- that is important if AFFIRMATIVE or NOWAY consume or return stack values. Finally R> 4 + >R (in Standard Forth, R> 2 CELLS + >R) skips IP over AFFIRMATIVE and NOWAY. (I give Standard Forth examples to illustrate proper form, but this source code will never be a Standard Forth application word -- it is too dependent upon the specifics of the implementation.)

Note: I haven't tested that code; it's just off the top of my head.

Re: screen editor design

Posted: Thu Jul 03, 2014 5:47 am

by chitselb

Brad,

Thanks, that really cleared things up for me. <BUILDS always creates a word, and that's that.

Here's what the ?: code turned out to be:

Code: Select all

;--------------------------------------------------------------
#if 0
name=?:
stack=( "name1" "name2" -- )
tags=control,compiler,unimplemented
flags=immediate
Immediate word that compiles its own runtime word (?:) and two branches. The first branch is the true
branch, and the second is the false branch.  One of those is executed by (?:) at runtime.

Used in the form

```
( flag ) :? this that

is equivalent to

if this else that then

: ?:   ( "name1" "name2" -- )
    ?comp compile (?:) ' , ' , ; immediate
```
#endif
_querycolon
#include "enter.i65"
	.word _qcomp
#include "page.i65"
	.word _compile
	.word pquerycolon
#include "pad.i65"
	.word _tick
#include "page.i65"
	.word _comma
#include "page.i65"
	.word _tick
#include "page.i65"
	.word _comma
#include "page.i65"
	.word exit

;--------------------------------------------------------------
#if 0
name=(?:)
stack=( flag -- )
tags=control,inner,nosymbol
The runtime of [[?:]]
#endif
pquerycolon
    ldy #5
    lda tos
    ora tos+1                   ; evaluate the flag
    beq pquerycolon01           ; false?
    dey
    dey                         ; true
pquerycolon01
    lda (ip),y
    sta tos+1
    dey
    lda (ip),y
    sta tos                     ; branch CFA to TOS
    lda ip+1
    pha
    lda ip
    pha                         ; preserve mainline IP
#include "toforth.i65"
    .word execute               ; perform one of the branches
#include "page.i65"
    .word to6502
    pla
    sta ip
    pla
    sta ip+1                    ; restore mainline IP
    lda #6
    jmp pad                     ; skip past both branches

Re: screen editor design

Posted: Thu Jul 03, 2014 6:50 am

by chitselb

It's time to write the code to redirect INTERPRET's input stream from the keyboard to the blocks. I've sort of painted myself into a corner by committing to a plan that treats a BLOCK (of data or code) a bit differently from a BLOCK containing a "screen" (of code, if we're interpreting it). Misinterpreting INTERPRET is the blog post if anyone wants to

a) try to talk some sense into me about just using vanilla blocks and resist even more of the Feeping Creaturitis that seems to be going on here
b) help me do it

Re: screen editor design

Posted: Fri Jul 04, 2014 4:07 pm

by chitselb

I found this thread, "The most elegant Forth interpreter" and it got me thinking, what if I dispense with TIB #TIB SPAN >IN and just had WORD ? If I vector it, using JMP indirect (someuservariable) I can point it to one of a few conformant handlers. One to fetch tokens from the keyboard, another to fetch tokens from the 1000 PETSCII screen codes and 25 bits of linewrap I call a "screen", another to fetch tokens from 1024 byte classic ASCII blocks, and another to read sequential files.

Other than losing Forth-83 compliance (TIB #TIB SPAN >IN are required words) what would be wrong with that?

Re: screen editor design

Posted: Fri Jul 04, 2014 5:26 pm

by barrym95838

The subject is a bit too advanced for me to comment on it's feasibility, but maybe I could just offer the following:

How much effort and time would you need to explore this possibility far enough to form a solid opinion, and are you prepared to throw that effort and time away if it turns into a dead end?

It boils down to a "risk vs. benefit" type situation. I would be tempted to go for it, but your personal situation may be different than mine.

Best wishes either way,

Mike

Re: screen editor design

Posted: Sun Jul 06, 2014 5:07 am

by chitselb

All I'm doing here is reworking INTERPRET, from one of four sources
1) "TIB" terminal input buffer, an 80 character buffer that starts at TIB with SPAN characters in it from the last time the user hit return, and a cursor >IN that counts from 0 to 79
2) "BLOCK" 1024 characters that can be treated like a single continuous line

so far, nothing new.

3) "SCREEN" 1003 character structure that starts with 24 bits of linewrap followed by 1000 characters of Commodore screencodes (where "A" = $01 not $41, etc..) This is a tricky one. The screen is really nothing more than a block, tagged as a screen by a bit of metadata in the packet header. It is to be parsed as a collection of 40/80 column lines per the linewrap bits, with an implied carriage return (whitespace) after each line.
4) "FILE" sequential file of ASCII text, parsed as a continuous stream until EOF. Another tricky one, because tape loads 191 bytes of data at a time (and a 1-byte block type at the start of the tape buffer), so words probably will break in the middle between REFILL operations

Plan is to use SOURCE and REFILL words, consistently do lazy-loading within INTERPRET, and probably vector the SOURCE and REFILL words to handle the different source types.

Re: screen editor design

Posted: Mon Jul 14, 2014 4:11 pm

by chitselb

Well. It's working. I can edit, save, load and run screens of code.

Re: screen editor design

Posted: Mon Jul 14, 2014 4:49 pm

by BigEd

Good work!

Re: screen editor design

Posted: Tue Jul 15, 2014 4:12 am

by chitselb

I put together a quick and dirty (emphasis on the latter) Ruby script to analyze the binary for duplicate strings exceeding some minimum length, and squeezed and refactored as much of that out as I could.

Code: Select all

#!/home/chitselb/bin/ruby
# suchen.rb
#
# Analyzes a hexdump of the latest build, searching for duplicate strings
# that might be factorable
#
minimum_size=13


all = File.binread("build/pettil.obj")

i = 0
(all.size-minimum_size).times do
    seek = all[i+=1, minimum_size]
    j = all.index(seek,i+1)
    if !j.nil?
        seek.each_byte {|b| print b.to_s(16).rjust(2,'0')," "}
        print   " ", (i+0x6bfc).to_s(16).rjust(4,'0'), \
                " ", (j+0x6bfc).to_s(16).rjust(4,'0'), "\n"   
    end
end

Re: screen editor design

Posted: Tue Jul 15, 2014 12:25 pm

by Brad R

Say, that Ruby script looks handy. I may use it myself. Thanks!

Re: screen editor design

Posted: Fri Jul 18, 2014 8:25 pm

by chitselb

I'm redoing FORGET (so that it will actually work properly with redefinitions) New blog post has many of the gory details, and commit 8293a0057d6c33f01b5bdd5f50fec4e99576eb19 at http://github.com/chitselb/pettil reflects the almost-there code at the time of this post. I'm going to do it all over again, because 371 total bytes of code (so far) is unacceptable IMHO.

In other news, I turn 52 tomorrow.

Re: screen editor design

Posted: Wed Jul 23, 2014 2:30 am

by chitselb

FORGET works! It reawakens formerly forgotten (but redefined) words, even if they're in the transient dictionary. I'm pretty sure I got it right.

At some point I need to address automated regression testing. What I was thinking is to just throw together a few lines or maybe even a screen of code for each word to give it a workout and check the results vs. what was expected. How do other Forths do this?

Re: screen editor design

Posted: Wed Jul 23, 2014 12:02 pm

by Brad R

For ANS Forth I use the automated test suite developed by John Hayes. The basic test engine is here and the set of tests for the ANS core is here. Some tests are specific to ANS Forth, but most are generally applicable.

I gather that there is a more comprehensive ANS test here, but I haven't tried it, and most of the additions seem to be for the extension word sets that you haven't implemented.