various PETTIL design considerations

barrym95838 · Post by **barrym95838** » Fri Jul 04, 2014 5:26 pm

The subject is a bit too advanced for me to comment on its feasibility, but maybe I could just offer the following:

How much effort and time would you need to explore this possibility far enough to form a solid opinion, and are you prepared to throw that effort and time away if it turns into a dead end?

It boils down to a "risk vs. benefit" type situation. I would be tempted to go for it, but your personal situation may be different than mine.

Best wishes either way,

Mike

chitselb · Post by **chitselb** » Sun Jul 06, 2014 5:07 am

All I'm doing here is reworking INTERPRET so that it reads from one of four sources:
1) "TIB" terminal input buffer, an 80 character buffer that starts at TIB with SPAN characters in it from the last time the user hit return, and a cursor >IN that counts from 0 to 79
2) "BLOCK" 1024 characters that can be treated like a single continuous line

so far, nothing new.

3) "SCREEN" 1003 character structure that starts with 24 bits of linewrap followed by 1000 characters of Commodore screencodes (where "A" = $01, not $41, etc.). This is a tricky one. The screen is really nothing more than a block, tagged as a screen by a bit of metadata in the packet header. It is to be parsed as a collection of 40/80 column lines per the linewrap bits, with an implied carriage return (whitespace) after each line.
4) "FILE" sequential file of ASCII text, parsed as a continuous stream until EOF. Another tricky one, because tape loads 191 bytes of data at a time (plus a 1-byte block type at the start of the tape buffer), so words will probably break in the middle between REFILL operations.
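One possible reading of the SCREEN layout in item 3, sketched in Ruby like the script later in this thread. The bit order of the linewrap field, the meaning of a set bit (row n+1 continues row n), and all method names are assumptions made for illustration, not PETTIL's actual format.

```ruby
# Hypothetical model of a SCREEN packet: 3 linewrap bytes + 1000 screencodes.
COLS = 40
ROWS = 25

def screencode_to_ascii(code)
  code &= 0x7f                           # drop the reverse-video bit
  return code + 0x40 if code < 0x20      # $00-$1F -> "@", "A".."Z", "[" ...
  return code if code < 0x40             # $20-$3F match ASCII already
  "?".ord                                # graphics characters: out of scope here
end

# ASSUMPTION: bit n (LSB first across the 3 bytes) set means row n+1
# is the second half of an 80-column logical line started on row n.
def wrapped?(linewrap, row)
  return false if row.zero?
  bit = row - 1
  ((linewrap[bit / 8] >> (bit % 8)) & 1) == 1
end

# Turn the packet into logical lines of 40 or 80 columns.
def screen_to_lines(packet)
  bytes = packet.bytes
  linewrap = bytes[0, 3]
  cells = bytes[3, COLS * ROWS]
  lines = []
  ROWS.times do |row|
    text = cells[row * COLS, COLS].map { |c| screencode_to_ascii(c).chr }.join
    if wrapped?(linewrap, row)
      lines[-1] += text                  # continuation of the previous row
    else
      lines << text
    end
  end
  lines.map(&:rstrip)
end
```

Joining the result with `lines.join(" ")` would model the implied carriage return (whitespace) after each logical line.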

Plan is to use SOURCE and REFILL words, consistently do lazy-loading within INTERPRET, and probably vector the SOURCE and REFILL words to handle the different source types.
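A minimal Ruby sketch of that plan, assuming every source type can be hidden behind a single `refill` call. The class names and `words_from` are hypothetical. The part worth showing is how a word that straddles two tape chunks gets carried over to the next REFILL.

```ruby
# Hypothetical sketch: each source answers refill, returning the next
# piece of text or nil when the input is exhausted.
class LineSource              # stands in for TIB or a screen: one line per refill
  def initialize(lines)
    @lines = lines.dup
  end

  def refill
    line = @lines.shift
    line && line + " "        # implied whitespace after each line
  end
end

class ChunkSource             # stands in for a tape file: fixed-size data chunks
  def initialize(text, chunk_size = 191)
    @text = text
    @chunk_size = chunk_size
    @pos = 0
  end

  def refill
    return nil if @pos >= @text.size
    chunk = @text[@pos, @chunk_size]
    @pos += @chunk_size
    chunk
  end
end

# Pull whitespace-delimited words from any source, refilling only when the
# buffered text runs out and keeping an unfinished word for the next chunk.
def words_from(source)
  Enumerator.new do |out|
    pending = ""
    while (chunk = source.refill)
      pending += chunk
      fields = pending.split(/\s+/, -1)   # -1 keeps a trailing empty field
      pending = fields.pop || ""          # last field may be half a word
      fields.each { |w| out << w unless w.empty? }
    end
    out << pending unless pending.empty?
  end
end
```

With a 5-byte chunk size, `words_from(ChunkSource.new(": SQUARE DUP * ;", 5)).to_a` still yields whole words, even though `SQUARE` arrives in two pieces.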

chitselb · Post by **chitselb** » Mon Jul 14, 2014 4:11 pm

Well. It's working. I can edit, save, load and run screens of code.

BigEd · Post by **BigEd** » Mon Jul 14, 2014 4:49 pm

Good work!

chitselb · Post by **chitselb** » Tue Jul 15, 2014 4:12 am

I put together a quick and dirty (emphasis on the latter) Ruby script to analyze the binary for duplicate strings exceeding some minimum length, and squeezed and refactored as much of that out as I could.

Code:

#!/home/chitselb/bin/ruby
# suchen.rb
#
# Scans the binary of the latest build, searching for duplicate byte strings
# that might be factorable
#
minimum_size = 13       # shortest duplicate worth reporting, in bytes
base = 0x6bfc           # added to file offsets so they print as addresses

all = File.binread("build/pettil.obj")

# slide a minimum_size window over every offset, starting at offset 0
(0..all.size - minimum_size).each do |i|
    seek = all[i, minimum_size]
    j = all.index(seek, i + 1)      # a later occurrence of the same bytes
    next if j.nil?
    seek.each_byte {|b| print b.to_s(16).rjust(2,'0'), " "}
    print " ", (i + base).to_s(16).rjust(4,'0'),
          " ", (j + base).to_s(16).rjust(4,'0'), "\n"
end

Brad R · Post by **Brad R** » Tue Jul 15, 2014 12:25 pm

Say, that Ruby script looks handy. I may use it myself. Thanks!

chitselb · Post by **chitselb** » Fri Jul 18, 2014 8:25 pm

I'm redoing FORGET (so that it will actually work properly with redefinitions). A new blog post has many of the gory details, and commit 8293a0057d6c33f01b5bdd5f50fec4e99576eb19 at http://github.com/chitselb/pettil reflects the almost-there code at the time of this post. I'm going to do it all over again, because 371 total bytes of code (so far) is unacceptable IMHO.

In other news, I turn 52 tomorrow.

chitselb · Post by **chitselb** » Wed Jul 23, 2014 2:30 am

FORGET works! It reawakens formerly forgotten (but redefined) words, even if they're in the transient dictionary. I'm pretty sure I got it right.

At some point I need to address automated regression testing. What I was thinking is to just throw together a few lines or maybe even a screen of code for each word to give it a workout and check the results vs. what was expected. How do other Forths do this?

Brad R · Post by **Brad R** » Wed Jul 23, 2014 12:02 pm

For ANS Forth I use the automated test suite developed by John Hayes. The basic test engine is here and the set of tests for the ANS core is here. Some tests are specific to ANS Forth, but most are generally applicable.

I gather that there is a more comprehensive ANS test here, but I haven't tried it, and most of the additions seem to be for the extension word sets that you haven't implemented.

chitselb · Post by **chitselb** » Wed Jul 23, 2014 4:52 pm

Thanks, that's very useful! In PETTIL, underflowing the stack first poops on the I do-loop index, then the ILIM do-loop limit, and then the User Pointer followed by the NEXT routine (usually with disastrous results). I notice in here that he just pushes zeroes on the stack to bring it back. Conceptually, it looks like it's a matter of comparing expected stack results vs. actuals, and that's a pretty good start.
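The "expected vs. actual stack" idea can be sketched in a few lines of Ruby. The tiny stack machine below is a stand-in, not PETTIL, and `run`, `check` and `summarize` are hypothetical names. Each test gets a fresh stack, and an underflow is reported as a failure instead of corrupting anything.

```ruby
# Hypothetical mini stack machine, only here to demonstrate the test idea.
WORDS = {
  "+"    => ->(s) { b = s.pop; a = s.pop; s.push(a + b) },
  "*"    => ->(s) { b = s.pop; a = s.pop; s.push(a * b) },
  "DUP"  => ->(s) { s.push(s.last) },
  "SWAP" => ->(s) { b = s.pop; a = s.pop; s.push(b, a) },
  "DROP" => ->(s) { s.pop }
}

# Interpret a line of words and numbers, returning the final stack.
def run(source)
  stack = []
  source.split.each do |token|
    if WORDS.key?(token)
      WORDS[token].call(stack)
    else
      stack.push(Integer(token))
    end
  end
  stack
end

# One test case: run the code on a fresh stack and compare with the expectation.
def check(source, expected)
  actual = run(source)
  { source: source, pass: actual == expected, actual: actual }
rescue StandardError => e
  { source: source, pass: false, actual: e.class.name }   # e.g. stack underflow
end

# One summary line, so results do not scroll off the screen.
def summarize(results)
  passed = results.count { |r| r[:pass] }
  "#{passed}/#{results.size} passed"
end
```

Collecting the result hashes and printing only the failures plus the `summarize` line is one way to keep the output to a screenful.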

chitselb · Post by **chitselb** » Wed Jul 23, 2014 9:50 pm

That was easy! Now I just have to write piles and piles of tests, and figure out some way to capture and/or summarize the output so it won't just go scrolling off the screen.

For my next trick, I'm revisiting vocabulary search order. The DPANS document http://lars.nocrew.org/dpans/dpansa16.htm has me a little confused, but the word set itself at http://lars.nocrew.org/dpans/dpans16.htm makes some sense. My scheme is to set the "V" bit and append a Vocabulary ID(tm) byte to names that are in a vocabulary other than FORTH. Vocabulary IDs start with 0x01, which will be the ASSEMBLER vocabulary in PETTIL. As far as search order, I'm a little up in the air and open to (pleading for) suggestions.

Within FIND I'd like there to be some byte array, maybe 10 or 12 bytes, with a list of vocabularies to search through. FIND would process these in left to right order. The last one would always be 0x00 for the core FORTH word set. Maybe there is value in being able to search FORTH before searching vocabularies? Having the zero come last would make it easier for FIND to know that it's done. There would be a parent-child relationship among vocabularies, with FORTH being the root parent of all vocabularies. Here's an imaginary use-case:

VOCABULARY LATIN
LATIN DEFINITIONS
: EPLURIBUSUNUM ;
VOCABULARY PIGLATIN
PIGLATIN DEFINITIONS
: IXNAY ;
FORTH DEFINITIONS
VOCABULARY FRENCH
FRENCH DEFINITIONS
: FROMAGE ;
: OUI ;
IXNAY not found
FROMAGE ok
DUP DROP ok ( forth is accessible from everywhere, because it gets searched last )
PIGLATIN not found
FORTH PIGLATIN not found
LATIN ok
EPLURIBUSUNUM ok
IXNAY not found
PIGLATIN ok
IXNAY ok
EPLURIBUSUNUM ok
FROMAGE not found
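The use-case above can be modelled in Ruby to check that a plain parent chain is enough to produce it. This is only a sketch under assumptions: the class, its method names, and the idea that a vocabulary word lives in whichever vocabulary was current when it was created are all illustrative choices, not settled PETTIL design.

```ruby
# Hypothetical model: each vocabulary has a one-byte ID and a parent; FORTH is 0x00.
class Vocabularies
  FORTH = 0x00

  def initialize
    @parent  = { FORTH => nil }                 # vocabulary ID -> parent ID
    @words   = Hash.new { |h, k| h[k] = {} }    # vocabulary ID -> { name => action }
    @context = FORTH                            # where FIND starts searching
    @current = FORTH                            # where new definitions go
    @next_id = 0x01
    %w[DUP DROP].each { |name| define(name) }
    define("FORTH") { @context = FORTH }
    define("DEFINITIONS") { @current = @context }
  end

  # ": NAME ;" in the current vocabulary
  def define(name, &action)
    @words[@current][name] = action || -> {}
  end

  # "VOCABULARY NAME": a child of the current vocabulary, defined inside it
  def vocabulary(name)
    id = @next_id
    @next_id += 1
    @parent[id] = @current
    define(name) { @context = id }
    id
  end

  # The byte list FIND would walk left to right; it always ends in 0x00.
  def search_order
    order = []
    id = @context
    until id.nil?
      order << id
      id = @parent[id]
    end
    order
  end

  def interpret(name)
    search_order.each do |id|
      action = @words[id][name]
      next unless action
      action.call
      return "ok"
    end
    "not found"
  end
end
```

In this model the search order after the final `PIGLATIN` is `[2, 1, 0]`, i.e. PIGLATIN, LATIN, FORTH, so the terminating zero falls out of the parent chain without being stored separately.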

chitselb · Post by **chitselb** » Tue Jun 02, 2015 3:16 am

It has been a while since I visited this thread. The as-of-right-now commit of http://github.com/chitselb/pettil is pretty stable, after spending way too much time rewriting BLOCK and UPDATE and a foray into +LOOP , but those are stable too, as is the editor. Yes, even the editor which has been broken for about a month. Time to revisit <BUILDS ... DOES> ...

There are two separate dictionary spaces, "core" and "transient" with a symbol table "symtab". Once all the editing, assembling, interpreting, compiling (in short, anything that touches the symbol table) are done with, only "core" will remain, currently residing from $0400-$1B1F. Virtual memory (BLOCK / UPDATE / LOAD-BUFFERS / SAVE-BUFFERS) will stick around in "core" and everything above that will become room for application code and data.

Without a symbol table, CREATE is pretty worthless, so it lives in the transient dictionary. I'd like for all the code in between <BUILDS and DOES> to also go up in transient, something like

Code:

USER TDP ( transient dictionary pointer, was initialized at cold start )
: @SWAP!  ( var1 var2 -- )
    2DUP 2>R @ SWAP @ R> !  R> ! ;
: NEW-PARENT-WORD  ( address value == ) ( -- )
 <BUILDS  , ,  DOES> DUP @ SWAP 2+ @ EXECUTE ; ( or something similar )
: DRAW   ( value -- ) ... code to draw something ;
' DRAW 42 NEW-PARENT-WORD CHILD-WORD

"<BUILDS" is immediate and performs a DP TDP @SWAP! to target the transient dictionary.
the " , ,XT " code is then compiled in the transient dictionary
"DOES>" is also immediate, switches the dictionaries again, the rest of the code compiles to core dictionary
the " DUP @ SWAP 2+ @ EXECUTE " runtime behavior of all child words is compiled once in the core dictionary at newparentwordruntime
The top of core would now look something like:

Code:

newparentwordruntime
    .word dup  ( jsr enter is not required or compiled for DOES> blocks. dodoes points IP to here )
    .word fetch
    .word swap
    .word twoplus 
    ...etc...
    .word execute
    .word exit  ( exits the child word )

child-word
    jsr dodoes ( CFA )
    .word newparentwordruntime ( PFA+0 )
    .word 42  ( PFA + 2 )
    .word draw  ( PFA + 4)

This means <BUILDS and DOES> would both be immediate, to switch dictionaries when NEW-PARENT-WORD is compiled. This feels messy. I also don't like using PFA+0 to store the address of the child word's runtime behavior, because that is something that probably belongs in the child word CFA.

It would be nice if CREATE ... DOES> also worked, but in a more vanilla way, without the target dictionary swapping. These behave the same, with the difference being where the " , , " is compiled. For 2KONSTA it goes in TDICT, at TDP, and for 2KONSTB it goes in core, at HERE .

Code:

: 2KONSTA <BUILDS , , DOES> 2@ ;
: 2KONSTB CREATE , , DOES> 2@ ;

DOES> would only swap dictionaries when DP @ is greater than TDICT @ (pointer to the base of the transient dictionary, presently $6C00 on a 32K PET). I suppose it would be possible to associate a list of behaviors (like Java's class methods) with a parent/defining word and have each child word's PFA+0 point to the base of that list, with the remainder of the child's PFA behaving like Java's instance fields.
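To make the conditional swap concrete, here is a small Ruby model of the two dictionary pointers. It is a sketch under assumptions: the $6C00 base and the $1B20 start of free core come from the figures in this post, while the $6C40 transient pointer, the class, and its method names are made up for illustration.

```ruby
# Hypothetical model of DP and TDP; only the pointer bookkeeping is simulated.
class Dictionaries
  TDICT = 0x6C00                   # base of the transient dictionary (32K PET)
  attr_reader :memory, :dp

  def initialize(core_here: 0x1B20, transient_here: 0x6C40)
    @dp = core_here                # where "," compiles right now
    @tdp = transient_here          # the parked pointer of the other dictionary
    @memory = {}                   # address -> 16-bit cell
  end

  def swap_pointers                # the DP TDP @SWAP! step
    @dp, @tdp = @tdp, @dp
  end

  def targeting_transient?
    @dp > TDICT
  end

  def comma(value)                 # "," compiles one cell at DP
    @memory[@dp] = value
    @dp += 2
  end

  def builds                       # <BUILDS: retarget to the transient dictionary
    swap_pointers unless targeting_transient?
  end

  def does                         # DOES>: swap back only if DP is up in transient
    swap_pointers if targeting_transient?
  end
end
```

Running `builds`, two `comma` calls, `does`, and one more `comma` puts the first two cells at $6C40 and $6C42 and the third at $1B20, which is the 2KONSTA case. Skipping `builds` (the CREATE path of 2KONSTB) leaves all three cells in core.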

That's where the design is for now, tormenting me in my dreams and waking hours. All feedback (even "what a stupid idea! and here's why...") will be welcome.

chitselb · Post by **chitselb** » Wed Jun 03, 2015 1:12 am

Today I went through all the PETTIL assembler code and ensured that every secondary and 'JSR word' (it's direct-threaded) would have the CFA pointing directly at the JSR instruction, instead of the NOP or two NOPs that precede it when page alignment is required. I also rewrote ": (CREATE) ( cfa -- ) ;" to do the same. That means if I do CREATE FOO FOO 2- (or 3-) I can change that first instruction or the address at run time, which opens up the possibility of late binding. I'm happy with that.
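What "change the address at run time" amounts to can be shown with a byte array standing in for PET memory. This is a Ruby sketch, not PETTIL code: the method names and the addresses are hypothetical, while the JSR opcode ($20) and the low-byte-first operand are standard 6502.

```ruby
JSR = 0x20   # 6502 opcode for JSR absolute

# Assemble "JSR target" at cfa; returns the PFA, the address just past it.
def compile_jsr(memory, cfa, target)
  memory[cfa]     = JSR
  memory[cfa + 1] = target & 0xff          # low byte first (little-endian)
  memory[cfa + 2] = (target >> 8) & 0xff
  cfa + 3
end

# Read back the JSR operand that sits at PFA-2.
def jsr_target(memory, pfa)
  memory[pfa - 2] | (memory[pfa - 1] << 8)
end

# Late binding: given a word's PFA, overwrite the JSR operand in place.
def rebind(memory, pfa, new_target)
  raise ArgumentError, "CFA does not hold a JSR" unless memory[pfa - 3] == JSR
  memory[pfa - 2] = new_target & 0xff
  memory[pfa - 1] = (new_target >> 8) & 0xff
end
```

The guard in `rebind` only works because the CFA now points at the JSR itself. With NOP padding in front, PFA-3 would still be the JSR, but the CFA would not be PFA-3 any more.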

chitselb · Post by **chitselb** » Tue Aug 11, 2015 12:27 am

I don't mean to brag, I don't mean to boast, but I like hot butter on my breakfast toast. Also, I recoded the Ragsdale assembler and got it to fit in 569 (decimal) bytes of code plus 576 bytes of symbol table, total 1145 bytes.

EDIT: oops. ouch. denied. 767 code + 648 symbol after adding BEGIN, UNTIL, IF, THEN, ELSE, AGAIN, WHILE, REPEAT, NOT CS 0= 0< >= VS words

chitselb · Post by **chitselb** » Fri Aug 21, 2015 7:47 pm

A little research shows that the rewrite of Ragsdale's assembler was still a huge win in memory. PETTIL's words { doasmmode docpu INSTR, DECODE M/CPU } total 128 bytes of code, and are headerless (occupy no symbol table space). In Blazin' Forth's assembler, the words { M/CPU UPMODE CPU } take up 251 bytes, which includes the word names, link fields and code field addresses, as Blazin' is ITC. That is a net savings of 123 bytes for rewriting the assembler in 6502 assembler.
