kc5tja wrote:
I'm getting pretty sick and tired of all the mis-information that exists about Forth.
Getting laughed at for using Forth comes with the territory, unfortunately. A lot of people are less than receptive to Forth, but at least they don't laugh as hard once they see the sort of fast and powerful programs that can be written with it.
kc5tja wrote:
It's also a fallacy that if a language is interpreted, it must necessarily be slow.
True, but most interpreted languages (e.g. BASIC) are/were optimized for space, not for speed. When speed became an issue, they were compiled and the interpreter was ditched. If you look at the heyday of interpreted languages, fast interpretation wasn't exactly a popular approach (e.g. thanks to FIG-Forth -- and the Forth standards -- indirect threading was common on the 6502, even though subroutine threading is much faster, and in many ways simpler -- well, it's not interpreted either, so this isn't a perfect example), and common misconceptions like this arise.
kc5tja wrote:
The representation of the compiled software can range from a vector of addresses (where each address represents another Forth word), all the way up to natively generated machine language. I often have to wonder why BASIC interpreters of old didn't represent their programs in this manner -- it sure would have made them run so very much faster.
Early BASIC interpreters were written with two dominant restrictions: (a) a small amount of ROM to work with (maybe 8k, maybe only 4k), and (b) a small amount of RAM to work with (sometimes as little as 1k, though 4k was more common) to work with. With minimal space as the driving force, the result was simple parsers that don't optimize (much) and tokenized programs. Native code did eventually happen -- with BASIC compilers.
I've been jotting down ideas for a fast interpreted BASIC that I haven't started writing yet. In terms of major 6502 projects, right now it's behind at least my write up of 6502 decimal mode (including undocumented behavior), which I am currently working on (but isn't getting any closer to being completed now that I'm writing long-winded forum posts), and finishing my functional, but incomplete and undocumented 6502 subtroutine threaded Forth.
My BASIC was partly inspired by Apple (Integer) BASIC, which was written by Steve Wozniak. Wozinak used ideas from HP BASICs (he had worked at HP), and of course HP was well-known for their stack-oriented RPN calculators. "Integer" BASIC is really not all that far away from the address interpreter you suggest. It uses a sophisticated parser which does all of it's syntax checking at parse time rather than execution time. It also converts constants at parse time to their twos complement reprepresentation rather than storing them as ASCII. Therefore, unlike many BASICs (e.g. EhBASIC and Applesoft) you never got a SYNTAX ERROR at run time, and you weren't converting constants (e.g. the statment PRINT 99) or line numbers (e.g. the statement IF A < B THEN 10) each and every time they were encountered.
There were other things that could have been handled at parse time, but weren't, such as checking for too many parentheses in an expression, and determining whether you needed the address of a variable so you could store a value in it (e.g. a LET or an INPUT statement) or if only the contents of the variable were needed (e.g. PRINT A). (Determining the latter at parse time would have sped up expression evaluation considerably.)
To use Wozniak's terms, the tokenized program contained "nouns" and "verbs". Nouns were constants (numbers), strings literals, and variable names, and verbs were basically everything else. In Forth terms, an example of a compiled literal (in assembly) is DW LIT,10. LIT would be the verb and 10 would be the noun.
The BASIC program was stored in infix order. All verbs took some action (although in some cases the action was to do nothing). Different tokens could represent the same keyword (verb). For example, there were three tokens that would list as PRINT. One token was for PRINTing numbers, one for strings, and one for CRs (i.e. when PRINT had no arguments). In Forth, these three actions have three different names: . (dot) TYPE and CR, but in BASIC they all had the name PRINT. Which token to use was determined at parse time based on syntax and context, so the parser was quite clever. Interestingly, even delimiters such as commas and semicolons were verbs. So a statement like PRINT 12+34;56 would would be stored as:
Code:
DB PRINT_number_token
DB literal_number_token
DW 12
DB plus_token
DB literal_number_token
DW 34
DB semicolon_print_number_token
DB literal_number_token
DW 56
The tokens for PRINT and semicolon were different (but this was only so that LIST knew which one to print), but used the exact same routine (i.e. both went to the same address). The Forth equivalent, 12 34 + . 56 . (usually) compiles to DW LIT,12,LIT,34,PLUS,DOT,LIT,56,DOT. Forth can just execute "verbs" as it encounters them, but since the BASIC program is in infix order, the (inner) interpreter must check precedence to determine if it can execute a verb.
Incidentally, there were other examples of multiple tokens for a single keyword. An equals sign (=) had one token for a numeric comparison (e.g. IF A=1 THEN PRINT), a different token for a string comparison (e.g. IF A$="A" THEN PRINT), an assignment statement (e.g. LET A=1), etc. There are over a dozen tokens that LIST as a comma, since the comma in a POKE 0,0 statement takes a different action than the comma in a NEXT A,B statement, etc.
You may have already guessed one idea I've got: use a sophisticated parser AND a sophisticated LIST. The program would be entered in infix order, stored in postfix order, and LISTed in infix order. Thus, the parser would convert infix to postfix, and LIST would convert postfix back to infix. By moving precedence checking from the (inner) interpreter to the parser and to LIST, the interpreter will be very fast (and Forth-like).
In Forth, to insert code, you have to recompile everything after the insertion point, since the addresses of any subsequent words will change. In BASIC, I'd like to retain the ability to insert a line painlessly, so I am planning to start with tokens rather than addresses. It's possible to write a token interpreter that compares favorably to an addresses interpreter, in terms of speed.
I've gathered enough other ideas (e.g. I like the Acorn assembler idea, but I'm planning to try a one-pass assembler instead) that intrigue me that I plan to write this BASIC at some point. Obviously, the plans I've mentioned are subject to change once I start writing and discover what the tradeoffs are.
kc5tja wrote:
The disadvantage of this method is that it consumes slightly more space. But, this is a classic size/speed trade-off issue, and the ratio isn't that bad in practice.
Another Forth misconception! In terms of space, it is not necessarily true that subroutine threading > direct threading > indirect threading.
The code definition example: CODE z NOP, NEXT, END-CODE
Indirect threading (6 bytes):
Code:
z: DW z+2
NOP
JMP NEXT
Direct threading (4 bytes):
Code:
z: NOP
JMP NEXT
Subroutine threading (2 bytes):
Code:
z: NOP
RTS
So for code definitions, indirect threading is always the largest, and subroutine threading is always smallest (and fastest, to boot). Subroutine threading also seems to encourage writing more code definitions, so there's quite a bit a space savings that can be had.
The colon definition example: : z a b c d ;
Indirect threading (12 bytes):
Code:
z: DW DO_COLON
DW a
DW b
DW c
DW d
DW EXIT
Direct threading (13 bytes):
Code:
z: JSR DO_COLON
DW a
DW b
DW c
DW d
DW EXIT
Subroutine threading (12 bytes):
Code:
z: JSR a
JSR b
JSR c
JMP d
For a shorter definition, subroutine threading will take the least amount of space. The colon definition example: : z a b c d e ;
Indirect threading: 14 bytes
Direct threading: 15 bytes
Subtroutine threading: 15 bytes
For anything longer, yes, subtroutine threading will take the most space, but good Forth programming practice dictates that definitions should be short, so the space penalty of subroutine threading is smaller than you might think.
You can also save space (and increase speed) with subroutine threading in some cases by inlining assembly, rather than writing a single-use code definition. In subroutine threading, with no NEXT routine, the Y register is usually free (I've even used it as a loop counter, to get a terrific benefit in both speed and space).
When I decided to write a subroutine threaded Forth, I thought I was going to get clobbered in terms of space. But that hasn't been the case so far. Now, if I could just bring myself to use postfix assembly with subroutine threading, rather than the more-legible (at least to me) prefix assembly...