Tali Forth for the 65c02

adrianhudson · Post by **adrianhudson** » Sat Dec 31, 2022 11:06 pm

Fantastic! Thanks for making this change; it makes the disassembler much more useful.

(I have to say I am really impressed by Forth in general. It's fast and compact.)

SamCoVT · Post by **SamCoVT** » Sun Jan 01, 2023 4:09 pm

TLDR Version: I've updated the disassembler further with string literal support so it doesn't try to disassemble the string data anymore.

VILR (Very Interesting... Let's Read!) Version: Tali inlines string data by compiling a jump over the string data, then a call to the string literal runtime handler, and then two cells that contain the address and length of the string:

Code:

 Strings are compiled into the dictionary like so:
           jmp a
           <string data bytes>
  a -->    jsr sliteral_runtime
           <string address>
           <string length>

Tali used to try to disassemble the string data after the jump and would sometimes gack on it. To fix this, I needed to recognize the JMP/JSR sliteral_runtime pattern. When the disassembler encounters a JMP instruction, it peeks ahead at the jump destination to see if the 3 bytes there are the JSR sliteral_runtime instruction. If they match, it adjusts the current disassembly location to skip over the string data and continue at the JSR. There is then a special handler for JSR sliteral_runtime that prints the string address and length that follow it and skips over those as well.
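For anyone who wants to play with the idea from the Forth side, the peek-ahead test described above can be sketched in a few lines. This is only an illustration, not Tali's actual disassembler code (which is written in assembly). The word names `jsr-to?` and `string-jmp?` are made up for this sketch, and it assumes a 16-bit Forth where `@` can fetch a little-endian cell from any address, as on Tali.

```forth
\ Hypothetical helper words, not part of Tali Forth.
\ Assumes 16-bit cells and that @ works on unaligned addresses.

: jsr-to? ( target addr -- flag )   \ does addr hold "JSR target"?
   dup c@ $20 = if                  \ $20 is the JSR opcode
      1+ @ =                        \ compare the operand with target
   else
      2drop false
   then ;

: string-jmp? ( runtime addr -- flag )  \ JMP over an inlined string?
   dup c@ $4C = if                  \ $4C is the JMP absolute opcode
      1+ @ jsr-to?                  \ look at the jump destination
   else
      2drop false
   then ;
```

With the addresses from the session shown in this post (the string runtime at $A08A and the word starting at $813), `$A08A $813 string-jmp?` should leave a true flag. Both addresses will differ from build to build.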

I thought about printing the string data, but Tali supports very long strings and they are shown in the memory dump when using SEE, so the address and length should be good enough in the disassembly. Here is an example of the new behavior, including typing in the address and length and TYPEing one of the strings.

Code:

: teststrings s" This is a string literal" 2drop ." This is a printed string" ;  ok
see teststrings
nt: 800  xt: 813
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 78

0813  4C 2E 08 54 68 69 73 20  69 73 20 61 20 73 74 72  L..This  is a str
0823  69 6E 67 20 6C 69 74 65  72 61 6C 20 8A A0 16 08  ing lite ral ....
0833  18 00 20 D5 D7 E8 E8 E8  E8 4C 57 08 54 68 69 73  .. ..... .LW.This
0843  20 69 73 20 61 20 70 72  69 6E 74 65 64 20 73 74   is a pr inted st
0853  72 69 6E 67 20 8A A0 3F  08 18 00 20 DE A4  ring ..? ... ..

813    82E jmp
82E   A08A jsr     SLITERAL 816 18
835   D7D5 jsr     STACK DEPTH CHECK
838        inx
839        inx
83A        inx
83B        inx
83C    857 jmp
857   A08A jsr     SLITERAL 83F 18
85E   A4DE jsr     type
 ok
$816 $18 type This is a string literal ok

The 2DROP got inlined as the stack depth check and the INX instructions.

JimBoyd · Post by **JimBoyd** » Tue Jan 03, 2023 3:13 am

SamCoVT wrote:

Note that SEE changes the base to hex

Although I have RB to restore BASE for those words where I need to change it, I've followed advice given by Charles Moore and don't change BASE in my tools. If I want HEX output from SEE , or DUMP , I'll set BASE to HEX first.
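As a small illustration of the "leave BASE alone" approach, a tool that really does need hex output can save and restore BASE around its own printing. This is a generic sketch using only standard words. It is not JimBoyd's RB (which isn't shown in this thread), and `show-hex` is a made-up name.

```forth
: show-hex ( u -- )   \ print u in hex, leaving BASE as we found it
   base @ >r          \ remember the caller's base
   hex u.
   r> base ! ;        \ put it back

decimal 255 show-hex  \ prints FF
base @ .              \ prints 10: still decimal
```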

adrianhudson · Post by **adrianhudson** » Wed Mar 15, 2023 3:36 pm

Hello all.
I am struggling with some relatively simple math (maths in the UK).

I have a 16 bit number returned from a thermometer (sht31 if anyone is interested). It has to be converted using this formula
temperature = 175 * (rawtemp / 65535) - 45
I have come up with the following (which works - sort of - see below) Values are in hex.

: convertT ( rT -- T )
   445C ( 17500 decimal )
   m*
   swap drop
   1198 - ( 4500 decimal )
;
This gives results like (in decimal) 1950 for 19.50 degrees C - which is fine. This code is in effect doing the following:

temperature = (rawtemp ** 17500) - 4500, where ** is a multiply giving a 32-bit result whose lower 16 bits are thrown away to act as a division by 65536 (which is not 65535, but close!).

My problem is this:
The thermometer can give raw values of any magnitude from ~$0000 to ~$FFFF (corresponding to ~-45C up to ~130C). The Forth code above fails for inputs of $8000 and above, which is obviously to do with the sign of the number throwing off the m*.

An unsigned m* (UM*) would probably do the trick, but Tali Forth doesn't have such a thing.

Can anyone (better at Forth than me - which is almost anyone) help me, please?

GARTHWILSON · Post by **GARTHWILSON** » Wed Mar 15, 2023 8:46 pm

Are you sure? I think that in most Forths, M* is a secondary that uses the UM* primitive in its definition.

BTW, SWAP DROP can be shortened to NIP.

I might write:

Code:

: convertT ( rT -- T)  [ DECIMAL ]
   17500  UM*  NIP
    4500   -           [   HEX   ]       ;
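A quick sanity check of that arithmetic at the two ends and the midpoint of the raw range may be useful. The sketch below simply repeats the definition above so it can be pasted on its own. It assumes 16-bit cells (as on Tali), since it relies on NIP keeping only the upper 16 bits of the 32-bit UM* product. On a Forth with wider cells the results will be different.

```forth
decimal
: convertT ( rT -- T )   17500 um* nip  4500 - ;   \ same arithmetic as above

    0 convertT .   \ -4500   i.e. -45.00 C
32768 convertT .   \ 4250    i.e.  42.50 C  ($8000, where the signed M* goes wrong)
65535 convertT .   \ 12999   i.e. 129.99 C
```

With the signed M* instead, $8000 is treated as -32768, so the upper cell of the product comes out as -8750 rather than 8750 and the result is -13250.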

Let us know how it works out. The only thing I've done for temperature is an LM335 circuit run into my A/D converter. I've never measured humidity.

adrianhudson · Post by **adrianhudson** » Wed Mar 15, 2023 9:18 pm

oops. You are right, there is a UM* ~blush~

Works perfectly now. Why I decided there was no um* I don't know.

Thanks for the NIP!

JimBoyd · Post by **JimBoyd** » Tue Mar 28, 2023 12:46 am

Does Tali Forth still inline DO LOOPs?

SamCoVT · Post by **SamCoVT** » Tue Mar 28, 2023 5:55 pm

JimBoyd wrote:

Does Tali Forth still inline DO LOOPs?

It does, and those words have special treatment so that they are always native compiled (inlined) when used, even if native compiling has been disabled by the user. It's a bit of a bear at 70 bytes for the two words, and it's not fast either. Druzyek and leepivonka have both done some investigating into the speed (or lack thereof) of Tali's DO LOOPs and I also investigated using an 8-bit index (which results in loops that take about 1/4 of the time or 1/2 the time if using I in them) here. I haven't personally run into an issue with speed, so I haven't spent too much time focusing on the issues involved in making it better. Here is an example of all of the overhead of doing a DO LOOP. Anyone trying to follow the assembly needs to know that the ending value is subtracted from $8000 and the result is added to the starting "index" (this is sometimes called "fudge-factoring" the index) so that the oVerflow flag can be used to detect when to end the loop.

Code:

: test do loop ;  ok
see test 
nt: 800  xt: 80C 
flags (CO AN IM NN UF HC): 0 0 0 1 0 1 
size (decimal): 70 

080C  A9 08 48 A9 51 48 38 A9  00 F5 02 95 02 A9 80 F5  ..H.QH8. ........
081C  03 95 03 48 B5 02 48 18  B5 00 75 02 95 00 B5 01  ...H..H. ..u.....
082C  75 03 48 B5 00 48 E8 E8  E8 E8 20 E9 97 18 68 75  u.H..H.. .. ...hu
083C  00 A8 B8 68 75 01 48 98  48 E8 E8 70 03 4C 36 08  ...hu.H. H..p.L6.
084C  68 68 68 68 68 68  hhhhhh

80C      8 lda.#
80E        pha
80F     51 lda.#
811        pha
812        sec
813      0 lda.#
815      2 sbc.zx
817      2 sta.zx
819     80 lda.#
81B      3 sbc.zx
81D      3 sta.zx
81F        pha
820      2 lda.zx
822        pha
823        clc
824      0 lda.zx
826      2 adc.zx
828      0 sta.zx
82A      1 lda.zx
82C      3 adc.zx
82E        pha
82F      0 lda.zx
831        pha
832        inx
833        inx
834        inx
835        inx
836   97E9 jsr     1
839        clc
83A        pla
83B      0 adc.zx
83D        tay
83E        clv
83F        pla
840      1 adc.zx
842        pha
843        tya
844        pha
845        inx
846        inx
847      3 bvs
849    836 jmp
84C        pla
84D        pla
84E        pla
84F        pla
850        pla
851        pla
 ok
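The fudge-factor setup at the top of that listing (the SBC from $8000 followed by the ADC) can also be written out in Forth, which may be easier to follow than the assembly. This is only a sketch of the arithmetic with made-up word names (`fudge`, `fudged-i`). Tali's real DO is the inlined machine code shown above.

```forth
hex
: fudge ( limit start -- limit' index' )   \ what DO sets up
   swap 8000 swap -      \ limit' = $8000 - limit
   tuck + ;              \ index' = start + limit'

: fudged-i ( limit' index' -- i )          \ recover the real index, as I must
   swap - ;

A 0 fudge          \ leaves 7FF6 7FF6 for "10 0 DO" (decimal)
2dup fudged-i .    \ prints 0, the first value of I
2drop decimal
```

Starting from $7FF6, ten increments take the fudged index from $7FFF to $8000 on the last pass, which is the signed overflow that the BVS in the listing is waiting for.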

JimBoyd · Post by **JimBoyd** » Thu Apr 06, 2023 1:02 am

Fleet Forth's DO LOOP's also use the $8000 "fudge factor" so they end when an overflow is detected.
I asked about the inlining because I've been working on some ideas for an STC version of Fleet Forth and have versions of the DO LOOP words which do not get inlined. The added overhead for (DO) and (?DO) would be minimal since they only run once per loop. I also think the added overhead for (LOOP) and (+LOOP) would also be minimal, thanks to a suggestion from leepivonka.
Another benefit, there is a separate (LOOP) primitive compiled by LOOP .
STC versions of Fleet Forth's DO LOOP's.
The use of SUBR (subroutine) was so I could test these words with the current version of Fleet Forth, an ITC Forth.
I realize you may not want to change Tali Forth's DO LOOP's. If it's not broken, don't fix it; however, it might be worth trying if you have a really big application with lots of DO LOOP's and find memory getting a little tight.

SamCoVT · Post by **SamCoVT** » Fri May 17, 2024 8:34 pm

JimBoyd wrote:

I asked about the inlining because I've been working on some ideas for an STC version of Fleet Forth and have versions of the DO LOOP words which do not get inlined. The added overhead for (DO) and (?DO) would be minimal since they only run once per loop.
...
I realize you may not want to change Tali Forth's DO LOOP's. If it's not broken, don't fix it; however, it might be worth trying if you have a really big application with lots of DO LOOP's and find memory getting a little tight.

Patrick (pdragon here) has been working on Tali Forth's loops and has made them significantly smaller and faster. Indeed, moving to a non-inlined DO was worthwhile, but LOOP is still inlined. Patrick also added a change that holds (caches, really) the LSB of the current count in zero page, and moved the loop info off of the return stack (to 0x100 and growing upwards, but could be moved anywhere in RAM).

Patrick actually wrote three different alternative looping options and benchmarked them against each other. We ended up choosing the "Loop Control Block - Simple" (lcb simple) version because it offered the best speedup while still being relatively simple to implement. Tali's test suite has a cycle counter added to py65mon, so we were able to benchmark all three options and saw the following speedups (the numbers are 65C02 cycle counts (clocks)):

Code:

: do?word1 5 5 ?do loop ;
    original    322
    master      322     ; all about the same
    push/pull   325
    lcb cmplx   325
    lcb simple  325
    
: do?word2 100 0 ?do i drop loop ;  
    original   12836
    master      8384
*   push/pull   6052    ; -6 cycles for i 
    lcb cmplx   7218    ; speculatively incrementing i
    lcb simple  6658    
    
: doword 100 0 do loop ;
    original    6700
    master      2148
*   push/pull   1304    ; simple one-level loop
    lcb cmplx   2651
*   lcb simple  1310    ; about the same
        
: dowordi 100 0 do i drop loop ;
    original   12700
    master      8248
    push/pull   5904    ; -6 cycles for i
    lcb cmplx   7169
    lcb simple  6511
        
: dodoword 100 0 do 10 0 do loop loop ;
    original    90500
    master      44748
    push/pull   42704
    lcb cmplx   51318
*   lcb simple  33410   ; 9294/100 => 90 cycles better for nested loops
        
: dodowordij 100 0 do 10 0 do i drop j drop loop loop ;
    original    210500
    master      165748
    push/pull   149704  ; not a huge difference when all native compile
    lcb cmplx   158217
    lcb simple  156410  ; default J is a JSR; +6 cycles for i
*   lcb simple' 144410  ; forcing J native compile
    
: dodowordbigi 10 0 do 1024 0 do i drop loop loop ; 
    original    1282730
    master      822298
*   push/pull   587424  ; -6 cycles for xt_i
    lcb cmplx   700317  
    lcb simple  648060  ; diff 60636 = +61440 for `i`, 804 better otherwise
    
    
: doword+loop 100 0 do 5 +loop ;
    original    2420
    master      2418
*   push/pull   2291
    lcb cmplx   2601
*   lcb simple  2213    ; faster when step `<256`

Patrick has made a few more small optimizations since then, but you can see that there was a lot of improvement to be had in many loop configurations. If you are interested in the "Loop Control Block" scheme that Tali now uses, let me know and I can provide more details as to the inner workings.

JimBoyd · Post by **JimBoyd** » Tue May 21, 2024 10:48 pm

SamCoVT wrote:

If you are interested in the "Loop Control Block" scheme that Tali now uses, let me know and I can provide more details as to the inner workings.

It does sound interesting; however, the STC version of Fleet Forth has been put on a back burner for now and it could be a while before I can do anything with this information.

adrianhudson · Post by **adrianhudson** » Sun Mar 16, 2025 3:32 pm

I am doing a bit more FORTH programming. I have a word that may need to bail out with a message if it encounters an error. The thing is, it is 3 levels deep and I want it to continue execution at the first level. Normally, as I understand it, this is solved by THROW and CATCH, which Tali doesn't support. What is the best way of achieving this in Tali, please, anyone?

leepivonka · Post by **leepivonka** » Mon Mar 17, 2025 5:26 am

I can think of several alternatives:

If you know how to clean the parameter & return stacks:

Code:

Define a word:
    : RDrop ( R: addr -- ) \ discard a return stack entry
      [ $68 c, $68 c,  \ pla; pla
      ] ; always-native compile-only
Use RDrop as needed to remove return address cells from the return stack.
Use Drop as needed to remove cells from the parameter stack.

If you want to return all the way back to the FORTH interpreter:

Code:

Use Quit or Abort or Abort" to end application execution

Use returned flags to indicate word execution should be cut short:

Code:

A called word would return a flag or set a global variable to indicate callers should quit.
The word's caller would test the flag & cut processing short as appropriate.

Or start implementing Throw & Catch for Tali Forth.
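Of the alternatives above, the returned-flag approach is the easiest to show in a few lines of plain Forth. Here is a minimal sketch for a three-level call chain. The words `level1`, `level2` and `level3` and the "negative input" error condition are invented for the example. Only standard words are used, so nothing here depends on Tali internals.

```forth
: level3 ( n -- n' flag )      \ flag is true when something went wrong
   dup 0< if
      ." negative input" cr  true
   else
      2*  false
   then ;

: level2 ( n -- n' flag )
   level3 dup if exit then     \ failure: hand the flag straight up
   drop 1+ false ;             \ success: carry on with level 2's own work

: level1 ( n -- )
   level2 if
      drop ." recovered at level 1" cr
   else
      . cr
   then ;

 5 level1    \ prints 11
-1 level1    \ prints "negative input" then "recovered at level 1"
```

The cost is that every intermediate word has to test and pass on the flag, which is exactly the bookkeeping CATCH and THROW would otherwise do for you.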

adrianhudson · Post by **adrianhudson** » Mon Mar 17, 2025 3:30 pm

@leepivonka
Many thanks!
"Or start implementing Throw & Catch for Tali Forth"
If only I knew how!

barrym95838 · Post by **barrym95838** » Mon Mar 17, 2025 5:01 pm

I immediately thought of using in-line RDROP(s) to unnest the return stack but I was too timid to post, since I'm a stark beginner in programming FORTH. I know a lot about the internal nuts and bolts but I've never taken it for a serious test drive.
