PostPosted: Sat Jan 26, 2019 1:46 am 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
SEE is defined on some Forth systems to reconstruct ( or try to reconstruct) the source of a Forth word. This seems to me to be a cute parlor trick, especially if the Forth source is readily available.
I feel that SEE can be a useful tool if, instead of attempting to show the source of a Forth word, it shows what actually got compiled. Here are my criteria for a useful SEE:
1 Display what was compiled for Forth words and code words.
2 Recognize the built in data types ( VARIABLE , CONSTANT , 2VARIABLE , 2CONSTANT , and USER ) and display their values.
3 Display a DEFERed word's vector ( the word a DEFERed word is set to execute) .
4 Transition seamlessly from decompiling/disassembling High level code to Low level and back. For example:
Code:
SEE 2CONSTANT
2CONSTANT
 2F0E CREATE
 2F10 ,
 2F12 ,
 2F14 DOES
 2F16 2429    JMP ' DOES> >BODY 15 +
 2F19 2@
 2F1B EXIT
 OK

SEE CONSTANT
CONSTANT
 23CF CREATE
 23D1 ,
 23D3 DOES
 23D5    2  # LDY
 23D7   FE )Y LDA  W
 23D9         PHA
 23DA         INY
 23DB   FE )Y LDA  W
 23DD  838    JMP PUSH
 OK

SEE DEFER
DEFER
 23A6 CREATE
 23A8 LIT 237C
 23AC ,
 23AE DOES
 23B0    2  # LDY
 23B2   FE )Y LDA  W
 23B4         PHA
 23B5         INY
 23B6   FE )Y LDA  W
 23B8   FF    STA  W 1+
 23BA         PLA
 23BB   FE    STA  W
 23BD    0  # LDY
 23BF   FD    JMP  W 1-
 OK

Notice that in these examples DOES , the primitive compiled by DOES> and ;CODE , sets the code field of the latest word in the current vocabulary to point to the code following DOES and exits. These defining words do not transition from high level code to low level, but the decompiler does.
5 SEE is just a wrapper word for (SEE) so (SEE) can take a CFA from the data stack.
Code:
: SEE  ( -- )  ' (SEE) ;

6 Know when to stop decompiling and when, or if, it should transition from high or low level to the other.
7 Defined with helper words :DIS and DIS that decompile : ( colon) definitions and code definitions respectively. In some situations, the ones that 'trip up' SEE , it is useful to decompile a Forth word starting on a word boundary inside the body of the definition with :DIS , or to disassemble a low level word starting on an instruction boundary with DIS . This example is a partial decompilation of FIND , the high level word that uses (FIND) and VFIND , the find primitives.
Code:
2091 :DIS
 2091 CURRENT
 2093 @
 2095 VFIND
 2097 EXIT
 OK

The full disassembly of FIND for comparison:
Code:
SEE FIND
FIND
 2087 CONTEXT
 2089 @
 208B (FIND)
 208D ?DUP
 208F ?EXIT
 2091 CURRENT
 2093 @
 2095 VFIND
 2097 EXIT
 OK

8 Include a word like DONE? in the definition of the decompiler(s). It can pause the listing and wait when a key is pressed, or return with a flag. It is possible SEE will encounter a situation that makes it keep decompiling because none of the stop criteria are met, especially if it's used to examine the code of a new construct like DOER/MAKE .
Code:
: DONE?  ( -- F )
   ?KEY DUP 0EXIT
   3 = ?DUP ?EXIT
   KEY 3 = ;

DONE? is used like this:
Code:
: SOME.FORTH.WORD
   BLAH BLAH BLAH
   BEGIN
      BLAH BLAH BLAH
      TEST.CONDITION DONE? OR
   UNTIL
   BLAH BLAH ;

The loop in SOME.FORTH.WORD keeps running until the test condition is true or the C64's STOP key is pressed ( DONE? returned TRUE, we're done). If no key has been pressed, DONE? returns FALSE immediately. If any other key is pressed, DONE? pauses and waits for another key press. If that key is the STOP key, a TRUE flag is returned, otherwise FALSE is returned and the loop resumes.
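
For readers without Fleet Forth, here is a minimal sketch of the same behavior in standard Forth. It assumes the standard KEY? and KEY can stand in for ?KEY , and that IF ... EXIT THEN can stand in for 0EXIT and ?EXIT . STOP-KEY and COUNT-UP are names made up for this illustration; the value 3 is the code the definition above compares against.
Code:
```forth
\ Sketch only, not the Fleet Forth source.
3 CONSTANT STOP-KEY   \ key code tested by the original DONE?

: DONE?  ( -- flag )
   KEY? 0= IF  FALSE EXIT  THEN        \ nothing typed: not done
   KEY STOP-KEY = IF  TRUE EXIT  THEN  \ STOP typed: done
   KEY STOP-KEY = ;                    \ other key: pause for one more key

\ Hypothetical listing loop: prints numbers until the limit or STOP.
: COUNT-UP  ( limit -- )
   0 BEGIN
      DUP .  1+
      2DUP = DONE? OR                  \ stop at the limit or on request
   UNTIL 2DROP ;
```

Typing something like 1000 COUNT-UP and pressing a key should pause the listing. Whether code 3 can be typed at all depends on the terminal, so treat the constant as a placeholder on other systems.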

A SEE written with these criteria in mind, along with DUMP , a word to display memory contents, can be a useful tool when performing a 'postmortem' after an attempt to add some new feature to one's Forth fails.

Cheers,
Jim


PostPosted: Sat Jan 26, 2019 3:22 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8545
Location: Southern California
You got pretty ambitious there, doing disassembly too! :D My SEE doesn't do that, but I did have it show more information. SEE has the following line in its definition to print a heading above a table of the decompiled word's contents:
Code:
   BOLDON
      ." word# addr  hex   ±hex   dec   ±dec   word name" 75 TAB ." ASCII"
   BOLDOFF

The hex ±hex dec ±dec part is to show the cell contents in unsigned and signed hex and decimal, in case the cell is numerical data rather than the CFA of a word. The ASCII part is of course for if it's part of a string. Above all that, it prints the name of the word, whether or not it's IMMEDIATE, and then its NFA, LFA, and CFA. I can't remember the last time I used it though.
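
A rough sketch of how one cell could be printed in those four forms, in standard Forth. The word name .CELL and the column widths are assumptions made for illustration, not the actual definition.
Code:
```forth
\ Sketch only: print one cell as hex, signed hex, decimal, signed decimal.
: .CELL  ( x -- )
   BASE @ >R                 \ remember the caller's base
   HEX      DUP 6 U.R        \ unsigned hex
            DUP 7 .R         \ signed hex
   DECIMAL  DUP 7 U.R        \ unsigned decimal
                7 .R         \ signed decimal
   R> BASE ! ;               \ restore the caller's base
```

On a 16-bit Forth a cell holding FFFF would come out as FFFF, -1, 65535 and -1. Saving and restoring BASE keeps the caller's number base intact.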

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Sat Jan 26, 2019 4:05 am 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
The last time I used SEE was to make sure the DOER/MAKE words compiled the right stuff.
With my mixing of high level and assembly code, it's good to have SEE able to decompile/disassemble both.
I took a tip from Charles Moore and don't let SEE mess with BASE. In my SEE , I used unsigned numeric display in the current base. My examples show HEX used, but it works just as well in decimal. To maintain proper spacing, I use 16BITS .R and 8BITS .R as appropriate.
Code:
5 CONSTANT 16BITS
: 8BITS  ( -- U )
   16BITS 1+ 2/ ;
: SETWIDTH  ( -- )
   FFFF 0 (UD.) NIP IS 16BITS ;

SEE's components include SETWIDTH.
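
The width calculation can be tried in standard Forth with the sketch below. It assumes VALUE and TO in place of the CONSTANT and IS used above, and <# #S #> in place of (UD.) , which presumably returns the address and length of the converted string. .WIDTHS is a made-up helper.
Code:
```forth
\ Sketch only: field widths that follow the current BASE.
DECIMAL
5 VALUE 16BITS                      \ digits needed for a 16-bit number
: 8BITS  ( -- u )  16BITS 1+ 2/ ;   \ digits needed for an 8-bit number
: SETWIDTH  ( -- )
   65535 0 <# #S #> NIP TO 16BITS ; \ count the digits of FFFF in BASE
: .WIDTHS  ( -- )  SETWIDTH  16BITS .  8BITS . ;
```

DECIMAL .WIDTHS prints 5 3 and HEX .WIDTHS prints 4 2 , which match the column widths in the decimal and hex listings.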


PostPosted: Fri Feb 01, 2019 11:27 pm 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
A useful change to my SEE is to include the address of the words that form a high level definition. Some words may be headless and some names may be in more than one vocabulary. Including the address is one sure way to know which word is used. Here is a disassembly of PLISTS , plain lists. It lists a range of blocks plainly ( no line numbers and no blank lines within a screen ).
Code:
SEE PLISTS
PLISTS
 5424 1281 1+
 5426 13EA SWAP
 5428  8B2 (DO) 5438
 542C  FEC I
 542E 53E8 PLIST
 5430 2F13 DONE?
 5432  93B ?LEAVE
 5434  8E8 (LOOP) 542C
 5438  94F EXIT
 OK

The address of I , $0FEC, shows that it is in the kernel and is therefore the I defined in the FORTH vocabulary, not the I ( short for insert) defined in the EDITOR vocabulary.
There is one other thing. I was wondering how the formatting looks, particularly when the disassembly transitions from high level to low and vice versa.
Code:
SEE #
#
 1CEA  AF4 BASE
 1CEC 1313 @
 1CEE 15A2 UD/MOD
 1CF0 13C1 ROT
 1CF2  9CE (>ASSEM)
 1CF4    0 ,X LDA
 1CF6    A  # CMP
 1CF8 1CFC    BCC
 1CFA    6  # ADC
 1CFC   30  # ADC
 1CFE    0 ,X STA
 1D00  9AA    JSR ' (>FORTH) >BODY
 1D03 1CC0 HOLD
 1D05  94F EXIT
 OK

Cheers,
Jim


PostPosted: Wed Mar 06, 2019 9:00 pm 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
JimBoyd wrote:
Here are my criteria for a useful SEE:
1 Display what was compiled for Forth words and code words.
2 Recognize the built in data types ( VARIABLE , CONSTANT , 2VARIABLE , 2CONSTANT , and USER ) and display their values.

I've given that second criterion some thought and simplified my SEE . It still has the same features for high level and code decompiling/disassembling, but SEE now tests for the following types:
1 code words -- disassemble as before
2 high level words -- decompile as before
Both still transition between code and high level as needed.
3 deferred words -- show what the 'action' is but don't disassemble it. Show its name and address of its CFA.
4 All others are assumed to be CREATE DOES> or CREATE ;CODE words.
For these words it finds the size of the PFA.
A) If it is zero, disassemble the code pointed to by the CFA.
B) If it is not zero, show the name of the word which sets this word's code field and the address pointed to by the CFA. Show the address and size of the PFA.

Here is a sample session:
Code:
 OK
VIEW OFF  OK
3 4 PLISTS
SCR# 3
// COMPATIBILITY
HEX
: CELLS   2* ;
: CELL+   2+ ;

SCR# 4
// TEST CREATE DOES>
HEX
: LEVEL0  ( N -- )
   CREATE , DOES>  ( N2 -- )
   CREATE @ , , DOES>  ( N3 -- )
   CREATE 2@ , , , DOES>  ( N4 -- )
   CREATE DUP CELL+ 2@ ROT @ , , , ,
      DOES>  ( -- )
      DUP 2 CELLS + 2@ ROT 2@ 4 0
      DO  U.  LOOP ;
2 LEVEL0 LEVEL1
3 LEVEL1 LEVEL2
5 LEVEL2 LEVEL3
7 LEVEL3 LEVEL4
CR LEVEL4
 OK
3 4 THRU 3 4
2 3 5 7  OK
SEE LEVEL0
LEVEL0
 57CE 229E CREATE
 57D0 1BEE ,
 57D2 1CB8 DOES
 57D4 2475    JMP ' DOES> >BODY 15 +
 57D7 229E CREATE
 57D9 1319 @
 57DB 1BEE ,
 57DD 1BEE ,
 57DF 1CB8 DOES
 57E1 2475    JMP ' DOES> >BODY 15 +
 57E4 229E CREATE
 57E6 1363 2@
 57E8 1BEE ,
 57EA 1BEE ,
 57EC 1BEE ,
 57EE 1CB8 DOES
 57F0 2475    JMP ' DOES> >BODY 15 +
 57F3 229E CREATE
 57F5 1428 DUP
 57F7 57BD CELL+
 57F9 1363 2@
 57FB 13F7 ROT
 57FD 1319 @
 57FF 1BEE ,
 5801 1BEE ,
 5803 1BEE ,
 5805 1BEE ,
 5807 1CB8 DOES
 5809 2475    JMP ' DOES> >BODY 15 +
 580C 1428 DUP
 580E  A8A 2
 5810 57AF CELLS
 5812 12ED +
 5814 1363 2@
 5816 13F7 ROT
 5818 1363 2@
 581A  862 CLIT 4
 581D  BB0 0
 581F  8B2 (DO) 5829
 5823 1DDD U.
 5825  8E8 (LOOP) 5823
 5829  94F EXIT
 OK
SEE LEVEL1
LEVEL1
LEVEL0 57D4
PFA: 5836    2 OK
SEE LEVEL2
LEVEL2
LEVEL0 57E1
PFA: 5843    4 OK
SEE LEVEL3
LEVEL3
LEVEL0 57F0
PFA: 5852    6 OK
SEE LEVEL4
LEVEL4
LEVEL0 5809
PFA: 5863    8 OK
8 PLIST
SCR# 8
// DOES> TESTS
HEX
: DOES1   DOES> @ 1+ ;
: DOES2   DOES> @ 2+ ;
CREATE CR1
1 ,
;S
CR  CR1 U. HERE U.
CR  DOES1 .S
CR  CR1 .S
CR  DOES2 .S
CR  CR1 .S
 OK
8 LOAD  OK
SEE CR1
CR1
CREATE 2357
PFA: 589D    2 OK
CR CR1 U.
589D  OK
HERE U. 589F  OK
CR DOES1 .S
EMPTY  OK
CR CR1 .S
    2  OK
CR DOES2 .S
    2  OK
CR CR1 .S
    2     3  OK
. . 3 2  OK
.S EMPTY  OK
SEE CR1
CR1
DOES2 588C
PFA: 589D    2 OK
: NOTHING CREATE DOES> U. ;  OK
NOTHING ZILCH  OK
SEE ZILCH
ZILCH
 ' NOTHING >BODY 4 +
 58AF 2475    JMP ' DOES> >BODY 15 +
 58B2 1DDD U.
 58B4  94F EXIT
 OK
10 ALLOT  OK
SEE ZILCH
ZILCH
NOTHING 58AF
PFA: 58C0   10 OK
CONSOLE

And some deferred words:
Code:
SEE RR/W
RR/W DEFERED TO (RR/W) 3D03  OK
SEE DR/W
DR/W DEFERED TO (DR/W) 28B6  OK
SEE VALID?
VALID? DEFERED TO (VALID?) 1B7E  OK
SEE ERR
ERR DEFERED TO NOOP C39  OK
SEE INITIAL
INITIAL DEFERED TO NOOP C39  OK


PostPosted: Sun Mar 31, 2019 8:25 pm 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
GARTHWILSON wrote:
The ASCII part is of course for if it's part of a string.

Do you mean an inline string literal as in the string following ABORT" in this definition?
Code:
: ?PAIRS  ( N1 N2 -- )
   <>
   ABORT" STRUCTURE MISMATCH" ;

Or even this?
Code:
: .DERR  ( -- )
   WHERE
   CR ." DISK ERROR:" CR 200 $? ;

Here is what they look like when decompiled:
Code:
SEE ?PAIRS
?PAIRS
 2234 11CB <>
 2236 1F96 (ABORT") STRUCTURE MISMATCH
 224B  950 EXIT
 OK

SEE .DERR
.DERR
 2653 1EE4 WHERE
 2655 1C83 CR
 2657 1B1F (.") DISK ERROR:
 2665 1C83 CR
 2667  825 LIT 200
 266B 1F83 $?
 266D  950 EXIT
 OK


PostPosted: Thu Mar 23, 2023 12:40 am 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
JimBoyd wrote:
JimBoyd wrote:
Here are my criteria for a useful SEE:
1 Display what was compiled for Forth words and code words.
2 Recognize the built in data types ( VARIABLE , CONSTANT , 2VARIABLE , 2CONSTANT , and USER ) and display their values.
I've given that second criterion some thought and simplified my SEE . It still has the same features for high level and code decompiling/disassembling, but SEE now tests for the following types:
1 code words -- disassemble as before
2 high level words -- decompile as before
Both still transition between code and high level as needed.
3 deferred words -- show what the 'action' is but don't disassemble it. Show its name and address of its CFA.
4 All others are assumed to be CREATE DOES> or CREATE ;CODE words.
For these words it finds the size of the PFA.
A) If it is zero, disassemble the code pointed to by the CFA.
B) If it is not zero, show the name of the word which sets this word's code field and the address pointed to by the CFA. show the address and size of the PFA.
I've simplified it further. For any word which is not a code word or high level word or deferred word, the name of the word where the CFA points (as well as its contents) and the size of the PFA are displayed.
Here are some excerpts from a print dump of SEEALL , a word to SEE all the words in the CONTEXT VOCABULARY .
Code:
1
CFA>  3153 ' (FIND) >BODY 88 +
PFA:  3301     0

0
CFA>  3189 ' (FIND) >BODY 124 +
PFA:  3295     0

FALSE
CFA>  3189 ' (FIND) >BODY 124 +
PFA:  3289     0

TRUE
CFA>  3166 ' (FIND) >BODY 101 +
PFA:  3279     0

 . . .

>IN
CFA>  9401 ' CONSTANT >BODY 6 +
PFA:  2554     2

BLK
CFA>  9055 ' CREATE >BODY 151 +
PFA:  2542     4

FENCE
CFA>  9055 ' CREATE >BODY 151 +
PFA:  2532     2

 . . .

(KEY)
  3757  3696
  3759  3735 ?KEY
  3761  2469 EXIT
6

?KEY
  3737   141    STX XSAVE
  3739 65508    JSR
  3742   141    LDX XSAVE
  3744  3155    JMP APUSH
10

 . . .

KEY DEFERED TO (KEY) 3755

PAUSE DEFERED TO NOOP 3367


Notice in the decompilation for (KEY) there is no name for the word at address 3696. It is headerless.
This is the final form of my version of SEE .


PostPosted: Fri Mar 24, 2023 5:43 pm 

Joined: Sun May 13, 2018 5:49 pm
Posts: 255
JimBoyd wrote:
I feel that SEE can be a useful tool if, instead of attempting to show the source of a Forth word, it shows what actually got compiled.
In TaliForth2 (an STC Forth that supports native compiling (inlining) of short words), that's extra important. Short words often have their assembly "native compiled" (e.g. copy/pasted, similar to inlining or using a macro) into the word being compiled, but long words are compiled as a JSR. This results in a mix of assembly and JSRs to words. Here's a quick example:
Code:
: square dup * ." The answer is " . ;  ok
5 square The answer is 25  ok
Code:
see square
nt: 800  xt: 80E
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 51

080E  20 0A D8 CA CA B5 02 95  00 B5 03 95 01 20 0F D8   ....... ..... ..
081E  20 D7 A5 E8 E8 4C 34 08  54 68 65 20 61 6E 73 77   ....L4. The answ
082E  65 72 20 69 73 20 20 8A  A0 26 08 0E 00 20 DE A4  er is  . .&... ..
083E  20 26 8C   &.

80E   D80A jsr     STACK DEPTH CHECK
811        dex
812        dex
813      2 lda.zx
815      0 sta.zx
817      3 lda.zx
819      1 sta.zx
81B   D80F jsr     STACK DEPTH CHECK
81E   A5D7 jsr     um*
821        inx
822        inx
823    834 jmp
834   A08A jsr     SLITERAL 826 E
83B   A4DE jsr     type
83E   8C26 jsr     .
My version doesn't print out strings, but it does give you their address and length in case you want to TYPE them. Many of the built-in words have a stack depth check before they run (which can be inhibited from being compiled into new words if you want to save a few bytes when compiling known good source code). After that, you can see the assembly for DUP because it was short enough to inline, the code for * was also inlined (stack depth check, then um*, then two INX instructions to DROP the upper half of the double result from um*). The literal string data from ." is skipped over, but then SLITERAL puts the address and length on the stack and TYPEs it, and finally . is compiled directly as a JSR because it is too long to be inlined here.

This notion of having SEE show "what was actually compiled" is very important in Tali because if I disable inlining completely by setting nc-limit (native compiling limit) to zero, I get:
Code:
0 nc-limit !  ok
: square dup * ." The answer is " . ; redefined square  ok
see square
nt: 842  xt: 850
flags (CO AN IM NN UF HC): 0 0 0 1 0 1
size (decimal): 36

0850  20 9F 8D 20 3B A1 4C 67  08 54 68 65 20 61 6E 73   .. ;.Lg .The ans
0860  77 65 72 20 69 73 20 20  8A A0 59 08 0E 00 20 DE  wer is   ..Y... .
0870  A4 20 26 8C  . &.

850   8D9F jsr     dup
853   A13B jsr     *
856    867 jmp
867   A08A jsr     SLITERAL 859 E
86E   A4DE jsr     type
871   8C26 jsr     .
 ok
The string and . are handled the same, but you can now see DUP and * were just compiled as JSRs.
JimBoyd wrote:
Here are my criteria for a useful SEE:
1 Display what was compiled for Forth words and code words.
2 Recognize the built in data types ( VARIABLE , CONSTANT , 2VARIABLE , 2CONSTANT , and USER ) and display their values.
3 Display a DEFERed word's vector ( the word a DEFERed word is set to execute) .
4 Transition seamlessly from decompiling/disassembling High level code to Low level and back.
That's a pretty good list. Tali's SEE does the seamless high/low level code thing, but it really has to because Tali is an STC Forth. I recently added recognition of strings because the disassembler was gacking on them (it didn't know to skip over the data and tried to disassemble the string), but decided to just print the address and length of the string because I can always use a TYPE command if I need to see the string. Tali also includes a DUMP of the word, and the string data is visible there if it's a constant string that was compiled directly into the word (e.g. with ." or similar).
Having it show the values in constants and variables might be nice, and showing what a deferred word points to also sounds useful. I'll consider adding those to Tali's SEE.


PostPosted: Tue Mar 28, 2023 12:44 am 

Joined: Fri May 05, 2017 9:27 pm
Posts: 895
It no longer shows the value of constants and variables. Any word which is not a code word, a high level word, or a deferred word has relatively basic information displayed as shown below.
Code:
SEE BLK
BLK
CFA>  9055 ' CREATE >BODY 151 +
PFA:  2542     4  OK

It just seemed like I was doing extra work to show the value for all the built in types: VARIABLE 2VARIABLE CONSTANT 2CONSTANT VALUE VOCABULARY . There is also SARRAY XTTABLE and AMONG , but the child words of these last three have a variable amount of data. SEE has to be aware of the internal strings compiled by " (quote) ." (dot quote) and ABORT" to skip over them, so I decided it was a good idea to go ahead and display them.
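
The skipping step can be sketched in standard Forth as below. It assumes the inline string is stored as a count byte followed by the characters, with no alignment padding, which fits the addresses in the ?PAIRS and .DERR listings earlier in the thread. .INLINE and DEMO are hypothetical names, and C" only stands in for a string compiled by ." .
Code:
```forth
\ Sketch only: type an inline counted string and return the address
\ just past it, where decompiling would resume.
: .INLINE  ( c-addr -- c-addr' )
   COUNT 2DUP TYPE + ;

: DEMO  ( -- )
   C" DISK ERROR:"       \ stands in for a string compiled by ."
   DUP .INLINE           \ types the string
   SWAP - CR . ;         \ bytes consumed: count byte plus characters
```

DEMO reports 12 bytes consumed: one count byte plus eleven characters. In the .DERR listing, (.") sits at 2657, its string starts at 2659, and 2659 plus hex C gives 2665, the address of the CR that follows.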
I'm satisfied with Fleet Forth's current version of SEE . Some Forths have a version of SEE which tries to reconstruct the source. This is the kind of SEE I refer to as a parlor trick. Durex Forth for the Commodore 64 has this type of SEE .
When I ported Leo Brodie's DOER/MAKE to Durex Forth, that version of SEE was useless. It was also useless when I ported halting colon definitions to Durex Forth.

