All times are UTC

Posted: Sun Nov 09, 2014 10:18 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
So one of the things that will make you want to bang your head on the table about Forth is that between FIG and F83, they changed the way loops work. These days, if you create a loop such as 0 0 DO, it will go through the complete range of the cell (65536 iterations with 16-bit cells) instead of just quitting. Formally, the loop is complete when the boundary between limit-1 and limit is crossed, or something to that effect. Seriously, why?
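
To make the difference concrete, here is a minimal Python sketch (not Forth) that counts iterations under both rules, assuming 16-bit cells. The function names are made up for this illustration, and the FIG-style rule is modeled as "increment, then keep looping while index < limit in signed terms", which is an assumption about the older behavior rather than something taken from a specification.

```python
CELL_BITS = 16                      # assumption: 16-bit cells, as on a 65c02 Forth
MASK = (1 << CELL_BITS) - 1
SIGN = 1 << (CELL_BITS - 1)


def to_signed(value):
    """Interpret a cell as a two's-complement signed number."""
    value &= MASK
    return value - (1 << CELL_BITS) if value & SIGN else value


def fig_loop_count(limit, index):
    """FIG-style rule (as assumed here): increment, continue while index < limit."""
    count = 0
    while True:
        count += 1                  # the loop body always runs at least once
        index = (index + 1) & MASK
        if to_signed(index) >= to_signed(limit):
            return count


def newer_loop_count(limit, index):
    """Newer rule: stop when the index steps from limit-1 to limit."""
    count = 0
    while True:
        count += 1
        index = (index + 1) & MASK
        if index == (limit & MASK):
            return count


print(fig_loop_count(0, 0))         # 0 0 DO under the FIG-style rule: 1
print(newer_loop_count(0, 0))       # 0 0 DO under the newer rule: 65536
```

For ordinary loops such as 10 0 DO both functions agree; they only part ways when the index starts at or past the limit.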

Whatever. Mike had pointed out that there is a clever way to code this using the Overflow Flag (V) on the 65m32 (see http://forum.6502.org/viewtopic.php?f=9&t=2026). I've started to create versions for Tali Forth (which up till now used the FIG Forth loops) on the 65c02. In case anybody has the same problem, these are the code snippets for (DO), I, J, and (LOOP). I'll post (+LOOP) when I get around to writing it. Note that these routines have had only minimal testing so far, so I'm not sure about the edge cases.

First, (DO)

Code:
l_pdo:          bra a_pdo
                .byte CO+NC+$04
                .word l_i    ; link to I
                .word z_pdo
                .byte "(DO)"
.scope
a_pdo:          ; first step: create fudge factor (FUFA) by subtracting the limit
                ; from $8000, the number that will trip the overflow flag
                sec
                lda #$00
                sbc 3,x         ; LSB of limit
                sta 3,x         ; save FUFA for later use
                lda #$80
                sbc 4,x         ; MSB of limit
                sta 4,x         ; save FUFA for later use
                pha             ; FUFA replaces limit on R stack
                lda 3,x         ; LSB of FUFA (the limit was overwritten above)
                pha

                ; second step: index is FUFA plus original index
                clc
                lda 1,x         ; LSB of original index
                adc 3,x         ; add LSB of FUFA
                sta 1,x
                lda 2,x         ; MSB of original index
                adc 4,x         ; add MSB of FUFA
                pha
                lda 1,x         ; LSB of index
                pha

                ; we've saved the FUFA on the NOS of the R stack, so we can
                ; use it later
                inx
                inx
                inx
                inx

z_pdo:          rts
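
The arithmetic of (DO) is easier to follow with numbers. This is a small Python sketch of the two steps, assuming 16-bit cells; `do_setup` and `increments_until_overflow` are hypothetical helper names, not part of Tali Forth.

```python
MASK = 0xFFFF                       # assumption: 16-bit cells


def do_setup(limit, index):
    """Mirror what (DO) leaves on the return stack: (FUFA, fudged index)."""
    fufa = (0x8000 - limit) & MASK          # first step: $8000 - limit
    fudged_index = (index + fufa) & MASK    # second step: index + FUFA
    return fufa, fudged_index


def increments_until_overflow(fudged_index):
    """How many +1 steps until the fudged index goes from $7FFF to $8000."""
    return ((0x8000 - fudged_index - 1) & MASK) + 1


fufa, fudged = do_setup(limit=10, index=0)
print(f"FUFA=${fufa:04X} fudged=${fudged:04X}")     # FUFA=$7FF6 fudged=$7FF6
print(increments_until_overflow(fudged))             # 10

fufa, fudged = do_setup(limit=0, index=0)
print(increments_until_overflow(fudged))             # 65536
```

The point of the fudge is that the real index reaching the limit always lines up with the fudged index reaching $8000, which is exactly where a signed 16-bit addition overflows.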

Then I and J:

Code:
l_i:            bra a_i
                .byte NC+CO+$01
                .word l_j    ; link to J
                .word z_i
                .byte "I"
.scope
a_i:            dex
                dex

                ; get the fudged index off of the top of the stack. it's
                ; easier to do math on the stack directly than to pop and
                ; push stuff around
                stx TMPX
                tsx

                sec
                lda $0101,x     ; LSB
                sbc $0103,x
                sta TMPCNT

                lda $0102,x     ; MSB
                sbc $0104,x

                ldx TMPX

                sta 2,x         ; MSB of de-fudged index
                lda TMPCNT
                sta 1,x         ; LSB of de-fudged index

z_i:            rts             ; should never be reached, because NC
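
As a cross-check of the de-fudging, here is a Python sketch that models the return stack as a list of 16-bit cells. The names `enter_loop` and `loop_index` are made up for this example; `depth=1` stands for J, which reads the pair one loop further down.

```python
MASK = 0xFFFF                       # assumption: 16-bit cells


def enter_loop(rstack, limit, index):
    """Push what (DO) pushes: FUFA first, then the fudged index on top."""
    fufa = (0x8000 - limit) & MASK
    rstack.append(fufa)
    rstack.append((index + fufa) & MASK)


def loop_index(rstack, depth=0):
    """depth=0 models I, depth=1 models J (one loop further down)."""
    fudged = rstack[-1 - 2 * depth]
    fufa = rstack[-2 - 2 * depth]
    return (fudged - fufa) & MASK   # subtracting FUFA gives the real index back


rstack = []
enter_loop(rstack, limit=100, index=7)    # outer loop
enter_loop(rstack, limit=10, index=3)     # inner loop
print(loop_index(rstack, depth=0))        # I: 3
print(loop_index(rstack, depth=1))        # J: 7
```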


The code for J is basically the same except that the addresses on the stack are each four bytes down. Finally, (LOOP):

Code:
l_ploop:        bra a_ploop
                .byte NC+CO+$06
                .word l_abs     ; link to ABS
;               .word l_pploop    ; link to PPLOOP # TODO change
                .word z_ploop
                .byte "(LOOP)"
.scope
a_ploop:        ; TOP of the Return Stack has the index. We manipulate the
                ; 65c02 stack in place.
                stx TMPX
                tsx

                clc
                lda $0101,x     ; LSB
                adc #$01
                sta $0101,x

                clv             ; we check the V flag on MSB
                lda $0102,x     ; MSB
                adc #$00        ; we only care about the carry       
                sta $0102,x

                ldx TMPX

                bvs _hack+3     ; skip over JMP instruction

_hack:          ; This is why this routine must be natively compiled: We
                ; compile the opcode for JMP here without an address to
                ; go to; LOOP compiles the address right after it.
                .byte $4C

z_ploop:        rts             ; never reached
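
To see when V actually gets set, the two ADC instructions can be modeled byte by byte. This is a Python sketch of binary-mode ADC as I understand it, not a full CPU emulation; `adc`, `loop_increment` and `count_iterations` are hypothetical names.

```python
def adc(a, operand, carry):
    """8-bit binary-mode ADC: returns (result, carry_out, overflow)."""
    total = a + operand + carry
    result = total & 0xFF
    carry_out = 1 if total > 0xFF else 0
    # V is set when both inputs share a sign that the result does not have
    overflow = ((a ^ result) & (operand ^ result) & 0x80) != 0
    return result, carry_out, overflow


def loop_increment(fudged):
    """Add 1 to a 16-bit fudged index the way (LOOP) does, low byte first."""
    lsb, carry, _ = adc(fudged & 0xFF, 0x01, 0)       # CLC / ADC #$01
    msb, _, v_flag = adc(fudged >> 8, 0x00, carry)    # ADC #$00
    return (msb << 8) | lsb, v_flag


def count_iterations(limit, index):
    """Count loop passes until the MSB addition reports overflow."""
    fudged = (index + 0x8000 - limit) & 0xFFFF
    count = 0
    while True:
        count += 1
        fudged, v_flag = loop_increment(fudged)
        if v_flag:
            return count


print(loop_increment(0x7FFF))       # (32768, True)
print(loop_increment(0xFFFF))       # (0, False)
print(count_iterations(10, 0))      # 10
```

Note that in this model the overflow result of the second `adc` depends only on its own inputs and carry, never on an earlier value of V.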

The actual DO and LOOP words are high-level:

Code:
: DO POSTPONE (DO) HERE ; IMMEDIATE COMPILE-ONLY
: LOOP POSTPONE (LOOP) , POSTPONE UNLOOP ; IMMEDIATE COMPILE-ONLY

If you want to know why 32 bits make life easier, compare how long this code is to what Mike wrote :-) .


Posted: Mon Nov 10, 2014 11:42 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
And heeeere's +LOOP:
Code:
l_pploop:       bra a_pploop
                .byte NC+CO+$07
                .word l_abs     ; link to ABS
                .word z_pploop
                .byte "(+LOOP)"
.scope
a_pploop:       clc
                pla             ; LSB of index
                adc 1,x         ; LSB of step
                tay             ; temporary storage of LSB

                clv
                pla             ; MSB of index
                adc 2,x         ; MSB of step
                pha             ; put MSB of index back on stack

                tya             ; put LSB of index back on stack
                pha

                inx             ; dump step from TOS
                inx

                bvs _hack+3     ; skip over JMP instruction

_hack:          ; This is why this routine must be natively compiled: We
                ; compile the opcode for JMP here without an address to
                ; go to; +LOOP compiles the address right after it.
                .byte $4C

z_pploop:       rts             ; never reached
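
The same trick with an arbitrary step, including a negative one, can be checked with a short Python sketch. It assumes 16-bit cells and treats the V flag of the MSB addition as the signed overflow of the whole 16-bit sum; `add16_with_v` and `plus_loop_indices` are names invented for this example.

```python
MASK = 0xFFFF                       # assumption: 16-bit cells


def add16_with_v(a, b):
    """16-bit add; V reports signed overflow, as the MSB ADC would."""
    result = (a + b) & MASK
    v_flag = ((a ^ result) & (b ^ result) & 0x8000) != 0
    return result, v_flag


def plus_loop_indices(limit, start, step):
    """Return the values I would take for `limit start DO ... step +LOOP`."""
    fufa = (0x8000 - limit) & MASK
    fudged = (start + fufa) & MASK
    seen = []
    while True:
        index = (fudged - fufa) & MASK
        seen.append(index - 0x10000 if index & 0x8000 else index)
        fudged, v_flag = add16_with_v(fudged, step & MASK)
        if v_flag:                  # boundary between limit-1 and limit crossed
            return seen


print(plus_loop_indices(10, 0, 3))      # [0, 3, 6, 9]
print(plus_loop_indices(0, 10, -1))     # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

The downward loop includes 0 because the boundary that counts is the one between limit-1 and limit, which is only crossed on the step from 0 to -1.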


Posted: Tue Nov 11, 2014 12:27 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
Looks pretty tidy, Scot! My only nit to pick is your unnecessary use of CLV before ADC ... at least I'm pretty sure that it's unnecessary! You are progressing more quickly than I am in learning Forth ... I have no clue what your high-level words are doing (at least how they're doing it).

Mike


Posted: Tue Nov 11, 2014 11:52 am

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
Thanks! When I find the time, I'll see if the variant using direct manipulation of the Stack Pointer as in (LOOP) or the TAY version as in (+LOOP) is better; I'm tending towards the latter. To be honest, I'm not sure about the CLV, but wanted to be safe.

As for the high-level commands -- I went crazy trying to understand POSTPONE at first. It doesn't help that it does different things based on if the command is IMMEDIATE or normal. What happens (and I am almost completely sure of this) is that at run-time, the POSTPONE causes the code of (DO) to be copied into the new word even though it is actually IMMEDIATE. Then HERE is executed because of the IMMEDIATE, leaving the current address cell on the Data Stack. So that is DO.

When it is time for LOOP, the code of (LOOP) is again copied into the new word and not executed because of POSTPONE, while the , ("comma") causes the address that DO put on the Data Stack to be saved right after the assembled (LOOP) code. POSTPONE UNLOOP just adds the code to drop the top two entries (four bytes) off the Return Stack -- the Index and Limit of the loop.

The part that took me so long to understand is that (LOOP) just compiles the opcode of the JMP instruction -- but not the operand! -- into place. The address is provided by the HERE instruction of (DO) in combination with the , ("comma") later. It only really clicked when I DUMPed the compiled word and went through a hard copy of the machine code with a pencil. But don't tell that to anybody.
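
Here is the compile step again as a Python sketch, with the dictionary as a byte buffer. The byte strings are placeholders (NOPs and PLAs) standing in for the real routines, and the `Compiler` class with its method names is invented for this illustration; only the order of events follows the description above.

```python
# Placeholder machine code; the real bytes are the routines shown earlier.
PDO_CODE = bytes([0xEA] * 4)                # hypothetical stand-in for (DO)
PLOOP_CODE = bytes([0xEA] * 6) + b"\x4C"    # stand-in for (LOOP), ends in JMP opcode
UNLOOP_CODE = bytes([0x68] * 4)             # stand-in for UNLOOP


class Compiler:
    def __init__(self, origin):
        self.origin = origin
        self.code = bytearray()
        self.data_stack = []

    def here(self):
        """Address of the next free byte, like HERE."""
        return self.origin + len(self.code)

    def copy_native(self, machine_code):
        """Native compile: copy the routine's bytes into the new word."""
        self.code += machine_code

    def comma(self, cell):
        """Store a 16-bit cell, low byte first, like , ("comma")."""
        self.code += bytes([cell & 0xFF, (cell >> 8) & 0xFF])

    def do(self):                   # : DO POSTPONE (DO) HERE ;
        self.copy_native(PDO_CODE)
        self.data_stack.append(self.here())

    def loop(self):                 # : LOOP POSTPONE (LOOP) , POSTPONE UNLOOP ;
        self.copy_native(PLOOP_CODE)
        self.comma(self.data_stack.pop())   # operand for the dangling JMP
        self.copy_native(UNLOOP_CODE)


c = Compiler(origin=0x2000)
c.do()
body_start = c.here()
c.copy_native(b"\xEA\xEA")          # stand-in for the loop body
c.loop()
jmp_at = c.code.index(0x4C)
target = c.code[jmp_at + 1] | (c.code[jmp_at + 2] << 8)
print(hex(target), hex(body_start))  # 0x2004 0x2004
```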

The drawback is that I don't see a way to easily reuse the code of say (LOOP) for (+LOOP) because a BRA wouldn't have the right target. So there is a certain amount of duplication here. Once I get everything coded -- I'm still missing a bunch of very basic stuff like LEAVE -- I'll go back and try to figure a way around this. It seems silly to have so much code twice.


Posted: Tue Nov 11, 2014 2:20 pm

Joined: Tue Jan 07, 2014 8:40 am
Posts: 91
scotws wrote:
I went crazy trying to understand POSTPONE at first. It doesn't help that it does different things based on if the command is IMMEDIATE or normal. What happens (and I am almost completely sure of this) is that at run-time, the POSTPONE causes the code of (DO) to be copied into the new word even though it is actually IMMEDIATE. Then HERE is executed because of the IMMEDIATE, leaving the current address cell on the Data Stack. So that is DO.


Your understanding is correct. DO is executed during compilation because it is an IMMEDIATE word. Inside DO, POSTPONE causes (DO) to be compiled into the new word. Then HERE is executed, leaving an address on the stack.

If you're familiar with fig-Forth, here is one way to think of POSTPONE:
POSTPONE, if applied to an IMMEDIATE word, does the action of [COMPILE].
POSTPONE, if applied to a non-IMMEDIATE word, does the action of COMPILE.

So, on an indirect-threaded Forth,
POSTPONE DO will cause the cfa of DO to be compiled into this word, even though DO is IMMEDIATE. (Strictly speaking, the function of POSTPONE is to cause this word to perform the compile-time action of DO.)
POSTPONE (DO) will cause the cfa of (DO) to be compiled into the new word being defined.

Since POSTPONE is performing a compilation function, any word that uses POSTPONE should be an IMMEDIATE word (executed during compilation).
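
The two cases can be modeled in a few lines of Python. This is only a toy: a "definition" is a list of steps, each step receives the word currently being compiled, and the names `Word`, `postpone`, `run` and MY-DO are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Word:
    name: str
    immediate: bool
    action: Callable[[list], None]   # receives the definition being compiled


def postpone(word, definition):
    """Append to `definition` what POSTPONE <word> would compile into it."""
    if word.immediate:
        # like fig-Forth [COMPILE]: the immediate word itself becomes a step
        definition.append(word.action)
    else:
        # like fig-Forth COMPILE: the step compiles `word` into a later target
        definition.append(lambda target, w=word: target.append(w.name))


def run(definition, target):
    """Execute a definition while `target` is the word being compiled."""
    for step in definition:
        step(target)


# : DO POSTPONE (DO) ; IMMEDIATE      (HERE left out to keep the model small)
paren_do = Word("(DO)", immediate=False, action=lambda target: None)
do_steps = []
postpone(paren_do, do_steps)
do = Word("DO", immediate=True, action=lambda target: run(do_steps, target))

# : MY-DO POSTPONE DO ; IMMEDIATE     (MY-DO is a made-up word)
my_do_steps = []
postpone(do, my_do_steps)

user_word = []                        # some colon definition that uses MY-DO
run(my_do_steps, user_word)
print(user_word)                      # ['(DO)']
```

In both cases the word that ends up in the user's definition is (DO); POSTPONE only decides at which level the compiling happens.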

_________________
Because there are never enough Forth implementations: http://www.camelforth.com


Posted: Tue Nov 11, 2014 4:39 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8147
Location: Midwestern USA
scotws wrote:
To be honest, I'm not sure about the CLV, but wanted to be safe.

In this case it isn't necessary.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Wed Nov 12, 2014 2:26 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
scotws wrote:
... The drawback is that I don't see a way to easily reuse the code of say (LOOP) for (+LOOP) because a BRA wouldn't have the right target. So there is a certain amount of duplication here. Once I get everything coded -- I'm still missing a bunch of very basic stuff like LEAVE -- I'll go back and try to figure a way around this. It seems silly to have so much code twice.

I'm not sure that my idea would work well on the 'c02 (actually I'm not even sure that it'll work well on the 'm32 yet), but I found that the easiest way to avoid repetition in the primitives for (LOOP) and (+LOOP) was to code (LOOP) as (1 +LOOP). Pushing a 1 on the dstack isn't an expensive operation on the 'm32, and I've been concentrating on code density over speed anyway. Am I missing something important with respect to their immediate counterparts, and the way in which they make use of these run-time versions?

Mike

[I think that your (+LOOP) looks tidier than your (LOOP), all other things being equal ...]


Posted: Wed Nov 12, 2014 1:17 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
barrym95838 wrote:
(...) but I found that the easiest way to avoid repetition in the primitives for (LOOP) and (+LOOP) was to code (LOOP) as (1 +LOOP).

Ah. I, well, uh ... wait ...
Code:
: LOOP POSTPONE 1 POSTPONE (+LOOP) , POSTPONE UNLOOP ; IMMEDIATE COMPILE-ONLY

Works perfectly, and we can ditch (LOOP) completely. Now I feel like an idiot, but the code is better :-) . Thanks!


Posted: Wed Nov 12, 2014 4:17 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
Sometimes a student can help another student more easily than the teacher can, but I hesitate to claim this small victory until a professor wanders by and gives our idea a "thumbs-up". It looks like a classic trade-off of compactness vs. speed, and I almost always lean toward the former when push comes to shove.

Mike

[I am really enjoying your company on this little Forth "adventure of discovery", and I selfishly encourage you to keep us up to date on your journey.]


Last edited by barrym95838 on Wed Nov 12, 2014 4:56 pm, edited 1 time in total.

Posted: Wed Nov 12, 2014 4:55 pm

Joined: Tue Jan 07, 2014 8:40 am
Posts: 91
That should certainly work, provided that you have defined 1 as a Forth word (e.g. a Forth CONSTANT). If 1 is normally compiled as a numeric literal, you'd have to do something like POSTPONE LIT 1 , (where LIT is the run-time code for numeric literals).

1 is so commonly used, though, that most Forths do define it as a constant.



Posted: Wed Nov 12, 2014 11:23 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
It actually looks like this:
Code:
; ONE ( -- 1 ) ("1")
; Commonly used number, hard-coded for speed

l_one:          bra a_one
                .byte NC+$01
                .word l_zero    ; link to ZERO ("0")
                .word z_one
                .byte "1"

a_one:          dex
                dex
                lda #$01
                sta 1,x         ; LSB
                stz 2,x         ; MSB

z_one:          rts

I've included 0 and 2 as well.


Posted: Thu Nov 13, 2014 12:15 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
I have noticed that your PSP (somewhat unusually) points to the next open stack byte, like the RSP (6502-style). Was this intentional, or did it just happen that way?

The 'm32 does it with a
Code:
        sta ,-x
        lda #1

because a holds TOS and x points directly at NOS, very similarly to the way a 6809 would likely do it (with D and U).

Mike


Posted: Thu Nov 13, 2014 1:22 pm

Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
I did that so I don't get confused about which stack has what structure, so yes, the criterion was "the same as the system stack". Fairly early on, I made this ASCII drawing of the Data Stack:
Code:
          $00  +---------------+  <-- SPMAX
               |           ... |
               +-             -+
               |               |  $FE,x
               +-   (Empty)   -+
               |               |  $FF,x
               +-             -+
               |               |  <-- Stack Pointer (X Register)
               +===============+
               |              L|  $1,x
               +-    Cell     -+
               |              M|  $2,x
               +---------------+
               |              L|  $3,x
               +-    Cell     -+
          $7F  |              M|  $4,x   <-- SP0
               +===============+

As a style rule, I try to move the Stack Pointer before I access the data so I don't have code like STA $FF,X . If I remember correctly, FIG Forth does that at various times, and I found it confusing at first.
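
The layout in the drawing, with X pointing at the next free byte, can be modeled like this in Python. It is a sketch assuming the empty-stack value $7F from the drawing; the `DataStack` class is invented for the example.

```python
SP0 = 0x7F                          # empty-stack value of X, taken from the drawing


class DataStack:
    def __init__(self):
        self.zero_page = bytearray(256)
        self.x = SP0                # X points at the next free byte

    def push(self, cell):
        self.x = (self.x - 2) & 0xFF                               # DEX / DEX first ...
        self.zero_page[(self.x + 1) & 0xFF] = cell & 0xFF          # ... then STA 1,X (LSB)
        self.zero_page[(self.x + 2) & 0xFF] = (cell >> 8) & 0xFF   # and STA 2,X (MSB)

    def pop(self):
        lsb = self.zero_page[(self.x + 1) & 0xFF]                  # LDA 1,X
        msb = self.zero_page[(self.x + 2) & 0xFF]                  # LDA 2,X
        self.x = (self.x + 2) & 0xFF                               # INX / INX
        return lsb | (msb << 8)


stack = DataStack()
stack.push(1)
stack.push(0x1234)
print(hex(stack.x))                 # 0x7b, two cells below SP0
print(hex(stack.pop()))             # 0x1234
print(stack.pop())                  # 1
```

Because the pointer moves before any store, every access uses offsets 1 and 2 (or higher), and nothing is ever written at or below X itself.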


Posted: Thu Nov 13, 2014 11:07 pm

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3346
Location: Ontario, Canada
scotws wrote:
As a style rule, I try to move the Stack Pointer before I access the data so I don't have code like STA $FF,X . If I remember correctly, FIG Forth does that at various times, and I found it confusing at first.
It's true that in a few places 6502 Fig is coded to look at the cell that is "topper" than top of stack. :D That trick can save some bytes and cycles, and in the original context it's 100% reliable -- for example it's not like the hardware stack, which might get borrowed and written to at any time due to an interrupt. AFAIK the only pitfall is for an '816 in Native Mode, which won't "wrap" the $FF,X access within page zero as intended but will go into page 1 instead. Your decision to always move the Stack Pointer (X, that is) first eliminates even that unlikely issue.
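
The wrap-around difference can be put in numbers with a small Python sketch. It assumes an 8-bit X register and models only the address calculation; `direct_page_x` and its parameters are hypothetical names.

```python
def direct_page_x(offset, x, native_816=False, direct_register=0x0000):
    """Effective address of an `offset,X` access with an 8-bit X register."""
    if native_816:
        # '816 Native Mode: no wrap at the page boundary, only within bank 0
        return (direct_register + offset + x) & 0xFFFF
    # 6502 / 65C02: the sum wraps around within page zero
    return (offset + x) & 0xFF


x = 0x7B                                              # some data stack pointer value
print(hex(direct_page_x(0xFF, x)))                    # 0x7a, the byte just below X
print(hex(direct_page_x(0xFF, x, native_816=True)))   # 0x17a, lands in page 1
```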

As for having X indicate the first byte free (rather than the first byte used), that wouldn't be my choice but it does offer consistency with the quirky 6502 stack. (I'd rather deal with one quirky stack than two.) Just a reminder -- make sure your policy is duly accounted for in the word SP@ if you have it.


Brad R wrote:
If you're familiar with fig-Forth, here is one way to think of POSTPONE:
POSTPONE, if applied to an IMMEDIATE word, does the action of [COMPILE].
POSTPONE, if applied to a non-IMMEDIATE word, does the action of COMPILE
Thanks, Brad. Fig is a good point of reference, and I appreciate the clarification. :)

- Jeff
[Edit: "access within page zero as intended but will go into page 1 instead"]

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Last edited by Dr Jefyll on Fri Nov 14, 2014 5:55 am, edited 1 time in total.

Posted: Fri Nov 14, 2014 5:37 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8147
Location: Midwestern USA
Dr Jefyll wrote:
AFAIK the only pitfall is for an '816 in Native Mode, which won't "wrap" the $FF,X access within page 1 as intended but will go into page 2 instead.

You mean page zero and page 1?


