6502.org Forum

All times are UTC




Posted: Wed Dec 31, 2014 11:42 pm
Jeff:

As usual, you are likely correct on this issue, but I would not have gotten the fig-FORTH kernel running if it were not relocatable. Since relocatability requires relative offsets, I think that is a benefit of the fig-FORTH model. Most older processors like the 6502/65C02 do not provide good support for relative addressing. What support for relative addressing they do provide is in a form that's not generally available for direct use by the FORTH VM, i.e. native 8-bit relative conditional/unconditional branch instructions.

When I looked at the BRANCH and 0BRANCH words of the fig-FORTH model, there are a lot of FORTH VM operations being performed to synthesize the relative branches used in the fig-FORTH kernel. That process is expensive in terms of native machine instruction cycles due to the repeated use of DOCOL (ENTER) and NEXT to advance the FORTH VM through the code.

Some time ago, I extended the M65C02A instruction set to include the BRA rel16 and the PHR rel16 instructions. Thus, with the incorporation of the FORTH VM registers, IP and W, (per your recommendation, or more accurately, with your prodding :D ) into the core and the pre-existing support logic for 16-bit relative addressing, all of the logic is in place in the M65C02A core to add support for IP-relative branching. IOW, I believe all that will be required are changes to the microprogram in order to use IP instead of PC for the base and as a pointer to the 16-bit offset.

I have been considering implementing this support by overloading the IND prefix instruction. If IND is applied to the native Bcc rel8 instructions, the microprogram will implement Bcc [IP++] rather than the normal Bcc [PC++].
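
As a rough illustration of the IP-relative branching discussed above, here is a minimal Python sketch of BRANCH and 0BRANCH. It is a toy model, not the M65C02A microprogram: the helper names (`read16`, `write16`, `branch`, `zero_branch`) are hypothetical, and it assumes the 16-bit in-line offset is added to the address of the offset cell itself.

```python
# Toy model of IP-relative BRANCH / 0BRANCH for a threaded Forth VM.
# Assumption (not confirmed by the thread): the 16-bit in-line offset is added
# to the address of the offset cell itself, modulo 64 KiB.

MASK16 = 0xFFFF

def read16(mem, addr):
    """Read a little-endian 16-bit cell from a dict-based memory."""
    return mem[addr & MASK16] | (mem[(addr + 1) & MASK16] << 8)

def write16(mem, addr, value):
    """Write a little-endian 16-bit cell."""
    mem[addr & MASK16] = value & 0xFF
    mem[(addr + 1) & MASK16] = (value >> 8) & 0xFF

def branch(mem, ip):
    """BRANCH: IP points at the offset cell; wraparound handles negative offsets."""
    return (ip + read16(mem, ip)) & MASK16

def zero_branch(mem, ip, flag):
    """0BRANCH: branch when flag is zero, otherwise skip the 2-byte offset cell."""
    return branch(mem, ip) if flag == 0 else (ip + 2) & MASK16

mem = {}
write16(mem, 0x1000, 0x0006)    # forward offset of +6
write16(mem, 0x2000, 0xFFFC)    # backward offset of -4 (two's complement)

print(hex(branch(mem, 0x1000)))          # forward branch target
print(hex(branch(mem, 0x2000)))          # backward branch target
print(hex(zero_branch(mem, 0x1000, 1)))  # non-zero flag: fall through
```

Because the addition is done modulo 64 KiB, a negative offset needs no explicit sign extension, which is the property a hardware IP-relative adder would also rely on.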

PS: Thanks for the congratulations.

All:

Have a safe and happy New Year's and we'll communicate again next year. :)

_________________
Michael A.


Posted: Thu Jan 01, 2015 5:17 pm
MichaelM wrote:
Have a safe and happy New Year's and we'll communicate again next year. :)
Whoops, is it "next year" already? That was fast! :wink:

MichaelM wrote:
Since to be relocatable requires relative offsets, I think that is a benefit to the fig-FORTH model.
barrym95838 wrote:
The link addresses and CFAs are all absolute
To run from a different place in memory, Forth would have to be reassembled to fix all absolute references, as Mike noted. IOW branch destinations are not the only issue. Perhaps we are using the word relocatable in not quite the same way. It seems there's some misunderstanding somewhere.
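
The point about absolute references can be sketched with a small Python model. The image layout, the addresses, and the cell kinds below are all hypothetical, chosen only to show that relative branch offsets survive a move while absolute link addresses and CFAs do not.

```python
# Hypothetical threaded-code image: each cell is (kind, 16-bit value).
# "link" and "cfa" cells hold absolute addresses; "offset" cells are IP-relative.
OLD_BASE, NEW_BASE, SIZE = 0x1000, 0x8000, 0x0300

image = {
    0x1000: ("link", 0x10F0),      # absolute link to a previous dictionary entry
    0x1002: ("cfa", 0x1100),       # absolute CFA of a word inside the image
    0x1004: ("cfa", 0x1200),
    0x1006: ("offset", 0xFFFC),    # relative branch offset (-4)
}

def copy_without_fixups(image, old_base, new_base):
    """Move every cell to the new base without patching its contents."""
    return {addr - old_base + new_base: cell for addr, cell in image.items()}

def stale_cells(moved, old_base, size):
    """List moved cells whose absolute value still points into the old image."""
    return sorted(
        addr for addr, (kind, value) in moved.items()
        if kind in ("link", "cfa") and old_base <= value < old_base + size
    )

moved = copy_without_fixups(image, OLD_BASE, NEW_BASE)
print([hex(a) for a in stale_cells(moved, OLD_BASE, SIZE)])
```

In this toy image only the offset cell is still valid after the copy; every link and CFA cell would need a fixup, which is what reassembly does.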

I admit being perplexed by FIG Forth's use of IP-relative addressing. Regarding relocatability, half a solution is the same as no solution. And, with relocatability ruled out, I see no benefit to justify the slight IP-relative performance hit. New 65xx implementations such as yours may reduce or eliminate the hit, and perhaps there is some value in allowing flawed legacy code to run faster. I just hope you're not devoting a lot of resources to this. It seems to me a fairly trivial rewrite of the FIG code would eliminate IP-relative addressing and its performance hit. If I'm mistaken, or if IP-relative branches offer a benefit not yet mentioned, I hope someone will point it out.

cheers,
Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Tue Jul 21, 2015 11:18 pm
It's been a long time in coming, but I've finally begun testing the Forth VM support I've included in the M65C02A core. The following diagram shows the execution of two M65C02A FORTH instructions which can be used to implement the DTC FORTH EXE or EXIT functions. This functionality is implemented using a 16-bit pull instruction, PLI, and a 3-cycle NeXT instruction. :D
Attachment: ForthVM-DTC_NEXT.JPG (File comment: Forth VM PLI NXT instructions) [ 209.72 KiB ]

The test code is a simple DTC FORTH-like Word code sequence where the code field address is populated by the 65C02 BRA rel instruction, so it's only two bytes. The CFA is at address $F22C, and IP is pointing to address $F228.
Code:
F222: F428F2                phw #$F228                  ; test DTC EXE
F225: 6B                    pli
F226: 7B                    nxt
F227: DB                    stp
F228: 2CF2                  dw  *+4
F22A: 0000                  brk #0
F22C: 8000                  bra *+2
F22E: A9FF                  lda #$FF
F230: 8D0002                sta $200
;
F233: EA                    nop
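
The first three instructions of the listing can be traced with a small Python model. This is only a sketch: the `Vm` class is hypothetical, the system stack is modeled as a list of 16-bit values, and `nxt` is assumed to behave like a conventional DTC NEXT (fetch W from the cell at IP, advance IP by two, jump to W).

```python
MASK16 = 0xFFFF

class Vm:
    """Hypothetical model of the registers used by the phw / pli / nxt test."""

    def __init__(self):
        self.mem = {}
        self.stack = []                 # system stack, one 16-bit value per entry
        self.ip = self.w = self.pc = 0

    def read16(self, addr):
        """Read a little-endian 16-bit cell."""
        return self.mem[addr & MASK16] | (self.mem[(addr + 1) & MASK16] << 8)

    def phw(self, value):
        """Push a 16-bit immediate onto the system stack."""
        self.stack.append(value & MASK16)

    def pli(self):
        """Pull a 16-bit value from the system stack into IP."""
        self.ip = self.stack.pop()

    def nxt(self):
        """Assumed DTC NEXT: W = cell at IP, IP += 2, jump to W."""
        self.w = self.read16(self.ip)
        self.ip = (self.ip + 2) & MASK16
        self.pc = self.w

vm = Vm()
vm.mem[0xF228], vm.mem[0xF229] = 0x2C, 0xF2    # dw *+4 at $F228 assembles to $F22C

vm.phw(0xF228)
vm.pli()
vm.nxt()
print(hex(vm.ip), hex(vm.w), hex(vm.pc))
```

Under these assumptions execution continues at $F22C, the bra *+2 in the code field, with IP left pointing at $F22A.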

_________________
Michael A.


Posted: Wed Jul 22, 2015 4:02 pm
3-cycle Next, eh? :D And pli is handy for popping a value into IP. I see the code snippet tests pli and nxt.

Quote:
the code field address is populated by the 65C02 BRA rel instruction, so it's only two bytes.
I'm not clear why you began the executable code with the BRA. The following lda and sta seem meaningful as a placeholder to represent hypothetical user code. But why the BRA? Does it assist with testing?

cheers,
Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Wed Jul 22, 2015 5:11 pm
Yeah, when I switched my 65m32 Forth from ITC to DTC, starting my machine code directly in (at?) the CFA was a nice space saver (and time saver) since many of my primitives are only two or three machine instructions. scotws seems to be doing the BRA thing in his 65c02 STC Forth as well ... maybe he could provide some insights?

Although I didn't directly intend it to be that way, it seems that many 6809 coding strategies can be applied to my 65m32 with similar benefit. I still prefer the 65xx over the 68xx, for other reasons.

Mike B.

[Edit: scotws explains his BRA strategy a bit here.]


Posted: Wed Jul 22, 2015 10:25 pm
Dr Jefyll wrote:
I'm not clear why you began the executable code with the BRA. The following lda and sta seem meaningful as a placeholder to represent hypothetical user code. But why the BRA? Does it assist with testing?
As I've said before, I'm a complete noobie when it comes to Forth. In my attempt to incorporate some special support for FORTH, I decided to use the prefix instructions to support both an ITC and a DTC VM. As such, I followed a model where the parameter field of a primitive or secondary requires two bytes. Therefore, in my test case, I am just retaining the code field as a reminder of where the ITC CFA pointer must be inserted. I simply used the bra *+2 as a two-byte placeholder for the code field; I kind of like how the machine code representation of that instruction, $80 $00, looks in the instruction stream. :) When I set up the test for the DTC ENTER instruction, the bra *+2 will be replaced with the single-byte (DTC) ent ($7B) instruction.

Both you and Mike are correct in your assessment that a primitive DTC FORTH word could be implemented with machine code in the code field. The two cycles saved by having no code field in a DTC FORTH word probably should be the preferred implementation.

I hope to complete the testing of the FORTH VM support instructions soon: nxt, pli, ini, phi, ent, and lda (ip,I++). The ITC version of the fundamental operations will require the ind prefix instruction: ENTER => ind ent; NEXT => ind nxt; EXE => pli ind nxt; EXIT => pli ind nxt;
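
The difference between the DTC and ITC forms of NEXT can be sketched in Python as follows. The function names and the addresses are hypothetical; the only assumption is the conventional one that a DTC thread cell holds the address of machine code, while an ITC thread cell holds the address of a code field that in turn holds the machine code address.

```python
MASK16 = 0xFFFF

def read16(mem, addr):
    """Read a little-endian 16-bit cell from a dict-based memory."""
    return mem[addr & MASK16] | (mem[(addr + 1) & MASK16] << 8)

def write16(mem, addr, value):
    """Write a little-endian 16-bit cell."""
    mem[addr & MASK16] = value & 0xFF
    mem[(addr + 1) & MASK16] = (value >> 8) & 0xFF

def next_dtc(mem, ip):
    """DTC NEXT: one fetch. Returns (new IP, W, PC) with PC equal to W."""
    w = read16(mem, ip)
    return (ip + 2) & MASK16, w, w

def next_itc(mem, ip):
    """ITC NEXT: two fetches. PC comes from the code field that W points at."""
    w = read16(mem, ip)
    return (ip + 2) & MASK16, w, read16(mem, w)

mem = {}
write16(mem, 0x3000, 0x4000)    # thread cell holding the CFA of a hypothetical word
write16(mem, 0x4000, 0x5000)    # ITC only: code field holding a machine code address

print([hex(v) for v in next_dtc(mem, 0x3000)])
print([hex(v) for v in next_itc(mem, 0x3000)])
```

Both forms leave IP and W identical; only the final jump target differs, which fits the description of ITC as the DTC operation plus one extra level of indirection selected by a prefix.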

The auxiliary stack usage may add another prefix instruction (osx) to the instruction sequences in either a DTC or an ITC implementation. With the osx prefix, the auxiliary stack, accessed using X, can be used for the FORTH VM's Return Stack (RS), and the system stack can be used as the parameter stack (PS).

_________________
Michael A.


Posted: Wed Jul 22, 2015 11:57 pm
barrym95838 wrote:
scotws explains his BRA strategy a bit here.
Thanks, Mike. And I hope I didn't seem critical, Michael. The truth is I'm rather a noob myself when it comes to Forth other than ITC. This notion with DTC of starting the code with a jump or bra seems like it might serve a purpose -- a substitute form of indirection, perhaps -- but I hesitate to speculate. If you got the idea from something you read, I'd be interested to hear if that source also offers an explanation.

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Thu Jul 23, 2015 12:18 am
Not at all, Jeff. I simply have an OCD-like streak :) that goes for a certain amount of symmetry: a two byte branch and a two byte ENTER sequence just appealed to me. I used the branch instruction to enter/start the simulated primitive because an ENTER into a secondary would require 1 byte if S => RS, or 2 bytes if X => RS. I'd like to imagine that when I got around to optimizing for performance, I would consider eliminating the code field to get back the two cycles of the bra *+2 sequence.

However, being a Forth noob, I might keep the code field because Brad Rodriguez, in his Moving Forth articles, emphasized the importance of having W point to the code field. When he recommended having W pointing to the code field, I don't think he made a distinction between ITC or DTC implementations, or between primitive and secondary words. Since I do not have a full understanding of Forth's fine points, I will try to stay on the path described by Brad and others for the time being.

_________________
Michael A.


Posted: Thu Jul 23, 2015 1:31 am
Bravo, Michael. I can't wait to try it out.

_________________
In theory, there is no difference between theory and practice. In practice, there is. ...Jan van de Snepscheut


Posted: Thu Jul 23, 2015 1:07 pm
I have tested a number of other instructions: INI, INW (IND INI), PLW (IND PLI), and ENT (ENTER). In the attached figure, I've annotated the PLW instruction followed by the ENT instruction. External behavior is as I expect, and it shows that the DTC ENT instruction requires only 5 cycles. :) I only need to verify the ITC mode of the NXT and ENT instructions and the LDA ip,I++ instruction before declaring the ForthVM module tested.
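
For readers less familiar with ENTER (DOCOLON), here is a minimal Python sketch of how ENTER, EXIT, and NEXT interact in a DTC model. The `ForthVm` class and the addresses are hypothetical, and two things are assumed rather than taken from the core: a two-byte code field (`CODE_FIELD_SIZE`), and an ENTER that pushes IP on the return stack, points IP just past the code field, and then performs NEXT.

```python
MASK16 = 0xFFFF
CODE_FIELD_SIZE = 2     # assumption: two-byte code field in front of the thread

class ForthVm:
    """Hypothetical DTC model with IP, W, PC and a return stack."""

    def __init__(self, mem):
        self.mem = mem
        self.rs = []                    # return stack (RS)
        self.ip = self.w = self.pc = 0

    def read16(self, addr):
        """Read a little-endian 16-bit cell."""
        return self.mem[addr & MASK16] | (self.mem[(addr + 1) & MASK16] << 8)

    def nxt(self):
        """NEXT: W = cell at IP, IP += 2, jump to W."""
        self.w = self.read16(self.ip)
        self.ip = (self.ip + 2) & MASK16
        self.pc = self.w

    def enter(self):
        """ENTER: save IP on RS, start interpreting the body that follows the code field."""
        self.rs.append(self.ip)
        self.ip = (self.w + CODE_FIELD_SIZE) & MASK16
        self.nxt()

    def exit(self):
        """EXIT: restore the caller's IP from RS and continue."""
        self.ip = self.rs.pop()
        self.nxt()

def write16(mem, addr, value):
    """Write a little-endian 16-bit cell."""
    mem[addr] = value & 0xFF
    mem[addr + 1] = (value >> 8) & 0xFF

mem = {}
write16(mem, 0x3000, 0x4000)    # caller thread: CFA of a secondary at $4000
write16(mem, 0x3002, 0x6000)    # caller thread: next word after the secondary
write16(mem, 0x4002, 0x5000)    # secondary body: first word, located at $5000

vm = ForthVm(mem)
vm.ip = 0x3000
vm.nxt()        # lands on the secondary's code field
vm.enter()      # nests into the secondary
inside = (vm.ip, vm.pc, list(vm.rs))
vm.exit()       # unnests and resumes the caller
print([hex(v) for v in inside[:2]], hex(vm.ip), hex(vm.pc))
```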
Attachment: ForthVM-DTC_ENTER.JPG (File comment: DTC Forth VM ENTER (DOCOLON)) [ 216.12 KiB ]


Edit: removed indirection parentheses around ip,I++ operand. IND, SIZ, ISZ prefixes are supported for this instruction. MAM, 15K15.

_________________
Michael A.


Last edited by MichaelM on Sun Nov 15, 2015 2:16 pm, edited 1 time in total.

Posted: Fri Jul 24, 2015 1:10 pm
The attached figure shows the LDA ip,I++ Forth VM instruction used to load the accumulator with a 16-bit in-line literal such as used in the LIT and CLIT figFORTH words. The instruction autoincrements IP past the embedded literal/constant so that interpretation can continue seamlessly. Currently, the offset operand is ignored, but it can be readily implemented with a simple microcode change rather than a logic change.
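
A minimal Python sketch of the 16-bit literal fetch described above follows. The function name `lda_ip_postinc` is hypothetical, and the offset operand is ignored, as the post says it currently is.

```python
MASK16 = 0xFFFF

def lda_ip_postinc(mem, ip):
    """Load the 16-bit little-endian literal at IP and step IP past it.

    Returns (accumulator value, new IP).
    """
    lo = mem[ip & MASK16]
    hi = mem[(ip + 1) & MASK16]
    return (hi << 8) | lo, (ip + 2) & MASK16

# Thread fragment: a LIT-style word has just run and IP points at its in-line literal.
mem = {0x3002: 0x34, 0x3003: 0x12}      # literal $1234, stored low byte first
a, ip = lda_ip_postinc(mem, 0x3002)
print(hex(a), hex(ip))
```

After the load, IP already points at the next thread cell, so the word only has to push the accumulator and perform NEXT.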

Following the LDA ip,I++ instruction is a 16-bit comparison instruction. I've implemented a change to the normal behavior of the CMP/CPX/CPY instructions such that 16-bit comparisons affect the V flag. This change enables the implementation of more powerful conditional branch instructions. PDP11-like signed and unsigned conditional branches will be performed when the normal conditional branch instructions are preceded by the SIZ prefix instruction. I've also set it up such that if the conditional branch instructions are preceded by the ISZ prefix instruction, then the conditional branch is relative to IP. I expect these two simple modifications to significantly improve the efficiency of comparisons for FORTH and standard languages.
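
To show why a V flag on 16-bit comparisons makes PDP11-like signed branches possible, here is a small Python sketch. It assumes the compare sets flags as a plain 16-bit subtraction would, with the usual 65xx convention that carry set means no borrow; the helper names are hypothetical and say nothing about which branch opcodes the core maps to each condition.

```python
def cmp16(a, m):
    """Return the flags of the 16-bit subtraction a - m."""
    a &= 0xFFFF
    m &= 0xFFFF
    r = (a - m) & 0xFFFF
    return {
        "N": bool(r & 0x8000),                      # sign of the result
        "V": bool((a ^ m) & (a ^ r) & 0x8000),      # signed overflow of a - m
        "Z": r == 0,
        "C": a >= m,                                # 65xx style: set when no borrow
    }

def signed_lt(f):
    """Signed less-than: N xor V."""
    return f["N"] != f["V"]

def signed_ge(f):
    """Signed greater-or-equal: not (N xor V)."""
    return f["N"] == f["V"]

def unsigned_lo(f):
    """Unsigned lower: a borrow occurred."""
    return not f["C"]

def unsigned_hi(f):
    """Unsigned higher: no borrow and not equal."""
    return f["C"] and not f["Z"]

f = cmp16(0x8000, 0x0001)       # -32768 vs 1 when signed, 32768 vs 1 when unsigned
print(signed_lt(f), unsigned_hi(f))
```

Without V, the N flag alone gives the wrong signed answer whenever the subtraction overflows, as in the $8000 versus $0001 case.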
Attachment: ForthVM-LDA_(ip,I++).JPG (File comment: IP-relative LDA with autoincrement) [ 205.9 KiB ]


Edit: removed indirection parentheses around ip,I++ operand. IND, SIZ, ISZ prefixes are supported for this instruction. MAM, 15K15.

_________________
Michael A.

