Using the Y-reg as the IP ptr

Topics relating to various Forth models on the 6502, 65816, and related microprocessors and microcontrollers.
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Using the Y-reg as the IP ptr

Post by barrym95838 »

Go Rob, go! And please share along the way.

Charlie keeps his PETTIL headers separate from his code in his DTC Forth, but I haven't thoroughly examined the pros and cons of that strategy.
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California

Re: Using the Y-reg as the IP ptr

Post by GARTHWILSON »

Should the LDY $C0DE be LDY #$C0DE? Even if you put this in direct page, it's still no fewer cycles than mine, but if one of the registers were available, you could speed it up by replacing the pair of INC NEXT+1's (14 cycles) with LD_, IN_, IN_, ST_ (12 cycles), where the _ gets replaced with A, X, or Y.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

Re: Using the Y-reg as the IP ptr

Post by barrym95838 »

The $C0DE vs. #$C0DE is the thing that confuses me the most, and I always seem to pick the wrong version first when I'm in the flow. I don't have time now, but I'll brain-simulate it later and figure it out, if no one else beats me to it.

Sometimes that little # or a set of parentheses is the only fundamental difference between ITC and DTC.
IamRob
Posts: 357
Joined: 26 Apr 2020

Re: Using the Y-reg as the IP ptr

Post by IamRob »

GARTHWILSON wrote:
Should the LDY $C0DE be LDY #$C0DE? Even if you put this in direct page, it's still no fewer cycles than mine, but if one of the registers were available, you could speed it up by replacing the pair of INC NEXT+1's (14 cycles) with LD_, IN_, IN_, ST_ (12 cycles), where the _ gets replaced with A, X, or Y.
I didn't even notice the missing "#" sign in Mike's code, but I did have it corrected in mine.

Although the pair of INC NEXT's are 2 cycles shorter, they are also 2 bytes larger. And my Direct Page is filling up fast. I have replaced 7x JMP NEXT with BRA NEXT, and hope to save more by putting more routines into the Direct Page, so for now I need all the space I can get.

Re: Using the Y-reg as the IP ptr

Post by IamRob »

barrym95838 wrote:
Go Rob, go! And please share along the way.

Charlie keeps his PETTIL headers separate from his code in his DTC Forth, but I haven't thoroughly examined the pros and cons of that strategy.
The largest PRO that I am getting right now is that I am able to change a lot of JMPs to BRAs in places where inline headers would otherwise push the code too far apart for a branch.

One of the biggest advantages of keeping the TOS in both the Acc and Y-reg when entering a word definition is so you can go either way and just re-use the one that is not needed. Check out doFETCH and doSWAP.

This is the code going into my Direct Page.

Code: Select all

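doFETCH	LDA $0000,Y	; A = cell at the address in Y (TOS)
	BRA NEXT	; NEXT's TAY copies the new TOS back into Y

doSWAP	LDA $0,X	; A = NOS, which becomes the new TOS
	STY $0,X	; old TOS (still in Y) becomes NOS
	BRA NEXT

3+	INA		; each entry point falls through the INAs below it
2+	INA
1+	INA
	BRA NEXT

3-	DEA		; same fall-through trick for decrementing
2-	DEA
1-	DEA
	BRA NEXT

doCOLON	LDA NEXT+1	; A = current IP (the operand of the LDY at NEXT)
	PHA		; save it on the return stack
	LDA (NEXT+1)	; A = cell the IP points to
	STA NEXT+1	; which becomes the new IP
	TYA		; restore TOS from Y: another advantage of having TOS in both Acc and Y-reg, you don't have to do PHA/PLA
	BRA NEXT

doSEMIS	PLY		; pop the saved IP
	STY NEXT+1

BUMP	INC NEXT+1	; skip one cell, then fall into NEXT
	INC NEXT+1

NEXT	LDY #$0000	; IP pointer (this operand is self-modified)
	STY W+1		; I like to use registers up as quick as possible to show they are immediately available

	INC NEXT+1	; advance IP to the next cell
	INC NEXT+1

	TAY		; leave TOS in both A and Y
W	JMP ($0000)	; operand patched by the STY W+1 above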
Last edited by IamRob on Wed Dec 29, 2021 5:52 am, edited 3 times in total.

Re: Using the Y-reg as the IP ptr

Post by GARTHWILSON »

Quote:
Although the pair of INC NEXT's are 2 cycles shorter,
Each INC DP in 16-bit is 7 cycles, 8 if DP is not page-aligned; and INC DP,X is 8 and 9. That's why I opted to bring it into a register, do a pair of increments on the register, and store it back, which is faster.
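
For anyone tallying this up, here is a rough side-by-side of the two ways to bump the IP. It is only a sketch: it assumes native mode with 16-bit accumulator and index registers and a page-aligned direct page, and IP is a hypothetical direct-page label standing in for NEXT+1. The counts come from the WDC opcode tables, so check them against your own setup.

```
; Variant 1: bump the IP in memory (4 bytes, 14 cycles)
	INC IP		; 2 bytes, 7 cycles (5, plus 2 for a 16-bit read-modify-write)
	INC IP		; 2 bytes, 7 cycles

; Variant 2: bump the IP through a register (6 bytes, 12 cycles)
	LDY IP		; 2 bytes, 4 cycles (3, plus 1 for a 16-bit index register)
	INY		; 1 byte,  2 cycles
	INY		; 1 byte,  2 cycles
	STY IP		; 2 bytes, 4 cycles
```

So the register version is the one that is 2 cycles shorter and 2 bytes larger, and every direct-page access above costs one more cycle if DP is not page-aligned.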

Re: Using the Y-reg as the IP ptr

Post by barrym95838 »

IamRob wrote:

Code: Select all

...
1-	DEA
2-	DEA
3-	DEA
	BRA NEXT

doCOLON	LDY NEXT+1
	PHY
	LDA (NEXT+1)
	STA NEXT+1
	BRA NEXT
...
I think you got your labels backward for the DEAs, and you probably need to stash A somewhere before you use it in doCOLON because it's TOS.
[Gratuitous vapor-ware example from my 65m32a DTC Forth: a is TOS, s is DSP, x is RSP, y is IP, e is PC:

Code: Select all

                 `  169 `--------------------------------------------------------------- enter
                 `  170 enter_: ` ( R: -- a ) enter a new thread -- called with "jsr enter_"
00000040:9102c000`  171     sly ,-x         ` push old IP on R: and pop new IP (return addr
00000041:ae018000`  172     lde ,y+         `    from caller's "jsr enter_" instruction)
                 `  173 `---------------------------------------------------------------- EXIT
00000042:00000000`  174     NOT_IMM 'EXIT'
00000043:04455849`  174 
00000044:54000000`  174 
                 `  175 _exit: ` ( R: a -- ) exit the current high-level thread
00000045:a1028000`  176     ldy ,x+         ` pop IP from R:
00000046:ae018000`  177     lde ,y+         ` NEXT aka jmp (,y+)
                 `  178 `----------------------------------------------------------------- lit
00000047:00000042`  179     NOT_IMM 'lit'
00000048:036c6974`  179 
                 `  180 _lit: ` ( -- x ) push inline literal -- primitive compiled by LITERAL
00000049:ba018000`  181     pda ,y+         ` push in-lined literal on S: and bump IP
0000004a:ae018000`  182     lde ,y+         ` NEXT aka jmp (,y+)
                 `  183 `-------------------------------------------------------------- branch
0000004b:00000042`  184     NOT_IMM 'branch'
0000004c:06627261`  184 
0000004d:6e636800`  184 
                 `  185 _bran: ` ( -- ) add in-line literal to IP
0000004e:d1018000`  186     ady ,y+         ` IP += *(IP++) (in-lined offset)
0000004f:ae018000`  187     lde ,y+         ` NEXT aka jmp (,y+)
Most of my primitives are so short that it seems wasteful to force them all to NEXT (even a single instruction 3-cycle NEXT) but that's DTC for ya ...]

Re: Using the Y-reg as the IP ptr

Post by IamRob »

Thanks. Changed the listing.

Just imagine my excitement though when I found out I can go from this:

Code: Select all

3-   LDA #$FFFD
       JMP PLUS

2-   LDA #$FFFE
       JMP PLUS

1-   LDA #$FFFF
      JMP PLUS
to this :)

Code: Select all

3-    DEA
2-    DEA
1-    DEA
      BRA NEXT
which was just minutes before I posted.

Re: Using the Y-reg as the IP ptr

Post by GARTHWILSON »

IamRob wrote:
I want to stick with FigForth and ITC for now, and hope to hear from others doing DTC and STC.
You'll be interested in this one from the Forth section of my links page:


Here's another:



I've done a ton of optimization in my ITC Forth; but beyond that, it's so easy to mix some assembly in when you need maximum speed, especially with the 65's. I have a simple integrated assembler with normal mnemonic-operand order, not suitable for entire applications, but that's not necessary. It's great for writing primitives, runtimes, ISRs, etc. And since it's Forth, there is automatic macro capability. I was pleasantly surprised to realize that after I wrote it.
leepivonka
Posts: 167
Joined: 15 Apr 2016

Re: Using the Y-reg as the IP ptr

Post by leepivonka »

Go Rob, go!

Comparing some FIG ITC (Fig16/0265sxb) & STC (Fig16/FSub) variations & 65816F STC: viewtopic.php?f=9&t=4336&start=31
I haven't tried FIG DTC yet!
And there is token-threaded code to try for even smaller colon code running even slower.

I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
In 65816F FORTH I sometimes keep TOS and maybe NOS in registers for a short while.

The 65816 datasheet has a short description of the # of cycles for each instruction, & what each cycle does for each addressing mode.
The book "Programming the 65816" has much longer descriptions.

Notes on the Fig16\0265sxb FORTH you started with:

Direct-page space on the 65816 in native mode isn't as constrained as on the 6502.

1: Fig16\0265sxb FORTH sets up a separate direct page for each user (just 1 user by default). The source you started with is set up to run on a 65C265SXB board with the built-in monitor. This monitor uses lots of direct-page memory at $000000, but coexisting there isn't a problem because FORTH sets up its own direct page for each user (1st at $000300). This also allows moving the USER variables into direct-page & using the associated addressing modes on them. To add a 2nd user (which FIG doesn't do yet), FORTH can set up a 2nd direct page. Switching users is then just: store a few CPU registers, reload the CPU's D register, load a few CPU registers. Note that keeping the direct page page-aligned is not required, but will avoid an extra cycle for each direct addressing mode use.

2: Things addressed with direct,X (like the parameter stack) don't need to fit in the 1st 256 bytes of the direct "page". A 16-bit X register in a direct,X address can index to anywhere in zero-bank ( 16-bit D + 16-bit X + 8-bit offset_in_instruction ).
0265sxb & FSub FORTH are set up to do this, but haven't gone over 256 bytes yet, even with FIG's large parameter stack.

Be careful mixing the direct-page address space & absolute address space on the 65816.
Direct-page starts in the zero bank, at the address the CPU D register points to.
Absolute space starts in the bank where the CPU B or K register points.
The 6502 assumption that the direct page (zero page) and absolute bank both start at $000000 doesn't hold in 0265sxb & FSub FORTH since we've moved the direct-page start.
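
To make the difference concrete, here is a small sketch of how the same-looking operands resolve once D has been moved. It assumes native mode is already selected and uses the $0300 direct page mentioned above; the operand values are arbitrary examples, not taken from the FORTH source.

```
	REP #$30	; 16-bit A, X and Y
	LDA #$0300
	TCD		; D = $0300: direct page now starts at $00:0300

	LDA $10		; direct:     reads $00:0310/0311 (always bank 0, D + $10)
	LDA $10,X	; direct,X:   reads bank 0 at D + $10 + X (16-bit sum)
	LDA $1234	; absolute:   reads $1234/1235 in the data bank (B register)
	LDA $1234,X	; absolute,X: reads the data bank at $1234 + X
```

An absolute operand below $0100 needs the assembler's force-absolute syntax to avoid being assembled as direct, and that syntax differs between assemblers.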

Re: Using the Y-reg as the IP ptr

Post by GARTHWILSON »

leepivonka wrote:
I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
I, too, have been anxious to see how that works out, what the net effect will be. Looking over my own code, I see lots of places where TOS in A would have to be stored somewhere to use A for something else, meaning there would be extra overhead offsetting some of the gains of words like 1+ being just an INA.

As for additional direct pages: Note that the '816 doesn't wrap within direct page on DP,X and (DP,X) addressing, meaning that for Forth, you could have the data stack in the following page, leaving more of DP available for other things. For example, instead of LDA 0,X for TOS, it might be LDA $F0,X with X being initialized at $50. If DP is ZP, you'd be initializing at $140; and after a pair of DEX's, the first cell would go at $13E and $13F. The $F0 would give plenty of leeway for accessing some depth for things like ROTations or PICKs before needing more than 8 bits. NOS would be $F2,X, 3OS would be $F4,X, etc. Then, if you don't have DP very full, maybe you could put I/O there for added efficiency.
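
A minimal sketch of that arrangement, assuming native mode, 16-bit A and X, D = $0000, and TOS kept in memory. STK is a hypothetical name for the 8-bit offset, and the constant-definition syntax will vary by assembler.

```
STK	= $F0		; hypothetical: 8-bit offset used in every stack access

	LDX #$0050	; empty stack: STK,X would address $0140
	DEX
	DEX		; X = $004E, room for one cell
	STA STK,X	; first cell lands at $013E/$013F, with no wrap back into page 0

	; addressing forms for deeper cells once more items are on the stack
	LDA STK+2,X	; NOS is $F2,X
	LDA STK+4,X	; 3OS is $F4,X
```

With these numbers the offset byte stays within 8 bits down to the 8th cell ($FE,X), and everything below $F0 in the direct page is left free for other uses.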

Re: Using the Y-reg as the IP ptr

Post by IamRob »

leepivonka wrote:
I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
In 65816F FORTH I sometimes keep TOS and maybe NOS in registers for a short while.
Again Thanks for your code, Leepivonka. It has been a tremendous help. I wouldn't have gotten as far without it.

Because of having the benefit of looking at your disassembly, I have been able to convert NULL, EXPECT, QUERY and a couple of others to primitives. And they are all smaller than their Forth counterpart. Which should also mean a good speed gain. I believe at this moment there are no Forth words that I cannot convert, although some primitives will be larger than their Forth counterpart.

Now this is where it gets REAL, really fast.
Keeping TOS in the Acc, as you said, has some things working really well. But what the Accumulator can't do, I find the Y-reg does really well.

Keep TOS in both the Accumulator and Y-reg on exit from the NEXT routine.

Doing this opens up a whole lot of opportunities to make code very small. Currently I am halfway done at about 120 words, and have only had to do a PHA/PLA three times, due to needing both the Acc and Y-reg at the same time.

But the majority of the routines only need one or the other, so either one can be re-used immediately and still preserve TOS in the other. This saves a ton of register saves and restores.
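
As an illustration of the idea, here is a hypothetical store primitive written in the same style as doFETCH and doSWAP. It is only a sketch and is not taken from the actual listing: it assumes 16-bit registers, NOS at $0,X, the third item at $2,X, a stack that grows downward, and a NEXT that ends with TAY.

```
doSTORE	LDA $0,X	; A = value (NOS); Y still holds the address (TOS)
	STA $0000,Y	; store the value at that address
	LDA $2,X	; A = new TOS (the old third item)
	INX
	INX
	INX
	INX		; drop the two cells that were consumed
	BRA NEXT	; NEXT's TAY copies the new TOS into Y
```

Because the address stays in Y, the accumulator can be reloaded right away without saving anything.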

Re: Using the Y-reg as the IP ptr

Post by IamRob »

GARTHWILSON wrote:
Looking over my own code, I see lots of places where TOS in A would have to be stored somewhere to use A for something else, meaning there would be extra overhead offsetting some of the gains of words like 1+ being just an INA.
Keep TOS also in the Y-reg. It is a lot faster and smaller to do a TYA before returning to NEXT, rather than doing a PHA/PLA.
GARTHWILSON wrote:
As for additional direct pages: Note that the '816 doesn't wrap within direct page on DP,X and (DP,X) addressing, meaning that for Forth, you could have the data stack in the following page, leaving more of DP available for other things. For example, instead of LDA 0,X for TOS, it might be LDA $F0,X with X being initialized at $50. If DP is ZP, you'd be initializing at $140; and after a pair of DEX's, the first cell would go at $13E and $13F. The $F0 would give plenty of leeway for accessing some depth for things like ROTations or PICKs before needing more than 8 bits. NOS would be $F2,X, 3OS would be $F4,X, etc. Then, if you don't have DP very full, maybe you could put I/O there for added efficiency.
Are you SH-Ting me? You just blew my mind once again with this paragraph.
I was just running out of Direct Page space, but being able to access any memory above the Direct Page, as if it is still in Direct Page, opens up a whole new can of worms.

Give me enough time and I think I can run the entire Forth system from the Direct Page. Wow!!!!!!!!!!!!

Re: Using the Y-reg as the IP ptr

Post by IamRob »

Everyone's knowledge of the 65816 is really coming across in these posts. And I have a feeling the surface is barely scratched.

Can anyone answer this:

Do any of the X-reg JSR/JMPs take their address from the Direct Page, or are they all absolute?

For example:

If the Direct Page was set to $2000, with the X-reg being zero (0), would a JSR ($0000,X) or JMP ($0000,X) take the address at $0000.0001 or $2000.2001?

I am assuming that a normal JSR/JMP are absolute as well as JMP (????) where ???? is a zero-page $00.FF address?
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: Using the Y-reg as the IP ptr

Post by BigEd »

All absolute: whether something uses Direct Page is a function of the addressing mode, not the address itself.
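
Spelled out for the example in the question, with D = $2000 and X = 0. This is a comparison list rather than a runnable sequence, and the bank notes reflect my reading of the WDC datasheet, so double-check them if your code runs outside bank 0.

```
	JMP ($0000)	; absolute indirect: vector read from $00:0000/0001
	JMP ($0000,X)	; absolute indexed indirect: vector read from $0000+X in the program bank
	JSR ($0000,X)	; same vector fetch as JMP ($0000,X)
	LDA ($00)	; direct indirect, for contrast: pointer read from $00:2000/2001
```

None of the jump forms involve D, so the vector comes from $0000.0001. Only operands assembled as direct-page modes, like the LDA ($00), are offset by D.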