Using the Y-reg as the IP ptr
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: Using the Y-reg as the IP ptr
Go Rob, go! And please share along the way.
Charlie keeps his PETTIL headers separate from his code in his DTC Forth, but I haven't thoroughly examined the pros and cons of that strategy.
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Mike B. (about me) (learning how to github)
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Using the Y-reg as the IP ptr
Should the LDY $C0DE be LDY #$C0DE? Even if you put this in direct page, it's still no fewer cycles than mine, but if one of the registers were available, you could speed it up by replacing the pair of INC NEXT+1's (14 cycles) with LD_, IN_, IN_, ST_ (12 cycles), where the _ gets replaced with A, X, or Y.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: Using the Y-reg as the IP ptr
The $C0DE vs. #$C0DE is the thing that confuses me the most, and I always seem to pick the wrong version first when I'm in the flow. I don't have time now, but I'll brain-simulate it later and figure it out, if no one else beats me to it.
Sometimes that little # or a set of parentheses is the only fundamental difference between ITC and DTC.
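The one-level-of-indirection difference can be pictured with a small model. This is only a sketch of the textbook definitions, not anyone's actual NEXT: memory is a hypothetical dictionary of 16-bit cells, and the addresses are made up for illustration.

```python
def itc_target(mem, ip):
    """Indirect-threaded: the cell at IP holds a code-field address (W),
    and the code field in turn holds the address of the machine code."""
    w = mem[ip]
    return mem[w]

def dtc_target(mem, ip):
    """Direct-threaded: the cell at IP holds the address where
    machine code starts, so only one fetch is needed."""
    return mem[ip]

# Hypothetical memory: the thread cell at $1000 points at $2000,
# and the cell at $2000 points at $3000.
mem = {0x1000: 0x2000, 0x2000: 0x3000}

itc = itc_target(mem, 0x1000)   # follows two pointers
dtc = dtc_target(mem, 0x1000)   # follows one pointer
print(f"ITC jumps to ${itc:04X}, DTC jumps to ${dtc:04X}")
```

In assembly terms, the extra fetch in `itc_target` is what the missing `#` (or the added parentheses) buys.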
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Mike B. (about me) (learning how to github)
Re: Using the Y-reg as the IP ptr
GARTHWILSON wrote:
Should the LDY $C0DE be LDY #$C0DE? Even if you put this in direct page, it's still no fewer cycles than mine, but if one of the registers were available, you could speed it up by replacing the pair of INC NEXT+1's (14 cycles) with LD_, IN_, IN_, ST_ (12 cycles), where the _ gets replaced with A, X, or Y.
Although the pair of INC NEXT+1's is 2 cycles slower, it is also 2 bytes smaller. And my Direct Page is filling up fast. I have replaced 7x JMP NEXT with BRA NEXT, and hope to save more by putting more routines into the Direct Page, so for now I need all the space I can get.
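For reference, the speed-versus-size trade-off can be tallied from the 65816 instruction timings. This is a sketch under stated assumptions: NEXT+1 lives in direct page, the registers are 16 bits wide (m=0), and the D register is page-aligned so there is no extra direct-page cycle. The table and the `tally` helper are illustrative names, not part of anyone's Forth.

```python
# (cycles, bytes) per instruction, assuming 16-bit registers and a
# page-aligned direct page (low byte of D is zero).
COST = {
    "INC dp": (7, 2),   # 5 cycles + 2 for a 16-bit read-modify-write
    "LDA dp": (4, 2),   # 3 cycles + 1 for the 16-bit load
    "STA dp": (4, 2),   # 3 cycles + 1 for the 16-bit store
    "INA":    (2, 1),   # implied addressing, register only
}

def tally(sequence):
    """Return (total_cycles, total_bytes) for a list of instruction names."""
    cycles = sum(COST[op][0] for op in sequence)
    size = sum(COST[op][1] for op in sequence)
    return cycles, size

inc_pair = tally(["INC dp", "INC dp"])
load_bump_store = tally(["LDA dp", "INA", "INA", "STA dp"])

print("INC NEXT+1 twice (cycles, bytes):", inc_pair)
print("LDA/INA/INA/STA  (cycles, bytes):", load_bump_store)
```

The same totals apply if X or Y is used instead of A, since LDX/LDY, INX/INY, and STX/STY have matching direct-page timings under these assumptions.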
Re: Using the Y-reg as the IP ptr
barrym95838 wrote:
Go Rob, go! And please share along the way.
Charlie keeps his PETTIL headers separate from his code in his DTC Forth, but I haven't thoroughly examined the pros and cons of that strategy.
One of the biggest advantages of keeping the TOS in both the Acc and Y-reg when entering a word definition is so you can go either way and just re-use the one that is not needed. Check out doFETCH and doSWAP.
This is the code going into my Direct Page.
Code:
doFETCH LDA $0000,Y
BRA NEXT
doSWAP LDA $0,X
STY $0,X
BRA NEXT
3+ INA
2+ INA
1+ INA
BRA NEXT
3- DEA
2- DEA
1- DEA
BRA NEXT
doCOLON LDA NEXT+1
PHA
LDA (NEXT+1)
STA NEXT+1
TYA ; this is another advantage of having TOS in both Acc and Y-reg. You don't have to do PHA/PLA
BRA NEXT
doSEMIS PLY
STY NEXT+1
BUMP INC NEXT+1
INC NEXT+1
NEXT LDY #0000 ; IP pointer
STY W+1 ; I like to use registers up as quick as possible to show they are immediately available
INC NEXT+1
INC NEXT+1
TAY
W JMP ($0000)
Last edited by IamRob on Wed Dec 29, 2021 5:52 am, edited 3 times in total.
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Using the Y-reg as the IP ptr
Quote:
Although the pair of INC NEXT's are 2 cycles shorter,
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: Using the Y-reg as the IP ptr
IamRob wrote:
Code:
...
1- DEA
2- DEA
3- DEA
BRA NEXT
doCOLON LDY NEXT+1
PHY
LDA (NEXT+1)
STA NEXT+1
BRA NEXT
...
Gratuitous vapor-ware example from my 65m32a DTC Forth: a is TOS, s is DSP, x is RSP, y is IP, e is PC:
Code:
` 169 `--------------------------------------------------------------- enter
` 170 enter_: ` ( R: -- a ) enter a new thread -- called with "jsr enter_"
00000040:9102c000` 171 sly ,-x ` push old IP on R: and pop new IP (return addr
00000041:ae018000` 172 lde ,y+ ` from caller's "jsr enter_" instruction)
` 173 `---------------------------------------------------------------- EXIT
00000042:00000000` 174 NOT_IMM 'EXIT'
00000043:04455849` 174
00000044:54000000` 174
` 175 _exit: ` ( R: a -- ) exit the current high-level thread
00000045:a1028000` 176 ldy ,x+ ` pop IP from R:
00000046:ae018000` 177 lde ,y+ ` NEXT aka jmp (,y+)
` 178 `----------------------------------------------------------------- lit
00000047:00000042` 179 NOT_IMM 'lit'
00000048:036c6974` 179
` 180 _lit: ` ( -- x ) push inline literal -- primitive compiled by LITERAL
00000049:ba018000` 181 pda ,y+ ` push in-lined literal on S: and bump IP
0000004a:ae018000` 182 lde ,y+ ` NEXT aka jmp (,y+)
` 183 `-------------------------------------------------------------- branch
0000004b:00000042` 184 NOT_IMM 'branch'
0000004c:06627261` 184
0000004d:6e636800` 184
` 185 _bran: ` ( -- ) add in-line literal to IP
0000004e:d1018000` 186 ady ,y+ ` IP += *(IP++) (in-lined offset)
0000004f:ae018000` 187 lde ,y+ ` NEXT aka jmp (,y+)
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!
Mike B. (about me) (learning how to github)
Re: Using the Y-reg as the IP ptr
Thanks. Changed the listing.
Just imagine my excitement though when I found out I can go from this:
Code:
3- LDA #$FFFD
JMP PLUS
2- LDA #$FFFE
JMP PLUS
1- LDA #$FFFF
JMP PLUS
to this
Code:
3- DEA
2- DEA
1- DEA
BRA NEXT
which was just minutes before I posted.
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Using the Y-reg as the IP ptr
IamRob wrote:
I want to stick with FigForth and ITC for now, and hope to hear from others doing DTC and STC.
- Bruce Clark explains how the faster-running STC Forth avoids the expected memory penalties. He gives 9 reasons, starting in the middle of his long post in the middle of the page. STC of course eliminates the need for NEXT, nest, and unnest, thus improving speed.
Here's another:
- KimKlone 65c02 with pointer-arithmetic-friendly extended address space and 9-cycle ITC Forth NEXT, by Jeff Laughton. It gives 6 new registers and 44 new instructions.
I've done a ton of optimization in my ITC Forth; but beyond that, it's so easy to mix some assembly in when you need maximum speed, especially with the 65's. I have a simple integrated assembler with normal mnemonic-operand order, not suitable for entire applications, but that's not necessary. It's great for writing primitives, runtimes, ISRs, etc. And since it's Forth, there is automatic macro capability. I was pleasantly surprised to realize that after I wrote it.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
- leepivonka
- Posts: 167
- Joined: 15 Apr 2016
Re: Using the Y-reg as the IP ptr
Go Rob, go!
Comparing some FIG ITC (Fig16/0265sxb) & STC (Fig16/FSub) variations & 65816F STC: viewtopic.php?f=9&t=4336&start=31
I haven't tried FIG DTC yet!
And there is token-threaded code to try for even smaller colon code running even slower.
I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
In 65816F FORTH I sometimes keep TOS and maybe NOS in registers for a short while.
The 65816 datasheet has a short description of the number of cycles for each instruction, & what each cycle does for each addressing mode.
The book "Programming the 65816" has much longer descriptions.
Notes on the Fig16\0265sxb FORTH you started with:
Direct-page space on the 65816 in native mode isn't as constrained as on the 6502.
1: Fig16\0265sxb FORTH sets up a separate direct page for each user (just 1 user by default). The source you started with is set up to run on a 65C265SXB board with the built-in monitor. This monitor uses lots of direct-page memory at $000000, but coexisting there isn't a problem because FORTH sets up its own direct page for each user (1st at $000300). This also allows moving the USER variables into direct-page & using the associated addressing modes on them. To add a 2nd user (which FIG doesn't do yet), FORTH can set up a 2nd direct page. Switching users is then just: store a few CPU registers, reload the CPU's D register, load a few CPU registers. Note that keeping the direct page page-aligned is not required, but will avoid an extra cycle for each direct addressing mode use.
2: Things addressed with direct,X (like the parameter stack) don't need to fit in the 1st 256 bytes of the direct "page". A 16-bit X register in a direct,X address can index to anywhere in zero-bank ( 16-bit D + 16-bit X + 8-bit offset_in_instruction ).
0265sxb & FSub FORTH are set up to do this, but haven't gone over 256 bytes yet, even with FIG's large parameter stack.
Be careful mixing the direct-page address space & absolute address space on the 65816.
Direct-page starts in the zero bank where the CPU D register points.
Absolute space starts in the bank where the CPU B or K register points.
The 6502 assumption that the direct page (zero page) and absolute bank both start at $000000 doesn't hold in 0265sxb & FSub FORTH since we've moved the direct-page start.
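To make the warning concrete, here is a minimal sketch of how the same small operand lands at different addresses depending on the addressing mode. It assumes native mode, a direct page at $0300 (the first user's direct page mentioned above), and a data bank of zero; the helper names are made up for illustration.

```python
def direct_address(d, offset):
    """Direct addressing: always bank 0, at D + 8-bit offset."""
    return (0x00, (d + offset) & 0xFFFF)

def absolute_address(b, addr):
    """Absolute data addressing: 16-bit address in the data bank B."""
    return (b, addr & 0xFFFF)

D = 0x0300   # direct page base, as in the first-user setup described above
B = 0x00     # assumed data bank

dp_hit = direct_address(D, 0x10)       # e.g. LDA $10
abs_hit = absolute_address(B, 0x0010)  # e.g. LDA $0010

print(f"LDA $10   reads {dp_hit[0]:02X}:{dp_hit[1]:04X}")
print(f"LDA $0010 reads {abs_hit[0]:02X}:{abs_hit[1]:04X}")
```

On a 6502 the two forms would reach the same byte; with D moved they no longer do.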
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Using the Y-reg as the IP ptr
leepivonka wrote:
I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
As for additional direct pages: Note that the '816 doesn't wrap within direct page on DP,X and (DP,X) addressing, meaning that for Forth, you could have the data stack in the following page, leaving more of DP available for other things. For example, instead of LDA 0,X for TOS, it might be LDA $F0,X with X being initialized at $50. If DP is ZP, you'd be initializing at $140; and after a pair of DEX's, the first cell would go at $13E and $13F. The $F0 would give plenty of leeway for accessing some depth for things like ROTations or PICKs before needing more than 8 bits. NOS would be $F2,X, 3OS would be $F4,X, etc. Then, if you don't have DP very full, maybe you could put I/O there for added efficiency.
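The arithmetic in that example can be checked with a short model. It is a sketch assuming native mode with a 16-bit X register, where the effective address is D + offset + X wrapped only at the 64 KB bank-0 boundary; `direct_x_address` is a hypothetical helper name.

```python
def direct_x_address(d, offset, x):
    """Effective bank-0 address of a dp,X operand in 65816 native mode.

    d      : 16-bit Direct register
    offset : 8-bit operand byte from the instruction
    x      : 16-bit X register
    The sum wraps at 64 KB, not at the 256-byte page boundary.
    """
    return (d + offset + x) & 0xFFFF

D = 0x0000                            # direct page at zero page
X_EMPTY = 0x0050                      # initial X for an empty stack
x_after_push = (X_EMPTY - 2) & 0xFFFF # X after DEX, DEX

empty_base = direct_x_address(D, 0xF0, X_EMPTY)
tos = direct_x_address(D, 0xF0, x_after_push)
nos = direct_x_address(D, 0xF2, x_after_push)
third = direct_x_address(D, 0xF4, x_after_push)

print(f"empty-stack base ${empty_base:04X}")
print(f"TOS ${tos:04X}  NOS ${nos:04X}  3OS ${third:04X}")
```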
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: Using the Y-reg as the IP ptr
leepivonka wrote:
I'm curious how you're doing with keeping TOS in A.
I messed with this a little - seems to me like it mostly reordered operations in a less intuitive way, a few things worked really well, & some got stuck doing more register save & restores.
In 65816F FORTH I sometimes keep TOS and maybe NOS in registers for a short while.
Because of having the benefit of looking at your disassembly, I have been able to convert NULL, EXPECT, QUERY and a couple of others to primitives. And they are all smaller than their Forth counterparts, which should also mean a good speed gain. I believe at this moment there are no Forth words that I cannot convert, although some primitives will be larger than their Forth counterparts.
Now this is where it gets REAL, really fast.
Keeping TOS in the Acc, as you said, has some things working really well. But what the Accumulator can't do, I find the Y-reg does really well.
Keep TOS in both the Accumulator and Y-reg on exit from the NEXT routine.
Doing this opens up a whole lot of opportunities to make code very small. Currently I am halfway done at about 120 words, and have only had to do a PHA/PLA 3 times, due to needing both the Acc and Y-reg at the same time.
But the majority of the routines only need one or the other, so either one can be re-used immediately and still preserve TOS in the other. This saves a ton of register saves and restores.
Re: Using the Y-reg as the IP ptr
GARTHWILSON wrote:
Looking over my own code, I see lots of places where TOS in A would have to be stored somewhere to use A for something else, meaning there would be extra overhead offsetting some of the gains of words like 1+ being just an INA.
GARTHWILSON wrote:
As for additional direct pages: Note that the '816 doesn't wrap within direct page on DP,X and (DP,X) addressing, meaning that for Forth, you could have the data stack in the following page, leaving more of DP available for other things. For example, instead of LDA 0,X for TOS, it might be LDA $F0,X with X being initialized at $50. If DP is ZP, you'd be initializing at $140; and after a pair of DEX's, the first cell would go at $13E and $13F. The $F0 would give plenty of leeway for accessing some depth for things like ROTations or PICKs before needing more than 8 bits. NOS would be $F2,X, 3OS would be $F4,X, etc. Then, if you don't have DP very full, maybe you could put I/O there for added efficiency.
I was just running out of Direct Page space, but being able to access any memory above the Direct Page, as if it is still in Direct Page, opens up a whole new can of worms.
Give me enough time and I think I can run the entire Forth system from the Direct Page. Wow!!!!!!!!!!!!
Re: Using the Y-reg as the IP ptr
Everyone's knowledge of the 65816 is really coming across in these posts. And I have a feeling the surface is barely scratched.
Can anyone answer this:
Do any of the X-reg JSR/JMP's take their address from the Direct Page or are they all absolute?
For example:
if the Direct Page was set to $2000, with X-reg being zero (0), would a JSR (0000,X) or JMP (0000,X) take the address at $0.1 or $2000.2001?
I am assuming that a normal JSR/JMP is absolute, as well as JMP (????) where ???? is a zero-page $00.FF address?
Re: Using the Y-reg as the IP ptr
All absolute: whether something uses Direct Page is a function of the addressing mode, not the address itself.
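Applied to the example in the question, a small model shows where the jump pointer is read from. This is a sketch based on the 65816's documented behaviour (absolute indirect reads its pointer from bank 0; absolute indexed indirect reads it from the program bank K); `pointer_location` is a hypothetical helper, and D is passed in only to show that it plays no part.

```python
def pointer_location(mode, operand, x=0, k=0x00, d=0x0000):
    """Return (bank, address) the 65816 reads the 16-bit jump pointer from.

    The Direct register d is accepted but deliberately unused: none of
    these modes are direct-page modes.
    """
    if mode == "JMP (abs)":
        return (0x00, operand & 0xFFFF)        # pointer always in bank 0
    if mode in ("JMP (abs,X)", "JSR (abs,X)"):
        return (k, (operand + x) & 0xFFFF)     # pointer in program bank K
    raise ValueError(f"unsupported mode: {mode}")

# The case from the question: D = $2000, X = 0, operand $0000,
# with the program bank assumed to be 0.
jmp_x = pointer_location("JMP (abs,X)", 0x0000, x=0, k=0x00, d=0x2000)
jsr_x = pointer_location("JSR (abs,X)", 0x0000, x=0, k=0x00, d=0x2000)
jmp_ind = pointer_location("JMP (abs)", 0x0000, d=0x2000)

for name, (bank, addr) in [("JMP ($0000,X)", jmp_x),
                           ("JSR ($0000,X)", jsr_x),
                           ("JMP ($0000)", jmp_ind)]:
    print(f"{name:14} pointer at {bank:02X}:{addr:04X}")
```

So under these assumptions the pointer comes from $0000.0001, not $2000.2001.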