M65C02A Core
Re: M65C02A Core
Thanks - I see what you mean now. One of my very few bookmarks is the opcode table at
http://www.llx.com/~nparker/a2/opcodes.html
and I see your listings are in effect going down the columns.
Cheers
Ed
Re: M65C02A Core
Just posted an update to GitHub for this core. I created a complete microcomputer targeting a Xilinx Spartan 3A XC3S200A-4VQG100I FPGA, which I have used to build two development boards.
The microcomputer implementation provided utilizes all of the Block RAMs to provide 28kB of internal RAM allocated in three blocks: (1) 16kB RAM from 0x0000-0x3FFF; (2) 8kB ROM/RAM from 0xD000-0xEFFF; and (3) 3968 Bytes ROM/RAM from 0xF000-0xFEFF plus 32 Bytes from 0xFFE0-0xFFFF. (The I/O page is taken out of the topmost 128 bytes of a 4kB ROM/RAM, but the top-most 32 bytes represent an expanded interrupt/trap vector table which is mapped back into Block RAM instead of being implemented in LUTs.) The remaining 36864 bits of Block RAM (2 BRAMs) are used for a 512x72 microprogram memory array.
A rudimentary interrupt/trap vector controller is included. The normal vectors for NMI, RST, and IRQ/BRK are supported, along with 13 additional vectors: 8 maskable interrupt vectors to support the internal peripherals, and 5 trap vectors (ABRT, INV, SYS, COP, and BRK). ABRT is intended to support MMU access controls in a future upgrade to the rudimentary MMU included in the released microcomputer implementation. INV is intended to allow the trapping of invalid opcodes, but it is not connected in the current release. SYS and COP are intended to support specialized instructions in a future release of the M65C02A microprogram; these traps could be used to emulate other instructions.
The peripherals provided in the implementation are 1 SPI Master (with support for at least two Slave Selects) and 2 UARTs. The peripherals are buffered by 16-deep Tx/Rx FIFOs. The FIFOs are parameterized, so it is easy to increase the depth of any of the FIFOs as needed.
In the targeted FPGA, the large number of internal busses reduces the maximum operating speed to 30 MHz. The same M65C02A-based microcomputer targeted to a Spartan-6 XC6SLX9-3FTG256I FPGA will operate in excess of 40 MHz.
I will now focus on getting this microcomputer to run on my Arduino UNO-compatible Chameleon Board. As provided, the application uses only 53% of the logic resources of the XC3S200A-4VQG100I FPGA. This will now allow me to implement another serial port and a slave SPI port in order to make the Chameleon an intelligent slave device to Arduino-based systems.
Michael A.
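The Block RAM and vector-table figures in the post above can be cross-checked with a little arithmetic. This is only a sketch: it assumes each Spartan-3A block RAM holds 18,432 bits, of which 2 KiB is usable as byte-wide data memory, and all variable names are made up for the illustration.

```python
# Block RAM budget for the microcomputer described above.
# Assumption: each block RAM holds 18,432 bits (parity included) and
# provides 2 KiB when used as byte-wide data memory.
BRAM_BITS = 18 * 1024
BRAM_DATA_BYTES = 2 * 1024

ram_bytes = (16 + 8 + 4) * 1024             # the three RAM / ROM regions
ram_blocks = ram_bytes // BRAM_DATA_BYTES   # blocks used as program/data memory

microprogram_bits = 512 * 72                # 512 words x 72 bits
microprogram_blocks = microprogram_bits // BRAM_BITS

vectors = 3 + 8 + 5                         # NMI/RST/IRQ + maskable + traps
vector_table_bytes = vectors * 2            # one 16-bit address per vector

print(ram_blocks, microprogram_blocks, vector_table_bytes)
```

Under those assumptions the 16 vectors occupy exactly the 32 bytes from 0xFFE0 to 0xFFFF, and the 512x72 microprogram store accounts for the 36864 bits mentioned in the post.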
Re: M65C02A Core
Since posting an update to the M65C02A core yesterday, I've been working to add in the following instructions:
RMBx/SMBx zp
BBRx/BBSx zp,rel
PEA abs
PEI zp
PER rel16
REP/SEP #imm
ORA/AND/EOR/ADC/LDA/STA/CMP/SBC sp,S
ORA/AND/EOR/ADC/LDA/STA/CMP/SBC (sp,S),Y
COP #imm
COP (zp)
Furthermore, I decided that I would round out the JSR and JMP instructions with some of the missing addressing modes:
JMP (zp)
JSR (zp)
JSR (abs,X)
To support the MMU and other 16-bit IO page operations, I need two opcodes for moving 16-bit values in an indivisible manner from zero page to the IO page and vice-versa:
MWT zp,(Y)
MWF zp,(Y)
As part of these efforts, I realized that it would be easy to support 16-bit relative addressing. So I've made some relatively minor changes to the operand register data paths and to the relative address conditional multiplexer in the address generator. The M65C02A core can now support both 8-bit and 16-bit relative addressing. (I've posted an update to GitHub that includes this modification.) Thus, I will implement the following two instructions to use the new 16-bit relative addressing mode just added to the core:
JMP rel16
JSR rel16
With these five additional instructions, there are now only 14 unused opcodes. I expect that this should be sufficient to implement a good set of DTC/ITC primitives to support FORTH or another threaded code compiler.
I would be interested in any feedback on my plans for this core. Any suggestions for the remaining 14 opcodes, as they might apply to a threaded code interpreter, would be welcome.
My first vacation in two years away from home is coming to an end this weekend, so progress on this project will likely slow down again to just the weekends. Football season will soon be here, and I've got season tickets once again. Taking time off from work to go to the games last season did make the pressure at work much more tolerable.
Edit: Added missing words and a blank line after each instruction list.
Last edited by MichaelM on Sun Aug 03, 2014 4:52 pm, edited 1 time in total.
Michael A.
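For readers following along, here is a minimal sketch of how a target address could be computed for 8-bit and 16-bit relative addressing. It assumes the offset is signed, stored low byte first, and measured from the address of the following instruction, as with the standard 6502 branches; the post does not state these details for the M65C02A, and `branch_target` is a made-up name.

```python
def branch_target(pc_next: int, operand: bytes) -> int:
    """Return the target of a PC-relative transfer.

    pc_next -- address of the instruction following the branch (assumed base)
    operand -- 1 byte (8-bit offset) or 2 bytes (16-bit offset, low byte first)
    """
    if len(operand) not in (1, 2):
        raise ValueError("operand must be 1 or 2 bytes")
    # Sign-extend the offset, add it to the base, and wrap to 16 bits.
    offset = int.from_bytes(operand, "little", signed=True)
    return (pc_next + offset) & 0xFFFF


# An 8-bit offset of -2 from 0x1002 loops back to 0x1000.
print(hex(branch_target(0x1002, b"\xFE")))
# A 16-bit offset of +0x1000 from 0x1003 reaches 0x2003.
print(hex(branch_target(0x1003, bytes([0x00, 0x10]))))
```

The same function covers both widths because the only difference is how many operand bytes are sign-extended before the addition.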
Re: M65C02A Core
Quote:
I would be interested in any feedback on my plans for this core.
Offhand my only suggestion is that perhaps you haven't taken full advantage of the Prefix idea. IOW I think you should consider defining several prefixes rather than just one. Here are some possible effects different prefixes could have on the following instruction:
- Let the role of A be assumed by X instead. This would allow powerful maneuvers for adjusting -- even scaling! -- X. (See below.)
- Let the role of A be assumed by Y instead. Ditto to above.
- Let the role of X be assumed by S instead (and let Zero-page be the stack page instead). This is a different way to achieve the new stack addressing modes. It requires no new opcodes; instead you'd just use legacy (Z-pg,X) or Z-pg,X modes but with a prefix.
- Let the role of X be assumed by Y, and let the role of Y be assumed by X. (One prefix is sufficient for both.) This would give, for example, (ind,Y) and (ind),X modes -- not to mention instructions such as JMP (ind,Y).
By way of a post-script, maybe there are a few new instructions worth introducing -- for example, add without carry. ADD would be equivalent to the sequence CLC ADC. Of course that would be handy as the first part of a multi-precision addition. What's less obvious is its use with a prefix (as noted above) to adjust X or Y. Example: instead of INX INX INX INX you'd have PFXn ADD #4.
Quote:
To support the MMU and other 16-bit IO page operations, I need two opcodes for moving 16-bit values in an indivisible manner from zero page to the IO page and vice-versa.
One way to achieve 16-bit moves would be with a "double-length accumulator" prefix. Even assuming you limit the prefix to LDA and STA operations, it'd still be very worthwhile. To be clear, PFXn LDA followed by PFXn STA would move 16 bits -- and any of the LDA/STA address modes could be used. So it could handle your MMU issue and much more.
cheers,
Jeff
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
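The prefix idea in the post above can be modelled in a few lines. This is a purely hypothetical sketch, not a description of the M65C02A or the KimKlone: the `Cpu` class, its method names, and the simplified carry handling are all assumptions made for illustration.

```python
# Hypothetical model: a prefix redirects the accumulator role to X or Y
# for the single instruction that follows it.
class Cpu:
    def __init__(self):
        self.reg = {"A": 0, "X": 0, "Y": 0}
        self.carry = 0
        self.acc_alias = "A"      # register currently playing the role of A

    def prefix(self, name):
        """PFXn: let register `name` stand in for A in the next instruction."""
        self.acc_alias = name

    def add_imm(self, value):
        """ADD #imm: add without carry-in (equivalent to CLC then ADC)."""
        total = self.reg[self.acc_alias] + value
        self.reg[self.acc_alias] = total & 0xFF
        self.carry = total >> 8
        self.acc_alias = "A"      # a prefix affects one instruction only


cpu = Cpu()
cpu.reg["X"] = 0x10
cpu.prefix("X")
cpu.add_imm(4)                    # PFXn ADD #4 in place of INX INX INX INX
print(hex(cpu.reg["X"]), cpu.reg["A"])
```

One difference from the INX sequence is worth keeping in mind: in this model ADD updates the carry flag, which INX does not.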
Re: M65C02A Core
Jeff:
Thanks for the feedback.
You are probably right to say that I've not explored the potential of using a prefix instruction as fully as I should. However, the potential explosion in the number of instructions that have to be tested is somewhat scary. Furthermore, since I have changed many of the microprogram control fields from encoded to one-hot, it's not obvious at the moment how I might implement some of the register aliasing that you suggest. The one-hot control fields have reduced the combinatorial path lengths, and adding logic to induce register aliasing is likely to result in longer combinatorial control paths, defeating some of the performance gains I managed to include in the latest release of the core. Thus, after I've completed the basic set of new instructions given above, I will take a longer look at how to implement some of your register aliasing suggestions.
On the other hand, I would like to hear from you and others on some suggested instructions that can be used to support a DTC/ITC FORTH VM like those that you implemented for your KIM clone.
Michael A.
- teamtempest
- Posts: 443
- Joined: 08 Nov 2009
- Location: Minnesota
Re: M65C02A Core
Quote:
RMBx/SMBx zp
BBRx/BBSx zp,rel
PEA abs
PEI zp
PER rel16
REP/SEP #imm
ORA/AND/EOR/ADC/LDA/STA/CMP/SBC sp,S
ORA/AND/EOR/ADC/LDA/STA/CMP/SBC (sp,S),Y
PEA and PEI can be one mnemonic with two address modes. As with "BBRx", I imagine the only reason to keep the names is to provide backward compatibility with existing source.
The main use of REP/SEP has always been to switch register sizes between eight and 16 bits on the '816. I don't recall if you've implemented that on your core. If dual-size registers have not been implemented, what use are these instructions? And would any of those uses be so frequent as to be preferable to PHP, adjust (using the following "S" relative instructions, perhaps), and PLP?
Quote:
JMP (zp)
JSR (zp)
JSR (abs,X)
Quote:
MWT zp,(Y)
MWF zp,(Y)
Quote:
JMP rel16
JSR rel16
Or that might actually be the defined behavior an assembler should by default follow, in which case it would be what the programmer should always expect. If there's a time penalty for using a relative form rather than an absolute form that might not be the best, however.
You might also consider using "BRA" and "BSR" for these instead, matching the "B--" of other relative branch instructions, with the proviso that these are 16-bit ranges, not eight-bit.
Re: M65C02A Core
MichaelM wrote:
On the other hand, I would like to hear from you and others on some suggested instructions that can be used to support a DTC/ITC FORTH VM like those that you implemented for your KIM clone.
NEXT is basically an indirect jump via IP -- nothing terribly fancy. A Forth program is just a list of pointers, and IP indicates your current position in the list. The indirect jump in NEXT vectors execution to the 65xx code snippet that simulates the desired high-level instruction. IP needs to be incremented after the fetch, just like any program counter, so the complete definition of NEXT for 65xx is JMP (IP++). (What I've described is DTC, or direct-threaded code, used by many modern Forth implementations. Older implementations such as FIG Forth use indirect-threaded code, aka ITC, for which NEXT is defined as JMP ((IP++)).)
Unoptimized 65xx Forth maps IP as a pair of bytes in zero page, and uses legacy 65xx instructions for all accesses to IP. ITC NEXT consumes almost 40 cycles. You could entirely bypass z-pg by adding IP to the M65C02A register set -- and then add NEXT to the instruction set and reap a huge speedup. Unfortunately, you'd also need new instructions for a dozen or so incidental operations involving IP, and that gets complicated.
My KimKlone uses an ambidextrous approach!
cheers,
Jeff
ps- I agree with teamtempest's observation that new instructions present a challenge in regard to mnemonics. For the KK I threw up my hands and resigned myself to a mish-mash of made-up mnemonics that are ugly & lengthy -- but descriptive.
Edit: subsequent to this post I gave the KK registers new (and hopefully less confusing) names. That's right -- register renaming! -- but not in the usual sense.
Last edited by Dr Jefyll on Sun May 24, 2015 8:37 pm, edited 1 time in total.
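The two definitions of NEXT given in the post above, JMP (IP++) for DTC and JMP ((IP++)) for ITC, can be sketched as a small memory model. This is an illustration only: the function names and the memory layout are invented, and the bookkeeping of the W register that a real ITC Forth performs is left out.

```python
def read16(mem, addr):
    """Read a little-endian 16-bit word from mem (a bytearray)."""
    return mem[addr] | (mem[addr + 1] << 8)


def next_dtc(mem, ip):
    """JMP (IP++): fetch the code address IP points at, then step IP by 2."""
    target = read16(mem, ip)
    return target, (ip + 2) & 0xFFFF


def next_itc(mem, ip):
    """JMP ((IP++)): as above, with one more level of indirection."""
    word, ip = next_dtc(mem, ip)
    return read16(mem, word), ip


mem = bytearray(0x10000)
mem[0x0200:0x0202] = (0x0300).to_bytes(2, "little")  # entry in the thread
mem[0x0300:0x0302] = (0x4000).to_bytes(2, "little")  # code field of that word

print([hex(v) for v in next_dtc(mem, 0x0200)])
print([hex(v) for v in next_itc(mem, 0x0200)])
```

Each function returns the jump target together with the updated IP, which makes the post-increment explicit.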
- BigDumbDinosaur
- Posts: 9428
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
Re: M65C02A Core
Dr Jefyll wrote:
For the KK I threw up my hands and resigned myself to a mish-mash of made-up mnemonics that are ugly & lengthy -- but descriptive.
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: M65C02A Core
Yep! I warned you -- ugly & lengthy! But, despite appearances, not cryptic. Every KimKlone programmer in the world (ahem!) is sure to know about NCP, the New Code Pointer register. LDNCPZPG,X is just LoaD NCP using ZPG,X address mode. (It could've been simply LDNCP, but my quick 'n' dirty assembler needs to have the address mode spelled out explicitly.)
In the case of Michael's project, hopefully there'll be a better assembler that can infer address modes in the usual way. Even so, with all the new instructions, I suspect three-letter mnemonics won't be adequate to describe the action in a way humans will readily identify. I suspect mnemonics of four letters or more will be required.
-- Jeff
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: M65C02A Core
I agree whole-heartedly with teamtempest's bra/bsr recommendation. Or, would it be brl and bsl?
I agree with Dr. J that a thoughtful treatment of the NEXT mechanism is where the best gains can be made, even if it turns out to be little more than an efficient 16-bit memory increment by one or two. If you have a 16-bit pei (for DOCOLON and >R), then that's even better.
I don't agree with his 4+ char mnemonic idea, though. It just wouldn't feel 65xx enough for me.
Mike
P.S. How hard would auto-increment ( a la 6809 ) be to incorporate?
Re: M65C02A Core
Thanks for your responses and suggestions.
I too realized that it would be difficult to distinguish between a JMP/JSR abs and a JMP/JSR rel16. So I wholeheartedly agree that BRA/BSR are much better mnemonics for these two opcodes. I think that it should be easy to resolve whether the branch target is within an 8-bit or a 16-bit range and select the appropriate opcode. Therefore, I don't think it is necessary to use a mnemonic like BRL instead of BRA. (BTW, it may be possible to modify the conditional branch instructions with the prefix instruction to implement a 16-bit relative branch.)
barrym95838 wrote:
P.S. How hard would auto-increment ( a la 6809 ) be to incorporate?
I don't think that it will be too difficult to implement a NEXT a la PDP-11/MC6809 with an auto-increment of the virtual IP.
I was thinking that I would implement NEXT as a single byte instruction. However, as Jeff pointed out above, NEXT is a jump indirect via IP (Interpretive/Instruction Pointer) with auto-increment. I have been thinking about Jeff's suggestion, and will implement the instruction he suggested with the IP in zero page: JMP (zp++). (Thanks very much Jeff.) That instruction also suggests using zero page for implementing the other FORTH VM registers: W (Working Register), and PSP (Parameter Stack Pointer). The M65C02A page 1 stack can be used for the RS (Return Stack), and the PEI/PEA/PER instructions and stack relative addressing modes can be used for manipulating the return stack.
Can someone comment on whether the RS and PS (Parameter Stack) is best implemented in the 6502 processor stack or not?
teamtempest wrote:
I'm having trouble visualizing exactly what is meant by "(Y)". Even if it does mean something special, aren't the mnemonics themselves enough of a clue? Particularly since, AFAICT, no other instruction would use a "(Y)" mode.
The (Y) notation was intended to indicate that these two instructions are two-address instructions, in contrast to all other instructions. The first address is provided by the zp operand and the second address is the contents of register Y. The contents of register Y will index the IO page, which in the M65C02A is the 256 byte page 0xFF00:FFFF. Perhaps a notation closer to that used for the stack relative instructions might be clearer, but the generally accepted single address/single operand syntax of the 6502 makes it difficult to convey the two-address nature of these two instructions.
You, BDD, and others have suggested changing the mnemonics for PEI and PEA on another thread. I don't disagree with the points that you have made. I only want the results. I think it has been suggested that these instructions be defined as:
PHW #imm16
PHW dp
I am not sure that PER would serve much purpose if BRA/BSR rel16 were available unless it was also possible to perform these two operations based on the top two locations of the stack. Therefore, what would you say if REP/SEP #imm were not implemented as you suggested and instead BRA/BSR (sp,S) were implemented?
I like the idea of implementing JMP/JSR (zp), but I see your point regarding the extra cycle: it's really not that critical in the overall scheme. Thus, I will not implement those two instructions, and reserve the opcodes for other instructions.
I can see implementing some instructions which are the complements of the PHW instructions:
PLW zp
PLW abs
Michael A.
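The point in the post above about an assembler choosing between the 8-bit and 16-bit forms of BRA/BSR could look roughly like this. It is a sketch under assumptions: the instruction lengths (2 and 3 bytes), the offset base (the following instruction), and the name `select_branch` are not taken from the M65C02A documentation.

```python
def select_branch(pc: int, target: int, short_len: int = 2, long_len: int = 3):
    """Pick the 8-bit form when the target is in range, else the 16-bit form.

    Returns (form, offset). Offsets are measured from the address of the
    following instruction; the instruction lengths are assumptions.
    """
    offset = target - (pc + short_len)
    if -128 <= offset <= 127:
        return "rel8", offset
    # Fold the 16-bit displacement into the signed range -32768..32767.
    offset = ((target - (pc + long_len) + 0x8000) & 0xFFFF) - 0x8000
    return "rel16", offset


print(select_branch(0x1000, 0x1081))   # last target still reachable with rel8
print(select_branch(0x1000, 0x1082))   # one byte further needs rel16
print(select_branch(0x1000, 0x0F00))   # backward branch out of rel8 range
```

In a real assembler the choice usually has to be repeated over several passes, because lengthening one branch moves every label after it.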
- GARTHWILSON
- Forum Moderator
- Posts: 8775
- Joined: 30 Aug 2002
- Location: Southern California
Re: M65C02A Core
MichaelM wrote:
Can someone comment on whether the RS and PS (Parameter Stack) is best implemented in the 6502 processor stack or not?
As I originally learned it nearly 25 years ago, "W" in Forth is the word pointer, and "IP" was the instruction pointer (since compiled code does not get interpreted).
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
- BigDumbDinosaur
- Posts: 9428
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
Re: M65C02A Core
MichaelM wrote:
You, BDD, and others have suggested changing the mnemonics for PEI and PEA on another thread. I don't disagree with the points that you have made. I only want the results. I think it has been suggested that these instructions be defined as:
PHW #imm16
PHW dp
Quote:
I am not sure that PER would serve much purpose if BRA/BSR rel16 were available unless it was also possible to perform these two operations based on the top two locations of the stack.
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: M65C02A Core
MichaelM wrote:
... I was thinking that I would implement NEXT as a single byte instruction. However, as Jeff pointed out above, NEXT is a jump indirect via IP (Interpretive/Instruction Pointer) with auto-increment. I have been thinking about Jeff's suggestion, and will implement the instruction he suggested with the IP in zero page: JMP (zp++). (Thanks very much Jeff.) That instruction also suggests using zero page for implementing the other FORTH VM registers: W (Working Register), and PSP (Parameter Stack Pointer). The M65C02A page 1 stack can be used for the RS (Return Stack), and the PEI/PEA/PER instructions and stack relative addressing modes can be used for manipulating the return stack.
Quote:
Can someone comment on whether the RS and PS (Parameter Stack) is best implemented in the 6502 processor stack or not?
I'm excited that you have been able to make such significant progress on your project, and will be (a bit enviously) following it with interest.
Mike
Re: M65C02A Core
MichaelM wrote:
I have been thinking about Jeff's suggestion, and will implement the instruction he suggested with the IP in zero page: JMP (zp++).
Of course it's fair to ask whether it'd be better to have something more general. (Mike asks almost the same question in his last post: is it an address mode, or is it special behavior of a designated area of memory?) Without wishing to sway you one way or the other, here's an observation to consider. If the Interpretive Pointer is physically on-chip, there's no need to consume 2 extra bus cycles fetching those 16 bits of IP -- an obvious point in favor of IP being on-chip. But if the JMP (zp++) instruction uses an operand to specify where IP resides, then saving those 2 extra bus cycles means you must map all of zero-page on-chip. At stake is a delay of (maybe) up to 3 cycles altogether.
MichaelM wrote:
Can someone comment on whether the RS and PS (Parameter Stack) is best implemented in the 6502 processor stack or not?
X is actually sub-optimal as a P-stack pointer in that the sequence DEX then STA 0,X takes twice as long as PHA, for example. We do a lot of pushing (and pulling) in Forth, so the matter isn't trivial. But X's poor push/pull performance is outweighed by the immense utility of z-pg,X and (z-pg,X) address modes. So, for 6502 and 65c02, the justification for X as a P-stack pointer is clear.
The insight is this. Now that sp,S and (sp,S),Y address modes are available (on the '816 and Michael's M65C02A), X no longer has vastly greater utility than S -- in fact, (sp,S),Y surpasses X in a manner that Forth can use to good advantage. The longstanding 6502 tradeoff (tolerating slow P-stack push/pulls via X) is now clearly open to review. I don't advocate that anyone should rewrite an existing Forth. But IMO any new '816 Forth should break with tradition and use S for the P-stack pointer and X for the R-stack pointer!
Is there a gotcha I've overlooked? Am I reinventing someone else's idea? (And am I de-railing Michael's thread?)
cheers,
Jeff
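The push/pull comparison in the post above can be checked against the standard 6502/65C02 cycle counts for the instructions involved. The table below lists only those instructions, for single-byte pushes and pulls; the variable names are made up for the illustration.

```python
# Standard 6502 / 65C02 cycle counts for the instructions involved.
CYCLES = {"DEX": 2, "INX": 2, "STA zp,X": 4, "LDA zp,X": 4, "PHA": 3, "PLA": 4}

push_via_x = CYCLES["DEX"] + CYCLES["STA zp,X"]   # DEX then STA 0,X
push_via_s = CYCLES["PHA"]
pull_via_x = CYCLES["LDA zp,X"] + CYCLES["INX"]   # LDA 0,X then INX
pull_via_s = CYCLES["PLA"]

print(push_via_x, push_via_s)   # 6 3
print(pull_via_x, pull_via_s)   # 6 4
```

A byte pushed through X costs twice as many cycles as PHA, which matches the figure quoted in the post; the gap for pulls is smaller.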