I have not actively joined in the conversation on this topic, but I have been following the discussion.
I have implemented an enhanced version of the 6502/65C02 which I think is in keeping with the general philosophy of the 6502. However, I have taken liberties to add certain features that I think are important to the support of many High Level Languages (HLLs) such as C and Pascal. In addition, following some friendly prodding by Dr. Jefyll, I've included built-in support for a FORTH VM.
During my time actively working on the core, I've added dedicated instructions for stack-relative addressing, base-pointer addressing, and FORTH IP relative addressing with auto increment. I also included support for kernel and user mode stack pointers, an auxiliary (third) stack using X, and various prefix instructions for increasing the size of ALU operands and operations, adding indirection, and a combination of both indirection and size. In addition, I added prefix codes to override the accumulator with either X or Y.
In the final implementation of the M65C02A, I've eliminated specific base-relative instructions, and decided to use a 1-offset base relative addressing mode using the built-in pre-indexed (X) instructions. In other words, I gave up the need to use a 0-offset base relative addressing mode using dedicated opcodes; I came to appreciate the need to preserve compatibility with the modern WDC W65C02S instruction set.
In the process of porting the Mak Pascal compiler, I realized that I could implement a stack-relative addressing mode using the prefix instruction I had implemented to override the default stack pointer. When that prefix is applied to the pre-indexed (X) addressing mode instructions, the result is a full complement of stack-relative instructions using both 8-bit and 16-bit offsets, more than I was able to support with the free opcodes of the W65C02S microprocessor. The only real limitation for me was that, with S as the index/base register, an offset of 1 is needed instead of 0 to access the top-of-stack element. This is really a personal aesthetic preference; the code generator of a compiler could not care less whether the offset for the top of stack or the stack frame is 0 or 1.
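To illustrate why the offset is 1 rather than 0, here is a minimal Python sketch of the classic 6502 stack convention, in which S points at the next free byte, so the most recently pushed byte sits at S + 1. The class and method names are made up for illustration, and the 8-bit S confined to page 1 is the stock 6502 behavior, not necessarily how the M65C02A stack is organized.

```python
STACK_PAGE = 0x0100  # classic 6502 stack page (assumption for this sketch)

class StackModel:
    """Toy model of a 6502-style descending stack (hypothetical helper)."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.s = 0xFF  # S points at the next free byte

    def push(self, value):
        # Store at the free slot, then decrement S (post-decrement push).
        self.mem[STACK_PAGE + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def stack_relative(self, offset):
        # Effective address of "offset,S"; wraps within the page here.
        return STACK_PAGE + ((self.s + offset) & 0xFF)

stack = StackModel()
stack.push(0x11)
stack.push(0x22)
top = stack.mem[stack.stack_relative(1)]     # most recent push
second = stack.mem[stack.stack_relative(2)]  # the push before it
```

Offset 0 in this model lands on the free slot below the top of stack, which is why a compiler targeting this scheme simply biases its frame offsets by one.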
I have finished the development of the M65C02A core, although I've not fully tested all of the changes that I made. I frequently review and update the documentation that I've placed in the README file in the project's GitHub directory. I also update the core's user manual when I have some free time. I've not updated the Pascal compiler, also found on GitHub, in a few months because I'm working on completing another one of my processor projects. In particular, I've not updated the compiler to support the 1-offset format for base-relative and stack-relative addressing modes, or to add the default override prefix instruction to the pre-indexed addressing mode instruction in order to derive the stack-relative addressing mode.
In the development of the M65C02A, I had backward compatibility with the 6502/65C02 as an objective. Other than my desire to remove all dead cycles from the instruction set (see the 65CE02), the basic implementation is fairly true to the 6502/65C02. (Some minor behavioral differences in the BRK, RTS, and RTI instructions should have no consequences for new developments, but may affect existing code. I had to make some minor changes to Daryll's monitor program.) I found that moving back and forth between extended/enhanced and normal operation was best handled by prefix instructions rather than a mode register. Unlike the approach taken by the 65816, which was not a consideration in the M65C02A project, the prefix bytes allow easy enhancement to 16-bit operation and easy return to 8-bit operation.
In the case of the Pascal compiler, if the enhanced 16-bit operations were to be the norm, the default size could easily be set to be 16 bits, and the size prefix could be used to change to 8 bits. This would improve the performance of the core for those instances where the Pascal compiler was used more often than not. Another alternative, which I think I've included in the released core, is to tie this capability to the mapping logic provided by an MMU. For certain address ranges the default size is 16 bits, and in other address ranges the default size is 8 bits.
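A rough Python sketch of that idea follows. The range table, the function names, and the choice to key the default on the instruction's address are all assumptions made for illustration; none of this is taken from the core's actual MMU logic.

```python
# Hypothetical table: (start, end inclusive, default operand size in bits).
DEFAULT_SIZE_RANGES = [
    (0x0000, 0x7FFF, 8),   # e.g. legacy 6502 code
    (0x8000, 0xFFFF, 16),  # e.g. compiled Pascal code
]

def default_size(address):
    """Look up the default operand size for an address (assumed scheme)."""
    for start, end, bits in DEFAULT_SIZE_RANGES:
        if start <= address <= end:
            return bits
    return 8  # fall back to plain 6502 behavior

def operand_size(address, size_prefix):
    """The size prefix selects the other size for one instruction."""
    bits = default_size(address)
    if size_prefix:
        return 8 if bits == 16 else 16
    return bits
```

With a table like this, code in the 16-bit region only pays for a prefix byte on its less common 8-bit operations, and legacy code in the 8-bit region runs unchanged.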
I certainly enjoy the free exchange of ideas for improving the 6502 that characterizes many of the threads on enhancing the 6502 architecture. I've implemented some of the ideas offered up by many of the longtime members of the forum. As you can glean from my discussion above, I too have had a number of false starts. The trade-offs between the various enhancements and the basic architecture/flavor of the 6502 have been instructive personally, and have given me a greater appreciation for the 6502, 6800/6801/68HC11, and 8080/8085/Z80 microprocessors.
For me, the 6502 has offered the best path toward an enhanced 16-bit option. The others have essentially filled opcode spaces, which makes the development of enhancements more difficult. The wide range of addressing modes supported by the 6502 also made the inclusion of base-relative and stack-relative addressing fairly easy, and useful, as I found out when mapping the M65C02A onto the 8086 register set to port the Mak Pascal compiler.
My experience in developing the M65C02A core has been enhanced by trying out the ideas. I wanted the core to support stack frames and easily support an HLL like Pascal or C. I first started out by simply adding instructions that I thought might be useful for a compiler. But when I actually ported the Pascal compiler and mapped the instructions used, I found out for myself that what many sources claim regarding the utility of instructions with complex addressing modes or functions in compiled code is true.
I ran a histogram of the instructions that the compiler made use of for several different applications, and many of the instructions that I painstakingly added to the M65C02A were unused. I did find that base-relative and stack-relative addressing were extensively used. I also found that I could effectively use only a fraction of the basic instructions and addressing modes of the 6502/65C02; the operations supported by the compiler and the virtual machine targeted by the compiler just do not make use of zero page or the many addressing modes into zero page. Thus, I backed out many of the "enhanced" instructions I had added and decided to simply use the default stack pointer prefix instruction to enable stack-relative addressing using the existing pre-indexed addressing mode.
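For anyone wanting to run the same kind of check on their own compiler output, a histogram like this takes only a few lines of Python. This is a sketch, not the tool I used; the listing format (optional `label:`, then mnemonic, then operand, with `;` comments) and the sample text are assumptions.

```python
from collections import Counter

def mnemonic_histogram(listing):
    """Count mnemonics in an assembly listing (assumed simple format)."""
    counts = Counter()
    for line in listing.splitlines():
        code = line.split(";", 1)[0].strip()  # drop trailing comments
        if not code:
            continue
        tokens = code.split()
        if tokens[0].endswith(":"):           # skip a leading label
            tokens = tokens[1:]
        if tokens:
            counts[tokens[0].upper()] += 1
    return counts

# Made-up sample, only to show the input shape.
sample = """
start:  lda 1,x     ; load local
        adc 3,x
        sta 1,x
loop:   lda 5,x
        bne loop
"""
hist = mnemonic_histogram(sample)
```

Sorting the result with `hist.most_common()` quickly shows which opcodes earn their place in the instruction set and which never appear.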
The microprogram space freed up by this decision allowed me to make indirection work in a more natural way. Prior to this final change, the prefix instructions adding indirection applied indirection before indexing. This meant that adding indirection to any pre-indexed addressing mode converted the operation to a post-indexed single or double indirect addressing mode. With the microprogram space recovered by eliminating dedicated base-relative and stack-relative instructions, I was able to apply single or double indirection after indexing by X or before indexing by Y. I think that this is more in keeping with the base architecture. The additional microprogram space also allowed me to implement more support for the relative addressing mode: I increased the range of the branches from 8 bits to 16 bits and I added a PC-relative subroutine call instruction.
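The difference between the two orderings is easiest to see in a small Python sketch of the effective address calculation for single indirection. The helper names are hypothetical, and the sketch wraps addresses at 16 bits rather than modeling zero page wrap-around, which is a simplification and not a statement about the core.

```python
def read16(mem, addr):
    """Read a little-endian 16-bit pointer from memory."""
    lo = mem[addr & 0xFFFF]
    hi = mem[(addr + 1) & 0xFFFF]
    return lo | (hi << 8)

def indirect_after_index(mem, base, x):
    """(base,X): add X first, then follow the pointer found there."""
    return read16(mem, (base + x) & 0xFFFF)

def index_after_indirect(mem, base, y):
    """(base),Y: follow the pointer at base first, then add Y."""
    return (read16(mem, base) + y) & 0xFFFF

mem = bytearray(0x10000)
mem[0x0010:0x0012] = bytes([0x00, 0x20])  # pointer at $0010 -> $2000
mem[0x0014:0x0016] = bytes([0x00, 0x30])  # pointer at $0014 -> $3000

ea_x = indirect_after_index(mem, 0x0010, 4)  # uses the pointer at $0014
ea_y = index_after_indirect(mem, 0x0010, 4)  # uses the pointer at $0010
```

With X as the base or stack pointer, the first form selects a pointer stored in the frame, which is what a compiler wants for by-reference parameters; the second form indexes into whatever a fixed pointer addresses.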