BigDumbDinosaur wrote:
Hugh Aguilar wrote:
Quote:
I'm not opposed to the 65c816 if it could be upgraded with an OSX prebyte like on the M65c02A --- without that, Forth is too bloaty and slow to bother with.
You keep making this assertion, despite no evidence to support it. Can you provide a link to a Forth that is bloaty and slow running on a 65C816? I'm aware of Garth's 65802 Forth, which he has described as running considerably faster than the 6502 equivalent.
I've seen some example code from Garth's 65c816 Forth, and it would likely be an order of magnitude slower than C code --- that is too slow to bother with --- he is using ITC (indirect-threaded-code) that has the advantage of being very compact (in the 1970s when Charles Moore invented ITC, a micro-controller might have only a 2KB EPROM), but it is not appropriate for the 65c816 that likely has 64KB of code memory and 64KB or more of RAM. STC is a lot more efficient, especially with peephole-optimization, but the 65c816 needs better support for using the X register as a data-stack pointer.
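To make the ITC versus STC distinction concrete, here is a toy model in Python. It is only a sketch: the word names, the addresses, and the fetch counter are made up for illustration and are not taken from Garth's Forth or any real kernel. It shows that ITC pays two memory fetches per word before the jump, while STC is just a run of native calls that an optimizer can inline or fuse.

```python
# Toy model of indirect-threaded code (ITC) versus subroutine-threaded
# code (STC). Every name and address here is hypothetical.

stack = []

def lit5():
    stack.append(5)

def dup():
    stack.append(stack[-1])

def mul():
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

# ITC: each word has a code field that points at the code to run.
# A compiled definition is a list of word addresses.
headers = {100: lit5, 102: dup, 104: mul}   # word address -> code field
thread = [100, 102, 104]                    # "5 DUP *" compiled as ITC

def run_itc(thread):
    """Run an ITC thread and return the number of dispatch fetches."""
    fetches = 0
    for word_addr in thread:        # fetch 1: next word address from the thread
        fetches += 1
        code = headers[word_addr]   # fetch 2: that word's code field
        fetches += 1
        code()                      # indirect jump to the word's code
    return fetches

# STC: the compiled definition is a sequence of direct calls, so there is
# no inner interpreter and no dispatch fetches at all.
def square_of_5_stc():
    lit5()
    dup()
    mul()
```

Running `run_itc(thread)` leaves 25 on the stack and reports six dispatch fetches for three words; `square_of_5_stc()` leaves the same result with no dispatch loop.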
Speed is not necessarily required for micro-controller applications, so Garth's ITC Forth could still be useful. By many accounts, the most important feature of Forth is interactive development. You can execute code from the command-line and get an immediate response. You can write short test functions on the command-line and execute them immediately. This is very useful! In most cases, a board doesn't do what the electrical engineer says that it does. This is because the hardware design is bad, because the electrical technician built a bad board, or because of environmental issues such as EM pollution. From the command-line, it is possible to test the board and find out what it does. This is done before the program is written. Also, I have seen many cases in which the electrical technician has a test suite of Forth functions that he or she uses to test the boards after building them --- he or she is not actually a Forth programmer, but just knows enough about Forth to use the test functions on the command-line (there are actually more women than men doing this, probably because they have a lighter touch on the soldering iron than your typical ham-handed man).
Anyway, it is a straw-man argument to say that a 65c816 Forth is faster than a 65c02 Forth, because I never said otherwise. Obviously, if you have 16-bit registers then that is going to be faster than 8-bit registers.
My point is that the 65c816 is not fast enough. It is a bad design. William Mensch was opposed to using prebytes, such as are used on the MC6809, but the result was that his ISA is too redundant and has too many missing features.
If you have an OSX prebyte to switch S and X in the next instruction, then you can get rid of the offset,S addressing-mode because it is redundant to the offset,X addressing-mode. You can also get rid of the (offset,S),Y addressing-mode and replace it with a (offset,X),Y addressing-mode that could be used for S instead of X with an OSX prebyte (this would not only support Forth, but would support arrays of pointers to structs in zero-page).
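The prebyte idea can be sketched as a tiny decoder model in Python. This is an illustration under stated assumptions, not the M65C02A implementation: the `OSX` byte value is a placeholder, memory is treated as a flat 64KB space, and the direct-page register is ignored. The point is that one offset,X opcode plus a one-shot prefix covers what would otherwise need a separate offset,S opcode.

```python
# Hypothetical model of a one-shot "OSX" prefix byte.
OSX = 0x8B         # placeholder value, not taken from any datasheet
LDA_OFS_X = 0xB5   # LDA offset,X (the 65C02 zero-page,X opcode)

def run(program, regs, mem):
    """Execute a list of bytes; only OSX and LDA offset,X are modeled."""
    pc = 0
    swap = False                     # True while an OSX prefix is pending
    while pc < len(program):
        op = program[pc]
        pc += 1
        if op == OSX:
            swap = True              # affects the next instruction only
            continue
        if op != LDA_OFS_X:
            raise ValueError(f"unmodeled opcode {op:#04x}")
        offset = program[pc]
        pc += 1
        index = regs["S"] if swap else regs["X"]   # prefix swaps the index register
        regs["A"] = mem[(index + offset) & 0xFFFF]
        swap = False                 # prefix is consumed by this instruction
    return regs
```

With X pointing at a data stack and S at the return stack, `[LDA_OFS_X, 2]` reads relative to X, while `[OSX, LDA_OFS_X, 2]` reads relative to S using the same opcode.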
Also, if you have an OAY prebyte to switch A and Y in the next instruction, then all of the instructions (ADC, EOR, etc.) that use A and a memory value can use Y and a memory value also.
If he didn't have so much redundancy in the ISA, he would have had more room in the 256-opcode space to add more instructions. For example, he could have added a Y addressing-mode so the instructions that use A and a memory value could use A and Y instead. This would be a kind of inherent addressing-mode as there would be no operand, and it would be quite fast --- normally the result would end up in A, but given the OAY prebyte the result could end up in Y.
I liked the 65c02. I wrote a Forth cross-compiler that ran under MS-DOS UR/Forth and generated code for the Apple-IIc (I actually had a Laser-128). This was compatible with ISYS Forth that ran on the Apple-IIc. This generated faster code than any C or Pascal compiler of the day --- I doubt that it would generate faster code than Walter Banks's 65c02 C compiler --- but for its time period, it was quite efficient.
I used my cross-compiler to write a program to do symbolic math. My program could do derivatives of equations and then simplify the results. My program could also plot functions on the graphics screen and display equations using math symbols and Greek letters. My goal was to do integrals, but I found that to be beyond the capabilities of the 65c02 --- I had really pushed the capabilities of the 65c02 quite a lot already.
The 6502 does have some bad features (for example, INC and DEC should set the CF the same as ADC or SBC do). For the most part though, the 6502 was a good design --- the (zp),Y addressing-mode was pretty awesome.
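The INC complaint can be shown with a small flag model in Python. This is a simplified sketch: the function names are mine, ADC is binary-mode only, and the V flag is omitted. It shows that INC updates only N and Z, so a 16-bit increment has to branch on Z instead of letting a carry ripple into the high byte.

```python
def inc(value, flags):
    """Model of 6502 INC on one byte: updates N and Z, leaves C untouched."""
    result = (value + 1) & 0xFF
    flags["Z"] = result == 0
    flags["N"] = bool(result & 0x80)
    return result

def adc(a, operand, flags):
    """Model of binary-mode ADC: adds operand plus carry-in, updates C, Z, N."""
    total = a + operand + (1 if flags["C"] else 0)
    flags["C"] = total > 0xFF
    result = total & 0xFF
    flags["Z"] = result == 0
    flags["N"] = bool(result & 0x80)
    return result

def inc16(lo, hi, flags):
    """16-bit increment the 6502 way: INC lo / BNE skip / INC hi."""
    lo = inc(lo, flags)
    if flags["Z"]:            # low byte wrapped to zero, so bump the high byte
        hi = inc(hi, flags)
    return lo, hi
```

Incrementing $FF with `inc` wraps to zero and sets Z but leaves C as it was, whereas `adc` with an operand of 1 sets C on the same wrap.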
Nowadays I think the 65c02 is obsolete. The 65c02 is designed for big programs (such as my symbolic-math program). It has the (zp),Y addressing-mode that provides good access to big data-structures. Nobody wants to use an 8-bit processor for big programs now though. People use 16-bit or 32-bit processors for big programs now.
In its day, the 6502 was better than the Z80 for big programs because (zp),Y provided better access to data-structures. The 6502 was better than the Z80 for small micro-controller programs too, because the 6502 had less interrupt-latency due to having only a few 8-bit registers to save/restore.
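For anyone unfamiliar with why (zp),Y suits data-structures, here is a minimal Python model of the effective-address calculation. The memory layout and addresses are invented for the example. A two-byte pointer in zero-page selects the structure, and Y selects the field within it.

```python
def read_indirect_y(mem, zp, y):
    """Model of the load performed by LDA (zp),Y on a 6502."""
    lo = mem[zp & 0xFF]
    hi = mem[(zp + 1) & 0xFF]         # the pointer occupies two zero-page bytes
    base = lo | (hi << 8)
    return mem[(base + y) & 0xFFFF]   # Y is the byte offset into the structure

def make_demo_memory():
    """Build a 64KB memory image with one pointer and one 4-byte struct."""
    mem = bytearray(0x10000)
    mem[0x20], mem[0x21] = 0x00, 0x40                      # pointer at $20 -> $4000
    mem[0x4000:0x4004] = bytes([0x0A, 0x0B, 0x0C, 0x0D])   # four one-byte fields
    return mem
```

Changing the two bytes at $20 retargets every field access to a different structure without touching the code that uses Y, which is what makes walking lists or arrays of records cheap.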
Both the 6502 and the Z80 were obsoleted by the MC6809 though, and the MC6809 was obsoleted by the MC68000-family, and the MC68000-family was obsoleted by the ARM --- the ARM Cortex dominates now, and will likely continue doing so all the way through the 21st century (it is more than powerful enough for any micro-controller application) --- nobody is going to be interested in an 8-bit game-machine now.
People still use 8-bit processors for small micro-controller programs. The 8051-family has been popular for three decades, and continues to be popular. The STM8 is gaining some steam. I designed my 65ISR for small micro-controller programs. It should be significantly easier to program than the 8051-family, which lacks an indexed addressing-mode. It should be more efficient than the STM8 because the STM8 gets bogged down in entrance and exit code for ISRs --- the 65ISR (as the name indicates) is intended to have very little interrupt latency (it has less than the 65c02, and low interrupt-latency was always the hallmark of the 65c02).
BigDumbDinosaur wrote:
As a well-written Forth kernel tends to be quite efficient (efficiency being one of the hallmarks of Forth), I'd expect that a kernel optimized for the 65C816 running in native mode would see a substantial performance gain.
It is obvious that William Mensch wanted to support C programming, so he added the offset,S and (offset,S),Y addressing-modes, but he was failing to support Forth.
If he doesn't want to support Forth, then I don't want to support his 65c816.
The 65c816 ISA was designed to support C, so supporting Forth is going to require contorted programming. I would do that if I was getting paid --- I'll do most anything for money, so long as it is not immoral or illegal, and I'm totally okay with implementing bad ideas --- do you want to hire me to write a Forth for the 65c816?
You've already said on this forum that you will never be convinced that Forth has any value, so I'm expecting that you are not going to hire me --- you don't actually believe what you said: "a well-written Forth kernel tends to be quite efficient (efficiency being one of the hallmarks of Forth)"