6502 redundant, missed, and suggested features

Programming the 6502 microprocessor and its relatives in assembly and other languages.
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Post by Arlet »

Probably they didn't feel like adding too much detail to the text, when the description below shows exactly what's happening.

Or, the designers intended to use the PC, and get a full 16 bit increment for free, but ran into some problems and switched to ALU increment late in the design process.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England
Post by BigEd »

My suspicion is that by the time that manual was written, some knowledge was lost.

The usual use case for JMP(abs) is surely a single JMP to indirect a vector. It's unlikely that the vector will be placed at the end of a page, so almost everyone will be unaware of the bug.

Edit: I see that indirect threaded code could possibly hit this, if it didn't take care.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England
Post by BigEd »

Just for historical interest, I had a look to see what's the earliest mention of this bug, and the earliest I found is a note by Heinz J Schilling in 6502 User Notes newsletter, issue 15, June 1979.
http://archive.6502.org/publications/65 ... df#page=24
http://www.classiccmp.org/cini/pdf/KIM% ... df#page=24

It would be interesting to hear of an earlier mention.

Edit: I see that Bob Sander-Cederlof's newsletter "Apple Assembly Lines" trumpets this bug note in the first issue, October 1980. I'm thinking therefore that it would not already have been common knowledge.
Quote:
There is an error in the JUMP INDIRECT instruction of ALL 6500 family CPU chips, no matter where they were made. This means the error is present in ALL APPLES. This fatal error occurs only when the low byte of the indirect pointer location happens to be $FF, as in JMP ($08FF). Normally, the processor should fetch the low-order address byte from location $08FF, increment the program counter to $0900, and then fetch the high-order address byte from $0900. Instead, the high-order byte of the program counter never gets incremented! The high-order address byte gets loaded from $0800 instead of $0900! For this reason, your program should NEVER include an instruction of the type JMP ($xxFF).
(Again, a confusion between the address bus and program counter!)
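The behaviour described in the quoted note can be sketched with a small Python model of the two-byte vector fetch. This is only an illustration under stated assumptions: the function name `jmp_indirect_target`, the `nmos` flag, and the memory contents are invented for the example and do not come from any emulator or datasheet.

```python
# Minimal model of how JMP (ptr) fetches its 16-bit target.
# All names here are hypothetical and chosen for illustration.

def jmp_indirect_target(memory, ptr, nmos=True):
    """Return the address that JMP (ptr) would jump to."""
    lo_addr = ptr & 0xFFFF
    if nmos:
        # NMOS: only the low byte of the pointer is incremented,
        # so $xxFF wraps around to $xx00 within the same page.
        hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0x00FF)
    else:
        # 65C02: the full 16-bit pointer is incremented.
        hi_addr = (ptr + 1) & 0xFFFF
    return memory[lo_addr] | (memory[hi_addr] << 8)


memory = bytearray(0x10000)
memory[0x08FF] = 0x34  # low byte of the intended target $1234
memory[0x0900] = 0x12  # high byte the programmer expected to be used
memory[0x0800] = 0x56  # byte the NMOS part reads as the high byte instead

nmos_target = jmp_indirect_target(memory, 0x08FF, nmos=True)
cmos_target = jmp_indirect_target(memory, 0x08FF, nmos=False)
print(f"NMOS: ${nmos_target:04X}  CMOS: ${cmos_target:04X}")
```

For any pointer whose low byte is not $FF, the two branches compute the same address, which is why a vector that does not end a page never shows the difference.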

Edit: downthread, BDD notes that Leventhal's 1979 book describes the misbehaviour, and it looks like the book was finalised no earlier than April 1979.
Last edited by BigEd on Fri Aug 26, 2016 6:40 am, edited 2 times in total.
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California
Post by GARTHWILSON »

litwr wrote:
BigDumbDinosaur wrote:
If using the 65C02 or 65C816, there is the JMP (<addr>,X) instruction, which requires no preparation of any jump vectors. You use it with a table containing 16 bit addresses in little endian order. The following code does the work:
This code requires a 16-bit XR, so it is impossible on the 65C02. It may be used with the 65816, but it requires the rarely used 16-bit index register mode.
It can be used with the index registers in 8-bit mode as well.  The Eyes & Liechty programming manual specifically shows it in both 8- and 16-bit mode, on page 382 of my old paper copy.  (The .pdf that was distributed until early last year didn't have the same page numbers.)
Quote:
GARTHWILSON wrote:
JMP (<addr>) is not a ZP instruction.
It is not any kind of 6502 absolute addressing either. JMP (addr) uses a special addressing mode that exists for this instruction only.
The 65816 adds JSR (<addr>,X) as well.

Most of the discussion of the JMP (<addr>) bug has been focused on using a table, which requires a jury-rigged operation on the NMOS anyway, unlike the CMOS.  Whether using a table or not, you shouldn't have to put assembler directives before the address to keep it from straddling a page boundary, or add another byte at the beginning of the next page.  It should just work, and on the CMOS, it does.

Quote:
Almost all of its additional instructions have very little importance.  They may make code slightly faster and smaller, but only on a tiny scale.  I estimate less than 2% smaller size and less than 1% faster speed.
Having the better instruction set makes it easier to program too.  I can't think of much the CMOS can do that the NMOS can't do at all, but it's a pain on the NMOS when for example you need to save X without disturbing A, since you can't do PHX and PLX, or zero a memory location without disturbing A, since there's no STZ, or need an indirect addressing mode without disturbing X or Y. Going further, the '816 further increases ease of programming beyond the 65c02.
Quote:
Of its instructions, I want only BIT #imm, DEA, and INA.  Even the useful BIT #imm can't set the N and V flags. :( Sometimes the 65c02 is even slower than the 6502. :( The major evil of the CMOS 6502 is the occupation of valuable opcodes by these unimportant instructions.  This halted the natural development of the 6502.  The 65816 and even the 4510 had to follow this heavy and bad inheritance. :( Instead of these "occupants", much more powerful instructions might have been placed: 16-bit arithmetic, POP XY, PUSH XY, work with two (or more) segment registers (which might give short and fast operations with a 20- or 24-bit address bus, and relocation of code), a 16-bit accumulator, maybe another 16-bit accumulator, etc.
The '816 has most of these, plus instructions and addressing modes that are totally impractical to do on the '02 at all.  (Keeping with the topic title), PEA for example is a three-byte instruction that pushes a two-byte literal (its operand), which is typically an address but it can also be data, onto the stack, without affecting the processor registers.  One use of it is to pass data to a subroutine.  For a 6502 to synthesize it requires six instructions, and more if you need to save A.  It's a similar story for PEI.  PER requires 18 6502 instructions to synthesize (and more if you need to save A), 11 of those being in a subroutine.  BRL (Branch Relative Long) and a four-byte (two-instruction) BSR (Branch to SubRoutine, or Branch, Saving Return address) with a 16-bit relative address are valuable, especially in relocatable code.  So are the extra stack-addressing modes and the 16-bit stack pointer permitting much heavier use of the hardware stack for passing lots of parameters, as in C or recursive functions that may run the '02 out of stack space.  Many features of the '816 make it far more suitable for multitasking.
Quote:
I also have to note that we have tens of thousands (or even more) of ML programs for the NMOS 6502.
If everything has to be pulled down to the weakest version, what's the sense in making any improvements??  If you're writing software for something like the venerable C64, using C64 kernel entry points and hardware, then by all means, avoid the extra CMOS instructions and addressing modes.  It makes total sense.  There never was a CMOS 6510 anyway.  But for new builds, there's no sense in using the NMOS, and I will never go back to it.
BigEd wrote:
Just for historical interest, I had a look to see what's the earliest mention of this bug, and the earliest I found is a note by Heinz J Schilling in 6502 User Notes newsletter, issue 15, June 1979.
Thanks, Ed.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
KC9UDX
Posts: 246
Joined: 07 Dec 2013
Location: The Kettle Moraine

Post by KC9UDX »

For some time I've thought about a 65c02 upgrade board for the C64.

Someday, I will. It will break a lot of software, but for my purposes, I won't be too concerned.

If I get time this winter, I'd like to put a 65c02 in a PET and see what the consequences are. Unfortunately most of the PET software I have already doesn't work, so I might not find out.

I did have possession of a couple Apple //es that were dealer "enhanced." I don't have any recollection who the dealer was, but apparently they had forged ROMs. Visually they looked just like Apple ROMs should look, but they must have relied on the 65c02. Several times over the years I put known good NMOS 6502s in them and they both failed to boot. Unfortunately I think I sold both of those machines in recent years.
rwiker
Posts: 294
Joined: 03 Mar 2011

Post by rwiker »

KC9UDX wrote:
I did have possession of a couple Apple //es that were dealer "enhanced." I don't have any recollection who the dealer was, but apparently they had forged ROMs. Visually they looked just like Apple ROMs should look, but they must have relied on the 65c02. Several times over the years I put known good NMOS 6502s in them and they both failed to boot. Unfortunately I think I sold both of those machines in recent years.
The Apple IIe enhanced is *exactly* an Apple IIe where the processor has been upgraded to a 65c02, and the ROMs rewritten to use the new opcodes (and in the process, freeing up enough space to add functionality - notably, the mini assembler).

See https://en.wikipedia.org/wiki/Apple_IIe#Enhanced_IIe.
KC9UDX
Posts: 246
Joined: 07 Dec 2013
Location: The Kettle Moraine

Post by KC9UDX »

rwiker wrote:
KC9UDX wrote:
I did have possession of a couple Apple //es that were dealer "enhanced." I don't have any recollection who the dealer was, but apparently they had forged ROMs. Visually they looked just like Apple ROMs should look, but they must have relied on the 65c02. Several times over the years I put known good NMOS 6502s in them and they both failed to boot. Unfortunately I think I sold both of those machines in recent years.
The Apple IIe enhanced is *exactly* an Apple IIe where the processor has been upgraded to a 65c02, and the ROMs rewritten to use the new opcodes (and in the process, freeing up enough space to add functionality - notably, the mini assembler).

See https://en.wikipedia.org/wiki/Apple_IIe#Enhanced_IIe.
That's what I always believed until a few years ago.

If you have an Enhanced //e, you can take the 65c02 out and put in a 6502, and it will run all day long. Except for the two that I had, and maybe others, but I've not heard of anyone else with this situation.

At least this is what has been reported to me by other owners who were perplexed when I told them that the Enhanced ROM requires a 65c02.
litwr
Posts: 188
Joined: 09 Jul 2016

Post by litwr »

GARTHWILSON wrote:
It can be used with the index registers in 8-bit mode as well. The Eyes & Liechty programming manual specifically shows it in both 8- and 16-bit mode, on page 382 of my old paper copy. (The .pdf that was distributed until early last year didn't have the same page numbers.)
How can the code below be used with an 8-bit XR?

Code: Select all

         lda #index            ;zero-based routine index
         asl a                 ;double it
         tax                   ;now absolute index
         jmp (table,x)         ;goto routine
It can work only with 128 entries instead of 256...
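The 128-entry limit with an 8-bit index register can be checked with a small Python model of the LDA / ASL A / TAX sequence. This is a sketch only; the function name `table_offset` is made up for the illustration, and the model assumes 8-bit A and X as on the 65C02 (or the 65816 with 8-bit registers).

```python
# Hypothetical model of: LDA #index / ASL A / TAX with 8-bit registers.

def table_offset(index):
    """Return (X, carry) after doubling an 8-bit routine index."""
    a = index & 0xFF
    carry = (a >> 7) & 1   # ASL shifts bit 7 out into the carry flag
    x = (a << 1) & 0xFF    # only the low 8 bits reach X
    return x, carry


for index in (0, 1, 127, 128, 255):
    x, carry = table_offset(index)
    print(f"index {index:3d} -> X = {x:3d}, carry = {carry}")
```

Indexes 0 to 127 map to the even offsets 0 to 254, so a 256-byte table of 128 two-byte addresses is fully reachable. From index 128 upward the doubled value no longer fits in 8 bits and wraps back to the start of the table.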
BigEd wrote:
I see that it's inconvenient to have to account for the difference between the 02 and the C02. But, I think it may still be possible to use a 257 byte table which will suit both CPUs?
That is exactly the current code for the 6502. One man spent hours trying to find out why this program, which works on the C64, does not work with the SuperCPU (65816) - http://www.lemon64.com/forum/viewtopic. ... art=19#top. So I had to add one byte for this 65C02 "feature". Of course,

Code: Select all

        ldx divisor
        jmp (divjmp,X)
is better than the code for the NMOS 6502. It is 3 bytes shorter and 3 cycles faster. This discussion helped me to realize that. 8) So this is what I should use when preparing the specialized 65C02 version for the BBC Micro. It is the only advantage of the 65C02 that is usable in the spigot, but the byte division is not exactly in the main loop, so the speed advantage will be less than about 0.1%.
GARTHWILSON wrote:
The '816 has most of these, plus instructions and addressing modes that are totally impractical to do on the '02 at all. (Keeping with the topic title), PEA for example is a three-byte instruction that pushes a two-byte literal (its operand), which is typically an address but it can also be data, onto the stack, without affecting the processor registers. One use of it is to pass data to a subroutine. For a 6502 to synthesize it requires six instructions, and more if you need to save A. It's a similar story for PEI. PER requires 18 6502 instructions to synthesize (and more if you need to save A), 11 of those being in a subroutine. BRL (Branch Relative Long) and a four-byte (two-instruction) BSR (Branch to SubRoutine, or Branch, Saving Return address) with a 16-bit relative address are valuable, especially in relocatable code. So are the extra stack-addressing modes and the 16-bit stack pointer permitting much heavier use of the hardware stack for passing lots of parameters, as in C or recursive functions that may run the '02 out of stack space. Many features of the '816 make it far more suitable for multitasking.
You wrote about the 65816 instructions. They are good and powerful indeed. I just think that they could be better and faster. I would bet on segment registers, for example. I would also bet on the Z register of the 4510 - it is much better than the plain (zp) mode.
GARTHWILSON wrote:
BigEd wrote:
Just for historical interest, I had a look to see what's the earliest mention of this bug, and the earliest I found is a note by Heinz J Schilling in 6502 User Notes newsletter, issue 15, June 1979.
Thanks, Ed.
6502 development was beheaded so without "political cover" it was easily influenced by men who did not think primarily about this development but might have other aims. :(
kakemoms
Posts: 349
Joined: 02 Mar 2016

Post by kakemoms »

litwr wrote:
IMHO an LSR4 instruction would be worth mentioning too. The 6502 requires 4 LSRs to get the higher nibble. The lower nibble can be obtained with AND #15. So what's the purpose of SWN (swap nibbles)? It would be better to have the Z80's RLD, which allows a fast 4-bit shift of a sequence of bytes.
I just use tables for whatever function is missing. You can often combine the lookup with some operation (ORA, AND) to shave off cycles.

For example 4bit*4bit math (x*y):

Code: Select all

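TXA                  ;A = x
AND #$0F             ;discard the upper 4 bits of x
ORA ShiftLtoH,Y      ;merge in y shifted into the upper nibble
TAX                  ;X = combined table index
LDA MultTable,X      ;A = x*y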
14 cycles. Two 256 byte tables. And it disregards the upper 4 bits (if they are present).

Without the shift table:

Code: Select all

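TXA                  ;A = x
ASL                  ;four shifts move x into the upper nibble
ASL
ASL
ASL
STA $zp1             ;save shifted x in a zero-page temporary
TYA                  ;A = y
AND #$0F             ;discard the upper 4 bits of y
ORA $zp1             ;combine the two nibbles
TAX                  ;X = combined table index
LDA MultTable,X      ;A = x*y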
26 cycles. One 256 byte table. If you had a swap function you could shave off 6 cycles, but it would still be faster with the extra table.
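To make the table layout concrete, here is a Python sketch that builds both tables and mirrors the two routines above. The table contents are an assumption inferred from the code (one operand in each nibble of the index), and the Python names `shift_l_to_h`, `mult_table`, `mul4_two_tables`, and `mul4_one_table` are invented for the example.

```python
# Assumed table contents, inferred from the assembly above.
shift_l_to_h = [(y << 4) & 0xF0 for y in range(256)]      # low nibble moved to high
mult_table = [(i >> 4) * (i & 0x0F) for i in range(256)]  # product of the two nibbles


def mul4_two_tables(x, y):
    """Model of the 14-cycle routine that uses the shift table."""
    a = x & 0x0F            # TXA / AND #$0F
    a |= shift_l_to_h[y]    # ORA ShiftLtoH,Y
    return mult_table[a]    # TAX / LDA MultTable,X


def mul4_one_table(x, y):
    """Model of the 26-cycle routine that shifts in code."""
    zp1 = (x << 4) & 0xFF   # TXA / ASL x4 / STA zp1
    a = (y & 0x0F) | zp1    # TYA / AND #$0F / ORA zp1
    return mult_table[a]    # TAX / LDA MultTable,X


print(mul4_two_tables(7, 9), mul4_one_table(7, 9))
```

Because multiplication is commutative, it does not matter that the two routines place x and y in opposite nibbles of the index; the same product table serves both.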
White Flame
Posts: 704
Joined: 24 Jul 2012

Post by White Flame »

BigEd wrote:
I see that it's inconvenient to have to account for the difference between the 02 and the C02. But, I think it may still be possible to use a 257 byte table which will suit both CPUs?
Depending on how crucial speed is, I would put the compatibility into the code, not into the data structure. If the index is always odd, then a simple DEX brings it back into 0-254 range with even numbers and there's no page overflow.
litwr
Posts: 188
Joined: 09 Jul 2016

Post by litwr »

Klaus2m5 wrote:
However, it is bold to tie the assumption of a bug in the 65c02 to the very special case that you are talking about. It is like calling the missing undocumented opcodes in the 65C02 a bug. I am sure that there are many more coders who have been bitten by the lack of the carry into the upper byte of the indirect address than there are coders facing the same problem as you.
Thanks again for working with my code. :) However, I don't see any "very special case" here - it is just ordinary code, natural for this situation. I can even assume that there are no coders at all who were "bitten by the lack of the carry into the upper byte of the indirect address". Could you show any practical example which shows the situation in reverse? IMHO the problem of JMP (xxFF) is completely contrived and artificial.
GARTHWILSON wrote:
If everything has to be pulled down to the weakest version, what's the sense in making any improvements?? If you're writing software for something like the venerable C64, using C64 kernel entry points and hardware, then by all means, avoid the extra CMOS instructions and addressing modes. It makes total sense. There never was a CMOS 6510 anyway. But for new builds, there's no sense in using the NMOS, and I will never go back to it.
My point is in fact that the writers of this enormous amount of software never complained about JMP (xxFF). My other point is that the advance from NMOS 6502 to CMOS was very small, and it even includes several backward steps (the creation of the JMP (xxFF) incompatibility, the occupation of valuable opcode space by the unimportant instructions, the slowdown of the redundant BCD mode, ...), so it can't be called an advance at all.
kakemoms wrote:
26 cycles. One 256 byte table. If you had a swap function you could shave off 6 cycles, but it would still be faster with the extra table.
I'd only written that LSR4 could almost always be used instead of SWN. SWN might be easier to implement, though...
White Flame wrote:
Depending on how crucial speed is, I would put the compatibility into the code, not into the data structure. If the index is always odd, then a simple DEX brings it back into 0-254 range with even numbers and there's no page overflow.
The fastest speed for different platforms is the aim of the spigot project. :D
KC9UDX
Posts: 246
Joined: 07 Dec 2013
Location: The Kettle Moraine

Post by KC9UDX »

Nobody ever complained, because everyone was aware of the behaviour, and acted accordingly.
BigDumbDinosaur
Posts: 9427
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Post by BigDumbDinosaur »

litwr wrote:
IMHO the problem of JMP (xxFF) is completely contrived and artificial.
I was aware of the JMP ($xxFF) bug nearly 40 years ago and, in fact, tripped over it back then. It is a real bug and everyone who has professionally developed for the 6502 (not the CMOS hardware) knows it's a real bug. Using Commodore as an example, the kernel and BASIC indirect vectors are in page $03. CBM page-aligned those vectors precisely because of JMP ($xxFF). They knew it was a bug back when the paint on the PET 2001 was still drying.

Methinks you are beating a dead horse. :)
Quote:
GARTHWILSON wrote:
If everything has to be pulled down to the weakest version, what's the sense in making any improvements?? If you're writing software for something like the venerable C64, using C64 kernel entry points and hardware, then by all means, avoid the extra CMOS instructions and addressing modes. It makes total sense. There never was a CMOS 6510 anyway. But for new builds, there's no sense in using the NMOS, and I will never go back to it.
My point is in fact that the writers of this enormous amount of software never complained about JMP (xxFF). My other point is that the advance from NMOS 6502 to CMOS was very small, and it even includes several backward steps (the creation of the JMP (xxFF) incompatibility, the occupation of valuable opcode space by the unimportant instructions, the slowdown of the redundant BCD mode, ...), so it can't be called an advance at all.
I'm sure Apple didn't agree with you about the 65C02 when they started using it in place of the NMOS part. In fact, they disagreed even more when the 65C816 found its way into the Apple ][gs.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California
Post by GARTHWILSON »

litwr wrote:
My point is in fact that the writers of this enormous amount of software never complained about JMP (xxFF).
On the contrary, it was a huge troubleshooting problem to a few early on, before it was documented.  After it was documented, people knew to take measures to keep the indirect address from straddling the page boundaries.  The CMOS version fixed all the NMOS bugs.
Quote:
My other point is that the advance from NMOS 6502 to CMOS was very small, and it even includes several backward steps (the creation of the JMP (xxFF)
Just stop, please.  It has been made clear by several knowledgeable people here that NMOS had a big bug in JMP (xxFF).  It does it wrong!
Quote:
the occupation of valuable opcode space by the unimportant instructions,
Even if they were unimportant (which I totally disagree with), what does the taking of "valuable" op code space matter?  The op code table was nowhere near full in the CMOS either.
Quote:
the slowdown of the redundant BCD mode,
again to fix an NMOS bug, which was that its flags were not valid after a decimal-mode operation!
Dr Jefyll
Posts: 3526
Joined: 11 Dec 2009
Location: Ontario, Canada
Post by Dr Jefyll »

litwr wrote:
Could you show any practical example which shows the situation in reverse?
Fair enough. First let's notice that in your application the 16-bit destination values accessed by JMP (abs) are stored together, adjacent to one another in a table. But in other circumstances a programmer may wish to store the destination values separately from one another, with other fields (of varying length) in between. IOW you might have a destination value, then one or more unrelated fields of arbitrary length, then another destination value, then some other unrelated fields, and so on. It's not uncommon. The dictionary used by Forth is an example of this.

When creating a mixed data structure like this it's desirable to freely allocate space exactly as needed. But if you allocate space exactly as needed then occasionally a destination value will straddle a page boundary. With NMOS 6502 this is a dangerous anomaly.

In the bad old days folks were forced to check for this condition every time before allocating space for a destination value. If the next available space happened to be at $xxFF then remedial action was required, such as sticking an extra, unused byte into the structure -- ie, wasting the byte at $xxFF -- to ensure proper alignment. Does that sound messy? It was!
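The check described above might look like the following Python sketch of an allocator for a mixed data structure. It is a minimal illustration, not any particular Forth's implementation; the name `allocate_vector` and the example addresses are hypothetical.

```python
# Hypothetical allocator step for a 2-byte destination value on an NMOS 6502.

def allocate_vector(next_free):
    """Return (vector_address, new_next_free).

    If the next free byte is at $xxFF, waste it so that the two-byte
    destination value does not straddle a page boundary.
    """
    if next_free & 0xFF == 0xFF:
        next_free += 1  # skip the byte at $xxFF
    return next_free, next_free + 2


addr, next_free = allocate_vector(0x20FF)
print(f"vector at ${addr:04X}, next free byte at ${next_free:04X}")
```

On a 65C02 the test and the wasted byte are unnecessary, since the vector can sit at any address.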

Now we have the 'C02, and no anomalies. JMP (abs) just works. :mrgreen:
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html