Quote:
BLT - Branch Less Than (C=0 or Z=1)
BGT - Branch Greater Than (C=1 and Z=0)
Signed or unsigned comparison? As things stand now signed comparisons are always a tricky thing on the 6502. You have to use that mysterious V-flag if you want to do signed. It's difficult enough to grasp what's going on that a lot of people stay away from it. Perhaps explicitly signed comparison, subtraction and addition instruction branch instrctions would be helpful as well.
Quote:
RDL - Rotate Direct Left (rotate without carry, i.e. bit 7 is directly moved to bit 0)
RDR - Rotate Direct Right (rotate without carry)
ASR - arithmetic shift right (shift in the sign from the left)
I like these. They can be synthesized already by interspersing CMP immediates among ROL/ROR sequences, but they're occasionally nice to have.
Quote:
SWP - swap upper and lower nibble
This would be extremely handy. This might be the best addition of them all.
Quote:
INV - two's complement
I dunno. It's easy enough already to do "EOR #$FF". I don't know that I want to do this often enough to make it worthwhile as a separate instruction. Is there some address mode other than accumulator where this is useful?
Quote:
BCN - bit count, compute number of 1-bits
This would take a (very) short routine to duplicate, and the answer would come in log(N) time. Or a big lookup table and constant time. Is there enough need to make it a separate instruction? I can see that it would be handy if you wanted to do parity checks, but beyond that I'm stumped. If parity is all it's good for, why not just a parity status flag?
Quote:
SXY - swap X and Y
Mmm. Maybe. All I can think of is setting up registers prior to subroutine (or OS) calls, but it might be handy at that.
Quote:
PSH - push all registers (in a 6502 that would be AC, XR, YR, more in a 65002)
PLL - pull all registers (in a 6502 that would be AC, XR, YR, more in a 65002)
So the other day I was thinking "What if each register had its own dedicated on-chip stack space?" Ie, N registers, N stacks And then I thought "What would that possibly be good for?" And then I thought "Interrupts!". I was thinking the processor would do it automatically upon recognizing an interrupt, so an ISR could start out trashing any register it wanted. RTI would restore them all. If each stack was 8- or 16-deep, you could even deal with re-entrant interrupts. The stacks would not be user-accessible, just places to stash register values during interrupts.
But an instruction to do it would be just as useful. The idea of multiple dedicated stacks would of course be to minimize response time to an interrupt. I don't think these would be as useful for calling subroutines, mainly because of the difficulty of returning results in registers (if the callee did PSH at the start then PLL at the end would trash anything you put in them...unless maybe the caller did PSH? Then after the callee returns, save any results passed in registers before the caller does PLL...might work. Okay, I like it)
Quote:
FIL - fill a memory area with a byte value
Again a short subroutine. Is the speed advantage (one or two cycles per byte?) worth it?
Quote:
ADS - add value to stack pointer
SBS - substract value from stack pointer
Fun. I like these, although playing with stack pointer is not for novices.
Quote:
MVN/MVP - move a memory area (see 65816)
I never liked this instruction much. Kind of ugly. The coolest answer to this in the 6502 world I ever saw were the Commodore RAM expanders. That REC chip could do transfers and fills at a byte per clock cycle, and swap two bytes in two clock cycles.
Quote:
HBS - determine bit number of highest bit that is set (like log2)
HBC - determine bit number of highest bit that is clear
BSW - bit swap - exchange bit 7 with bit 0, bit 6 with bit 1, etc
How is the result returned, BTW? What if no bit is set? If bits are numbered 7..0 then zero is not an answer to that. I guess I'm having trouble seeing how useful they'd be, ie., what problem do they solve?