Review of 65C816
I am finishing up a 6502-based computer and am interested in maybe using the 65C816 for my next computer project. Can anyone who has used it tell me what they think of it? Is it a good processor?

- GARTHWILSON
- Forum Moderator
- Posts: 8775
- Joined: 30 Aug 2002
- Location: Southern California
There's quite a bit about the '816 here on the forum, but even finding one of my own posts I was looking for has turned out to be a challenge. (I never did find it.) Anyway, here are a few of the 65816's attractions. (This is definitely not a complete list.)
A, X, and Y can be switched in and out of 16-bit mode at any time. Certain things are a whole lot easier when you can handle 16 bits at a time. My '816 Forth runs 2-3 times as fast as my '02 Forth at the same clock speed, just because each primitive requires so few instructions to get the same job done.
16-bit stack pointer. The stack can occupy any part (or nearly all of) the first 64K of the memory map.
"Zero page" is now "direct page" (DP) because it can be put anywhere in the first 64K of the memory map, instead of being confined to 0000-00FF. It can even straddle page boundaries. In the case of multitasking, each task can have its own zero-page-like access that won't interfere with other tasks'.
It has quite a few more instructions and addressing modes. Block moves can be accomplished without a loop if you set up A, X, and Y and then use MVN or MVP which can move up to 64K at a time with the one instruction, taking 7 clocks per byte. (This can be used to fill blocks of memory too, and interrupts are not forced to wait until the completion of the move or fill.)
For more, especially the things that make the '816 better for multitasking, see my post that starts about 2/3 of the way down the page at http://www.6502.org/forum/viewtopic.php?t=50 .
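As a rough illustration of the points above, here is a minimal sketch in generic 65816 assembly. The addresses, banks, and byte count are made up for the example, and the syntax is an assumption: assemblers differ in how they are told about register widths and in the operand order they accept for MVN.

```asm
        CLC
        XCE                 ; e=0: switch from emulation to native mode
        REP #$30            ; clear m and x: A, X, and Y are now 16 bits wide

        LDA #$2000          ; 16-bit immediate load
        TCD                 ; direct page now starts at $00:2000
        LDA $10             ; direct-page read from $00:2010-2011

        LDX #$4000          ; source offset within the source bank
        LDY #$8000          ; destination offset within the destination bank
        LDA #$00FF          ; byte count minus one: move $0100 bytes
        MVN $01,$02         ; copy bank $01 to bank $02 (assumed order: source, destination)

        SEP #$20            ; set m: accumulator back to 8 bits, X and Y stay 16 bits
        LDA #$12            ; 8-bit immediate; the assembler must know about the width change
```

At 7 clocks per byte, the $0100-byte move above would take about 1792 cycles. Note that MVN also leaves the data bank register set to the destination bank.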
The 65816 is okay. The (NMOS) 6502 and 65C02 are better, cleaner designs, though. Among the positives are the additional features, like the ability to work with 16 bits at a time, new addressing modes (like stack relative), new instructions (like BRL and PER), and the ability to locate the direct page anywhere in bank zero.
Among the negatives are the caveats (as the 65816 datasheet calls them), of which there are several. Most are the result of using new instructions and addressing modes in emulation mode (note that it doesn't emulate either the 6502 or the 65C02 exactly, even when confined only to instructions and addressing modes of the 6502 and 65C02), but even in native mode, there are things like the program counter wrapping on a bank boundary. So, for some things it acts like there's a 24-bit address space, and for others it acts like there's a banked 16-bit address space. From a hardware perspective, multiplexing the bank address on the data bus has had its disadvantages.
dclxvi wrote:
The 65816 is okay.
Quote:
Among the negatives are the caveats (as the 65816 datasheet calls them) of which there are several.
Quote:
Most are the result of using new instructions and addressing modes in emulation mode (note that it doesn't emulate either the 6502 or the 65C02 exactly, even when confined only to instructions and addressing modes of the 6502 and 65C02)
Quote:
there are things like the program counter wrapping on a bank boundary. So, for some things it acts like there's a 24-bit address space, and for others it acts like there's a banked 16-bit address space.
If you need more than 64K of code in a single program image, you're almost certainly doing something wrong. However, it would not be hard for a compiler to conveniently introduce long jumps to the next bank whenever it approached the end of a code bank. Assuming the best-case scenario, 32764 2-byte opcodes taking 65528 clock cycles, the 6-cycle long jump is not going to introduce any noticeable delay. Also, compilers should be smart enough not to span loops across bank boundaries.
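A minimal sketch of what such a compiler-inserted jump might look like, in generic 65816 assembly. The label bank2_entry is hypothetical and is assumed to sit at the start of the code in the following bank; some assemblers spell the long jump differently.

```asm
        ; ... last instructions of the code that lives in bank $01 ...
        ; Without the jump below, execution running past $01:FFFF
        ; would continue at $01:0000, because only PC wraps and PBR is unchanged.
        JML bank2_entry     ; absolute long jump: loads PBR and PC together
```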
The reason these design decisions were made is simply this: it required the absolute minimal changes to the 6502 core. To achieve the absolute maximum return on investment, you need to minimize your spending, and the best way to do that is to reuse your existing infrastructure as much as possible.
Be prepared for more "wonkiness" with the Terbium. The more I think about how to achieve Terbium's stated goals, the more it looks like my predictions are accurate, such as the use of a 16-bit byte. This is another approach towards expanding the 6502 capabilities with absolute minimum changes to the core. Whenever I try to explain the concept of widening the byte from 8 bits to 16 bits, people's eyes glaze over. This is a pity, because a byte is formally defined simply as the smallest *addressable* unit of memory, and has nothing to do intrinsically with its inherent width.
Quote:
From a hardware perspective, multiplexing the bank address on the data bus has had its disadvantages.
However, nothing says you HAVE to use these features at all. My current Kestrel design, Kestrel 1, ignores the bank address byte outright, effectively using the 65816 processor as a glorified 65802. I do, though, look forward to exploiting the presence of the bank address byte on the data bus in my next Kestrel design.
Memblers wrote:
I think it's great, even if you treat it like a normal 6502 with 16-bit index regs. The only tricky part for me was switching often between 8 and 16-bit accumulator. With more flexibility comes more ways to make mistakes. 
(And then, after that, there was the minor issue of the assembler not reporting an error for an unsupported opcode form which, though documented as a valid alias, the assembler still didn't recognize. In particular, I'm talking about TAS versus TCS. TAS did nothing, and even failed to emit an opcode byte. TCS worked. Go figure.)
kc5tja wrote:
Okay? Is that it?
kc5tja wrote:
I haven't seen any so-called caveats in any of my datasheets.
http://www.6502.org/documents/datasheet ... b_2004.pdf
and:
http://www.westerndesigncenter.com/wdc/ ... 5c816s.pdf
(Actually, those are the 65802 caveats, but they also pertain to the 65816.) This leads to today's 65816 trivia question:
If e=1 and S=$00 (i.e. $0100) when (i.e. before) the sequence PHB PLB is executed, what is the (24-bit) address that PHB writes to and what is the address that PLB reads from? Bonus points if you can find where WDC's documentation tips you off about this.
kc5tja wrote:
As far as I can recall, the 65816 emulates a 65C02 exactly when in emulation mode, judging by what is in my 65816 book here.
SED
SEC
LDA #$20
SBC #$0F
kc5tja wrote:
Actually, it always acts like there is a banked 16-bit address space. Always. The 24-bit long-address instructions are included as a programmer convenience, and really are the only instructions that can linearly address all 16MB.
kc5tja wrote:
If you need more than 64K of code in a single program image, you're almost certainly doing something wrong.
kc5tja wrote:
However, it would not be hard for a compiler to conveniently introduce long jumps to the next bank whenever it approached the end of a code bank.
Is there really a benefit to having the program counter wrap on a bank boundary? Sure, it saves a little silicon because the 65816 does not have to increment (or, when branching backward, decrement) K (the PBR), but to me this is just creating an idiosyncrasy that you have to program around.
kc5tja wrote:
Not disadvantages, but challenges. I agree that it requires more parts if you want to exploit this feature. If you meet the challenges, there is no real difference in operation between a 6502 and a 65816. The bus tenures are exactly the same.
All in all, I think the 65816's advantages outweigh the disadvantages (yes, I actually like the thing), but I was anticipating gushing responses to the original post, and I thought I'd play devil's advocate a bit in the interest of balance.