I started writing my own 6502 simulator for my new hardware project, and after looking at all those decimal ADC/SBC rants I decided to do it the right way.
Instead of going the "Intel" way (which is what most 6502 simulators do), I looked at the 6502 schematic and the MOS Technology "decimal" patent. There is no schematic available for the 65C02, but I think the whole thing, including the 6502/65C02 difference, can be explained quite easily and without any magic.
The key notes are:
1. Two circuits are involved in decimal arithmetic: a binary adder (the ALU) with BCD carry propagation and a separate decimal adjust circuit.
2. For the ALU, decimal mode is activated ONLY during ADC. During "decimal" SBC, the ALU operates exactly as in binary mode: it's a standard binary addition, only the second argument is bit-negated.
3. During ADC, the ALU performs binary addition with modified carry propagation. The carry from bit 3 to bit 4 is set if the result in bits 3..0 exceeds 9. This carry (the AC signal) is used as the bit 4 input carry and also controls the adjuster. The carry out of bit 7 is set if the result in bits 7..4 exceeds 9. This goes to the C flag (same in NMOS and CMOS). This carry value, although obtained by a slightly modified binary adder, is a proper decimal carry, assuming the source arguments were legal BCD. During SBC, the AC and C values are standard binary carries (= negated borrows).
4. The V flag in both NMOS and CMOS is set in the standard way, as an XOR between the carry from bit 6 to bit 7 and the C flag. Its value has nothing to do with the final BCD result, since it's NOT based on the BCD result: it's generated by the ALU.
5. The ALU output value (pure binary for SBC, tricky for ADC) goes to the correction circuitry, which is simply a pair of reduced binary adders, one for bits 3..0 and another for bits 7..4 (actually bits 0 and 4 pass through unchanged), with NO CARRY PROPAGATION between them (since the BCD carry was already accounted for by the ALU). The adders are activated by the AC (3..0) and C (7..4) signals from the ALU, respectively.
6. During decimal addition, the adder adds 6 to a nibble if the control carry input is 1 (there is a carry out of a digit generated in ALU).
7. During decimal SBC, the adder adds 10 to a nibble if the control carry input is 0 (there is a borrow).
8. Now the NMOS/CMOS difference: N and Z in NMOS are set based on uncorrected ALU output, while in CMOS version these flags are set based on the corrected value. C and V behave in the same way in both versions.
(I did not verify the V generation algorithm yet, but looking at the diagram I can't see any other way of setting V.)
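As a rough illustration of notes 1 to 8, here is a minimal C sketch. It is only a direct transcription of the notes above, not tested code, and it assumes legal BCD operands. The names (`bcd_out`, `adjust`, `finish`, `adc_decimal`, `sbc_decimal`) and the `cmos` switch are made up for this example, and V is left out because its generation is not verified yet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    uint8_t result;  /* corrected value written back to A */
    bool c, n, z;    /* flags; V is deliberately omitted (unverified) */
} bcd_out;

/* Note 5: two independent 4-bit adders, no carry between the nibbles.
   k is 6 for ADC (note 6) and 10 for SBC (note 7). */
static uint8_t adjust(uint8_t alu, bool fix_lo, bool fix_hi, uint8_t k)
{
    uint8_t lo = alu & 0x0F, hi = alu >> 4;
    if (fix_lo) lo = (lo + k) & 0x0F;
    if (fix_hi) hi = (hi + k) & 0x0F;
    return (uint8_t)((hi << 4) | lo);
}

/* Note 8: N and Z come from the raw ALU output on NMOS,
   from the corrected value on CMOS. */
static bcd_out finish(uint8_t alu, uint8_t fixed, bool c, bool cmos)
{
    uint8_t f = cmos ? fixed : alu;
    bcd_out o = { fixed, c, (f & 0x80) != 0, f == 0 };
    return o;
}

/* Note 3: binary add with decimal carries (AC and C set when a digit > 9). */
bcd_out adc_decimal(uint8_t a, uint8_t m, bool c_in, bool cmos)
{
    unsigned lo = (a & 0x0F) + (m & 0x0F) + c_in;
    bool ac = lo > 9;                        /* decimal carry into bit 4 */
    unsigned hi = (a >> 4) + (m >> 4) + ac;
    bool c = hi > 9;                         /* decimal carry, goes to C */
    uint8_t alu = (uint8_t)(((hi & 0x0F) << 4) | (lo & 0x0F));
    return finish(alu, adjust(alu, ac, c, 6), c, cmos);
}

/* Note 2: plain binary add of A and ~M; AC and C are binary carries. */
bcd_out sbc_decimal(uint8_t a, uint8_t m, bool c_in, bool cmos)
{
    uint8_t nm = (uint8_t)~m;
    unsigned lo = (a & 0x0F) + (nm & 0x0F) + c_in;
    unsigned sum = a + nm + c_in;
    bool ac = lo > 0x0F;                     /* carry out of bit 3 */
    bool c = sum > 0xFF;                     /* carry out of bit 7 */
    uint8_t alu = (uint8_t)sum;
    /* Add 10 to each nibble that produced a borrow (carry == 0). */
    return finish(alu, adjust(alu, !ac, !c, 10), c, cmos);
}
```

For example, under this model `adc_decimal(0x99, 0x01, false, false)` produces a raw ALU output of 0xAA and a corrected result of 0x00 with C set, so the NMOS variant reports N=1, Z=0 while the CMOS variant reports N=0, Z=1.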
I will post my C code for ADC and SBC after I test it thoroughly.
I know, I am re-inventing the wheel, but it's a new re-invention...