My new verilog 65C02 core.
Re: My new verilog 65C02 core.
Good find, and accurate debugging. Latest github has a fix.
The same bug may be present in other instructions as well. I'll have to check. The problem was that the PC wasn't updated in the last cycle of the JMP (IND). This is only an issue when the instruction is directly followed by an interrupt, so it doesn't show up in the test suite.
Edit: I've checked the microcode, and for every instruction, the last cycle loads the PC, so should be good now.
Re: My new verilog 65C02 core.
Arlet wrote:
Good find, and accurate debugging. Latest github has a fix.
It runs 14.5% faster just through improved CPI. But for some reason the system is hanging when I run the Dormann tests.
I'll see if I can debug this now.
Dave
Re: My new verilog 65C02 core.
hoglet wrote:
But for some reason the system is hanging when I run the DORMANN tests.
The trace is a bit strange. Ignore the addresses, as they are bogus. They are inferred by my 6502Decoder tool, and it's getting a bit confused at the different data bus cycle order from JSR and RTS.
Here's a fragment of code that works:
Code: Select all
C21C : A5 53 : LDA 53 : 3
C21E : 79 04 31 : ADC 3104,Y : 5
C221 : 08 : PHP : 3
C222 : C5 55 : CMP 55 : 3
C224 : F0 03 : BEQ C229 : 2
C226 : 68 : PLA : 3
C227 : 29 01 : AND #01 : 2
C229 : C5 56 : CMP 56 : 3
C22B : F0 03 : BEQ C230 : 2
C22D : 28 : PLP : 3
C22E : 08 : PHP : 3
And here's the same code when it fails:
Code: Select all
C21C : A5 53 : LDA 53 : 3
C21E : 79 04 31 : ADC 3104,Y : 5
C221 : 08 : PHP : 3
C222 : C5 55 : CMP 55 : 3
C224 : F0 03 : BEQ C229 : 2
C226 : 68 : PLA : 3
C227 : 29 01 : AND #01 : 2
C229 : C5 56 : CMP 56 : 2
C22B : 56 56 : LSR 56,X : 2
C22D : 56 56 : LSR 56,X : 2
C22F : 56 56 : LSR 56,X : 2
C231 : 56 56 : LSR 56,X : 2
For some reason the CMP 56 is only taking 2 cycles, whereas previously it took 3.
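For anyone wanting to spot this kind of divergence automatically, here is a minimal sketch in Python. It assumes the decoded trace lines have exactly the "address : bytes : disassembly : cycles" shape shown above; the function names `parse_trace` and `first_divergence` are made up for illustration and are not part of the 6502Decoder tool.

```python
# Decoded trace fragments copied from the two listings above.
GOOD = """\
C21C : A5 53 : LDA 53 : 3
C21E : 79 04 31 : ADC 3104,Y : 5
C221 : 08 : PHP : 3
C222 : C5 55 : CMP 55 : 3
C224 : F0 03 : BEQ C229 : 2
C226 : 68 : PLA : 3
C227 : 29 01 : AND #01 : 2
C229 : C5 56 : CMP 56 : 3
C22B : F0 03 : BEQ C230 : 2
"""

BAD = """\
C21C : A5 53 : LDA 53 : 3
C21E : 79 04 31 : ADC 3104,Y : 5
C221 : 08 : PHP : 3
C222 : C5 55 : CMP 55 : 3
C224 : F0 03 : BEQ C229 : 2
C226 : 68 : PLA : 3
C227 : 29 01 : AND #01 : 2
C229 : C5 56 : CMP 56 : 2
C22B : 56 56 : LSR 56,X : 2
"""


def parse_trace(text):
    """Turn 'addr : bytes : asm : cycles' lines into tuples; skip other lines."""
    rows = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.strip().split(" : ")]
        if len(parts) != 4 or not parts[3].isdigit():
            continue  # not a decoded instruction line
        addr, raw, asm, cycles = parts
        rows.append((addr, raw, asm, int(cycles)))
    return rows


def first_divergence(good, bad):
    """Return the first pair of instructions that differ, or None."""
    for g, b in zip(parse_trace(good), parse_trace(bad)):
        if g != b:
            return g, b
    return None


if __name__ == "__main__":
    print(first_divergence(GOOD, BAD))
```

On these two fragments the first mismatch is the CMP 56 at C229, where only the cycle count differs; everything after that is the decoder losing sync.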
Here's the data bus cycles:
Code: Select all
0 68 1 1 1
1 29 0 1 1
2 f9 0 1 1
C226 : 68 : PLA : 3
0 29 1 1 1
1 01 0 1 1
C227 : 29 01 : AND #01 : 2
0 c5 1 1 1
1 56 0 1 1
C229 : C5 56 : CMP 56 : 2
0 56 1 0 1
1 56 0 1 1
C22B : 56 56 : LSR 56,X : 2
0 56 1 0 1
1 56 0 1 1
C22D : 56 56 : LSR 56,X : 2
0 56 1 0 1
1 56 0 1 1
C22F : 56 56 : LSR 56,X : 2
Edit: Looking at the microcode, it does look like the IRQ handler is missing from the upper bank (0x180-0x1FF). I think it should be there at 0x1E0, but that jumps to the NMI handler.
Dave
Re: My new verilog 65C02 core.
Ok, I can reproduce it in simulation. Same code with CLD works fine, with SED it crashes.
Looks like a design flaw in the decimal/irq/nmi handling. I need to rethink that.
Re: My new verilog 65C02 core.
Ok, it wasn't so bad. It was basically the same bug I had before on my hardware. I had fixed it, but then reintroduced the same bug when I added support for NMI.
Luckily, it was just a matter of microcode updates. Fix in github.
Re: My new verilog 65C02 core.
Arlet wrote:
Luckily, it was just a matter of microcode updates. Fix in github.
Let me dig a bit further....
Re: My new verilog 65C02 core.
I just checked the V flag handling, and it looks correct to me.
Re: My new verilog 65C02 core.
Here's a trace of this going wrong.
It looks like CLV hasn't cleared the V flag, as evidenced by the BVS E310 being taken, then resulting in a "Bad Command" error.
But this would get picked up by the Dormann tests, so it must be more subtle!
Dave
Code: Select all
0 6a 1 1 1
E7BC : 6A : ROR A : 1
0 28 1 1 1
1 2a 0 1 1
2 b1 0 1 1
E7BD : 28 : PLP : 3
0 2a 1 1 1
E7BE : 2A : ROL A : 1
0 68 1 1 1
1 b8 0 1 1
2 8c 0 1 1
E7BF : 68 : PLA : 3
0 b8 1 1 1
E7C0 : B8 : CLV : 1
0 60 1 1 1
1 a0 0 1 1
2 73 0 1 1
3 e3 0 1 1
E7C1 : 60 : RTS : 4
0 70 1 1 1
1 9a 0 1 1
E374 : 70 9A : BVS E310 : 2
0 00 1 1 1
1 fe 0 1 1
2 e3 0 0 1
3 12 0 0 1
4 f0 0 0 1
5 1c 0 1 1
6 dc 0 1 1
pc: prediction failed at E310 old pc was 73A3
E310 : 00 FE : BRK #FE : 7
Re: My new verilog 65C02 core.
I can reproduce it in sims, but only with the PLA right before the CLV.
Re: My new verilog 65C02 core.
Arlet wrote:
I can reproduce it in sims, but only with the PLA right before the CLV.
Code: Select all
>> d e7bd
E7BD : 28 : PLP
E7BE : 2A : ROL A
E7BF : 68 : PLA
E7C0 : B8 : CLV
E7C1 : 60 : RTS
Re: My new verilog 65C02 core.
Ah, found the problem. In the ALU, I reconstruct the internal carry from bit 7 by XOR of both inputs and the output.
Code: Select all
wire BC7 = adder[7] ^ R[7] ^ M[7];
Where 'M' is the last value read from memory. The problem is that I changed the ALU a while ago to use a separate register 'BI' instead of 'M', in order to speed up the BCD logic. I had forgotten to change it in the carry logic as well. Normally, that's not a problem, because M and BI will be the same. The problem is that CLV is a single-cycle instruction, which means that, unlike every other instruction, it does not load the M/BI registers itself, but inherits them from the instruction before it. Edit: there are other single-cycle instructions, obviously, but none of them use the M/BI register, because there's no time to load them in a single cycle.
Normally, that wouldn't even be a problem, but the CLV uses a trick to load V with the ALU overflow output, instead of clearing it to zero.
Quite subtle indeed, and only a problem for the generic code. The Spartan-6 specific code just grabs the value from the internal carry chain instead.
Fix in github.
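To see why a stale 'M' corrupts V, here is a rough behavioural model in Python, not the actual Verilog. It assumes the adder computes R + BI + carry-in and that overflow is the XOR of the carry into bit 7 and the carry out of bit 7 (the usual two's complement rule); the thread does not show how the core wires this up, so treat the helper names and the `overflow` function as illustrative only.

```python
def bit7(x):
    """Return bit 7 of an 8-bit value."""
    return (x >> 7) & 1


def add8(r, bi, carry_in=0):
    """8-bit add; returns (adder result, carry out of bit 7)."""
    total = r + bi + carry_in
    return total & 0xFF, total >> 8


def carry_into_bit7(adder, r, operand):
    """Recover the carry into bit 7 by XOR, like the BC7 wire.

    Since adder[7] = r[7] ^ operand[7] ^ carry, XORing the three known bits
    gives the carry back, but only if 'operand' is what the adder really used.
    """
    return bit7(adder) ^ bit7(r) ^ bit7(operand)


def overflow(r, bi, m, carry_in=0):
    """V as seen by the flag logic when BC7 is computed from 'm' (assumed model)."""
    adder, c8 = add8(r, bi, carry_in)
    return carry_into_bit7(adder, r, m) ^ c8


if __name__ == "__main__":
    # M and BI agree: 0x01 + 0x01 has no signed overflow.
    print(overflow(0x01, 0x01, m=0x01))
    # Stale M left over from an earlier instruction, differing in bit 7.
    print(overflow(0x01, 0x01, m=0x80))
```

With M equal to BI the reconstructed carry is right and V comes out 0; with a leftover M whose bit 7 differs from BI, the reconstructed carry flips and V reads as set even though the addition did not overflow.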
Re: My new verilog 65C02 core.
Arlet wrote:
Fix in github.
Next, to try running at speed!
Re: My new verilog 65C02 core.
I loved playing Planetoid when I was a kid. I didn't have a Beeb myself, but every Saturday I would go to this computer store where they had a bunch of demo machines set up.
Re: My new verilog 65C02 core.
Arlet wrote:
I wrote a little tool to extract the source register used in the first cycle, and show the results in a 16x16 opcode grid.
Code: Select all
S X - - Z Z Z - S - A - - - - -
- Z Z - Z X X - - - A - - - - -
S X - - Z Z Z - S - A - - - - -
- Z Z - X X X - - - A - - - - -
S X - - - Z Z - S - A - - - - -
- Z Z - - X X - - - S - - - - -
S X - - Z Z Z - S - A - - - - -
- Z Z - X X X - - - S - - - - -
- X - - Z Z Z - Y - X - - - - -
- Z Z - X X Y - Y - X - - - - -
- X - - Z Z Z - A - - - - - - -
- Z Z - X X Y - - - - - - - - -
- X - - Z Z Z - Y - - - - - - -
- Z Z - - X X - - - S - - - - -
- X - - Z Z Z - - - - - - - - -
- Z Z - - X X - - - S - - - - -
Code: Select all
case( opcode )
8'h00: out = S;
8'h01: out = X;
8'h04: out = Z;
8'h05: out = Z;
8'h06: out = Z;
8'h08: out = S;
8'h0a: out = A;
8'h11: out = Z;
8'h12: out = Z;
.. etcetera ..
Not only does it save a bunch of work, it should also be bug-free, and easy to modify if desired.
The crux is to identify all the "don't cares". You can't just take the current microcode table and convert it to distributed logic, and hope to get something compact. But it's possible to write a tool that looks at each line of microcode and determines whether the ALU result is used, or whether a register needs to be read.
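As an illustration of the last step only, a small Python sketch can turn the 16x16 grid above into case entries, skipping every "-" so the synthesizer is free to treat those opcodes as don't cares. This is not the actual tool; `grid_to_case` is a made-up name, and the sketch assumes the grid is laid out row-major with the row as the high nibble of the opcode, which matches the case fragment shown above.

```python
# The 16x16 source-register grid, copied from the listing above.
GRID = """\
S X - - Z Z Z - S - A - - - - -
- Z Z - Z X X - - - A - - - - -
S X - - Z Z Z - S - A - - - - -
- Z Z - X X X - - - A - - - - -
S X - - - Z Z - S - A - - - - -
- Z Z - - X X - - - S - - - - -
S X - - Z Z Z - S - A - - - - -
- Z Z - X X X - - - S - - - - -
- X - - Z Z Z - Y - X - - - - -
- Z Z - X X Y - Y - X - - - - -
- X - - Z Z Z - A - - - - - - -
- Z Z - X X Y - - - - - - - - -
- X - - Z Z Z - Y - - - - - - -
- Z Z - - X X - - - S - - - - -
- X - - Z Z Z - - - - - - - - -
- Z Z - - X X - - - S - - - - -
"""


def grid_to_case(grid):
    """Emit one Verilog case entry per opcode that actually reads a register."""
    lines = []
    for hi, row in enumerate(grid.strip().splitlines()):
        for lo, reg in enumerate(row.split()):
            if reg == "-":
                continue  # don't care: leave the opcode out of the case
            opcode = hi * 16 + lo
            lines.append(f"8'h{opcode:02x}: out = {reg};")
    return lines


if __name__ == "__main__":
    print("case( opcode )")
    for entry in grid_to_case(GRID):
        print("    " + entry)
    print("endcase")
```

The first entries it prints line up with the hand-listed fragment (8'h00 through 8'h12), so regenerating the logic after a microcode change is just a matter of rerunning the script.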