65ORG16.b Core

Topics relating to PALs, CPLDs, FPGAs, and other PLDs used for the support or creation of 65-family processors, both hardware and HDL.
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Post by ElEctric_EyE »

I redid the TAB, TAC, etc. opcodes so that they now all share the opcode space $0000_xxxx_1000_1011 (written in binary), with bits [11:8] matching the src & dst_reg bits in the 'transposing stores' opcodes.
Just need to adjust some macros, and then I can test & post.
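
As a rough illustration of that bit layout, here is a small Python sketch that pulls the src/dst field out of a 16-bit opcode. The split of bits [11:8] into a source pair [9:8] and a destination pair [11:10] is an assumption inferred from opcode values quoted later in this thread (TDC = $0B8B, ADCDopAi = $0369, ADCBopCi = $0969); it is not taken from cpu.v, and the function names are made up for the example.

```python
# Hypothetical field layout, inferred from opcode values quoted in this thread;
# not taken from cpu.v.
ACC_NAMES = "ABCD"

def decode_src_dst(opcode):
    """Split bits [11:8] of a 16-bit opcode into (source, destination) accumulators."""
    field = (opcode >> 8) & 0xF      # bits [11:8]
    src = field & 0x3                # assumed: bits [9:8] select the source Acc
    dst = (field >> 2) & 0x3         # assumed: bits [11:10] select the destination Acc
    return ACC_NAMES[src], ACC_NAMES[dst]

def is_acc_transfer(opcode):
    """True when the opcode matches the binary pattern 0000_xxxx_1000_1011."""
    return (opcode & 0xF0FF) == 0x008B

print(decode_src_dst(0x0B8B))    # TDC -> ('D', 'C')
print(is_acc_transfer(0x0B8B))   # True
```

Under this reading, the same two-bit pairs would describe both the Acc-to-Acc transfers and the 'transposing stores' ALU opcodes, which is presumably the point of lining the fields up.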
ElEctric_EyE

Post by ElEctric_EyE »

Using the previous program I posted, the .b core appears to be working correctly. Arlet, thank you for your guidance! I like the challenge of learning Verilog and appreciate your help.

I will do some more tests and can post a waveform tomorrow, captured from a widescreen display to get more data, from a different program. I will update my GitHub posting after I add more complete comments.

Max Speed is 95.1MHz!
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Post by BigEd »

Great stuff - I'm watching with interest!
ElEctric_EyE

Post by ElEctric_EyE »

Thanks! I am excited. Here is the latest update to cpu.v.

To summarize progress so far, these features are in addition to the original NMOS 6502:

3 Additional fully functional Accumulators with same addressing modes as the original A Acc
12 new opcodes for transferring data among all 4 Acc's
12 new opcodes for transferring data between the X and Y registers and the 3 new Acc's
6 new opcodes to PUSH/PULL 3 new Acc's to/from the 64K stack
??? new opcodes for all addressing modes of ADC,SBC,AND,OR,EOR on each new Acc with ability to store the result in any Acc
??? new opcodes for all addressing modes of ASL,ROL,LSR,ROR,BIT on each new Acc
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands

Post by Arlet »

ElEctric_EyE wrote:
??? new opcodes for all addressing modes of ADC,SBC,AND,OR,EOR on each new Acc with ability to store the result in any Acc
??? new opcodes for all addressing modes of ASL,ROL,LSR,ROR,BIT on each new Acc
How about extending the first mechanism to cover the shift/rotate instructions? So, for example, perform ASL A, and put the result in B.

I think it would actually simplify instruction decoding.
ElEctric_EyE

Post by ElEctric_EyE »

Sounds good to me. Still testing...

I found an error with the T[A..D]X & T[A..D]Y. Fixed that and updated.

I am noticing in simulation that during a transfer from the X register to an Acc, the IR sometimes lasts 2 cycles (_REG, DECODE) and other times 1 cycle (DECODE). Is this normal, depending on which opcodes come before it? The values being transferred are good; it's just that cycle discrepancy.
ElEctric_EyE

Post by ElEctric_EyE »

Added and tested TXY($00BB) and TYX($00AB) opcodes!
ElEctric_EyE

Post by ElEctric_EyE »

Proof is in the pudding. Here is the pudding!


START             LDA #$5A        ;00A9,005A
                  CLC             ;0018
                  ADC #$01        ;0069,0001
                  TAX             ;00AA
                  INX             ;00E8
                  TXB             ;018A
                  LDDi            ;03A9
                  .BYTE $00A5     ;00A5
                  TDC             ;0B8B
                  TBX             ;01AA
                  TCX             ;02AA
                  TDX             ;03AA
                  LDX #$05        ;00A2,0005
                  TXY             ;00BB
                  TXA             ;008A
                  TXB             ;018A
                  TXC             ;028A
                  TXD             ;038A
                  CLC             ;0018
                  ADCDi           ;0F69        Add op to D Acc, store in D Acc
                  .BYTE $0001     ;0001
                  ADCDopAi        ;0369        Add op to D Acc, store in A Acc
                  .BYTE $000A     ;000A
                  ADCAopBi        ;0469        Add op to A Acc, store in B Acc
                  .BYTE $0001     ;0001
                  ADCBopCi        ;0969        Add op to B Acc, store in C Acc
                  .BYTE $0002     ;0002
                  TCY             ;02A8
                  TYX             ;00AB
                  NOP
                  NOP
                  NOP
                  NOP
                  NOP
                  NOP
[Two images not preserved]
ElEctric_EyE

Post by ElEctric_EyE »

Arlet wrote:
...How about extending the first mechanism to cover the shift/rotate instructions? So, for example, perform ASL A, and put the result in B.

I think it would actually simplify instruction decoding.
I would like to first mod all the shifter opcodes to do 1-15 shifts, spec'd by the uppermost 4 bits of their opcodes.
BigEd pointed out earlier that this would require a 16-bit barrel shifter, so that any number of shifts would take 1 cycle. This would be very useful for a 16-bit CPU communicating with a <16-bit peripheral, although top speed would surely take a hit?
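
For readers unfamiliar with the term, a barrel shifter is commonly built as a cascade of fixed shift stages (by 1, 2, 4 and 8 for a 16-bit word), each switched in or out by one bit of the shift amount, so any amount resolves in a single pass through combinational logic. The Python below is only a behavioural sketch of that general structure for a logical left shift; it is not BigEd's ALU code, and the function name is invented for the example.

```python
MASK16 = 0xFFFF

def barrel_shift_left(value, amount):
    """Logical left shift of a 16-bit word, built from four fixed shift stages."""
    value &= MASK16
    for stage in range(4):                  # stages shift by 1, 2, 4 and 8
        if (amount >> stage) & 1:           # in hardware, each stage is a row of 2:1 muxes
            value = (value << (1 << stage)) & MASK16
    return value

print(hex(barrel_shift_left(0x0005, 5)))    # 0xa0
```

The extra mux levels are where a hit to top speed would come from: four stages sit in series in the ALU path instead of a single one-bit shift.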
Arlet

Post by Arlet »

ElEctric_EyE wrote:
I am noticing in simulation that during a transfer from the X register to an Acc, the IR sometimes lasts 2 cycles (_REG, DECODE) and other times 1 cycle (DECODE). Is this normal, depending on which opcodes come before it? The values being transferred are good; it's just that cycle discrepancy.
Can you show an example? The IR is only valid during DECODE, but I would still expect all the transfer instructions to behave identically.
ElEctric_EyE

Post by ElEctric_EyE »

For example, above at 180ns & 550ns, $00AA (TAX) & $02A8 (TCY) each last just 1 cycle. Then around 220ns, $018A (TXB) takes 2 cycles.
In the above code, if a transfer comes after a transfer it takes 2 cycles. If a transfer comes after an ADC[A...D], it takes 1 cycle.
Arlet

Post by Arlet »

Aha, yes, that's possible because you're looking at the state before the DECODE. In one case, that's a FETCH, in the other case, that's a REG. In the case of an ADC #$0001 instruction, it needs to read the immediate operand from memory, so the bus is busy doing that.

In case of a REG instruction, the memory bus isn't needed for one cycle, so it's basically doing a "don't care", which happens to be reading the next instruction. The result is discarded, and it's read again on the next cycle.

Keep in mind that the IR isn't really a register. It's basically a view of the DI bus, with some MUX logic in between. It only needs to contain the opcode during the DECODE state, and it's a "don't care" in any other state.
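
To make the effect concrete, here is a deliberately simplified Python model of what shows up on the data bus cycle by cycle. It assumes every instruction spends one cycle in DECODE followed by either REG (register-only, where the idle bus happens to read the next opcode and the result is discarded) or FETCH (immediate, where the bus reads the operand). That two-state view is only a paraphrase of the explanation above, not the actual state machine in cpu.v, and the function names are invented.

```python
def bus_trace(program):
    """program: list of (opcode, operand or None). Returns [(state, word_on_bus), ...]."""
    trace = []
    for i, (opcode, operand) in enumerate(program):
        trace.append(("DECODE", opcode))             # IR is valid only here
        if operand is None:
            # Register-only instruction: dummy read of the next opcode, discarded.
            nxt = program[i + 1][0] if i + 1 < len(program) else None
            trace.append(("REG", nxt))
        else:
            # Immediate instruction: the bus is busy reading the operand.
            trace.append(("FETCH", operand))
    return trace

def visible_cycles(trace, word):
    """How many cycles a given word sits on the bus."""
    return sum(1 for _, w in trace if w == word)

# ADC #$0001, TAX, TXB, NOP (opcode values as used in this thread)
trace = bus_trace([(0x0069, 0x0001), (0x00AA, None), (0x018A, None), (0x00EA, None)])
print(visible_cycles(trace, 0x00AA))   # 1: TAX follows an immediate instruction
print(visible_cycles(trace, 0x018A))   # 2: TXB follows a register-only instruction
```

In this model the opcode word is on the bus for two cycles only when the preceding instruction left the bus idle, which is the pattern reported earlier in the thread.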
ElEctric_EyE

Post by ElEctric_EyE »

One question: In SIM when one sees a 'don't care' taking up 1 cycle, similar to the one that makes a transfer opcode take 2 cycles, instead of 1, are we to understand these opcodes truly take 1 cycle in a real world app?

Also, today I added the ASL, ROL, LSR, ROR, BIT opcodes to the 'transposing stores', posted on GitHub.

I was looking at BigEd's use of a D reg to implement a barrel shifter. Looking at his code on GitHub, he used a TXD function where a value in the X register was transferred to a newly spec'd D reg. This specified the number of shifts/rotates for ALU.v.
Still looking at how to mod this. Instead of a 16-bit-wide register, I am thinking of a 4-bit-wide register whose value is specified by the uppermost 4 bits of the ASL/ROL/LSR/ROR opcodes. It would basically be implemented the same way through the ALU.
Arlet

Post by Arlet »

In the real world, everything is exactly the same as in the simulation. So you'll see a dummy read cycle on the bus. A real 6502 also performs dummy read cycles (although not necessarily the exact same ones).

Opcodes like the TXA still take two cycles, but they only need to read one byte on the bus (to read the opcode).
ElEctric_EyE

Post by ElEctric_EyE »

Hmm, I had always thought the cycles were clearly stated in the MOS Hardware manual. I probably didn't look close enough.

Using bits and pieces from BigEd's code to do variable shifts using a barrel shifter in the ALU, I have got some results that appear correct in 2 cycles!
More testing is needed but a regular ROL ($002A) and a 5x shift using a ROL($402A) outputted correct values to the A Acc. I had to add 1 to the value of the 4bit register, that gets its value from [15:12] of the <shift,rotate> opcodes, so it does at least 1 shift or rotate.
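
A quick behavioural check of that encoding can be sketched in Python. The count rule (bits [15:12] plus 1) is as described above; treating a multi-step ROL as repeated single rotates through carry, i.e. a 17-bit rotate of {carry, Acc}, is an assumption made for this sketch and is not confirmed against the ALU code. Function names are invented for the example.

```python
def shift_count(opcode):
    """Number of shifts/rotates: the 4-bit field in opcode bits [15:12], plus 1."""
    return ((opcode >> 12) & 0xF) + 1

def rol16(acc, carry, count):
    """Rotate a 16-bit accumulator left through carry `count` times.

    Modelled as a 17-bit rotate of {carry, acc}; returns (new_acc, new_carry).
    """
    wide = ((carry & 1) << 16) | (acc & 0xFFFF)
    count %= 17
    wide = ((wide << count) | (wide >> (17 - count))) & 0x1FFFF
    return wide & 0xFFFF, wide >> 16

print(shift_count(0x002A), shift_count(0x402A))    # 1 5
acc, carry = rol16(0x0005, 0, shift_count(0x402A))
print(hex(acc), carry)                             # 0xa0 0
```

With carry clear, LDA #$05 followed by the $402A form would leave $00A0 in the accumulator under these assumptions, the same as five plain ROL A instructions.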

Max speed is still at 94.7MHz! Hard to believe that.

Thanks BigEd for that chunk of code in the ALU!


START             LDA #$05
                  ROL A
                  ROL A
                  LDA #$05
                  .BYTE $402A
                  NOP
                  NOP
                  NOP
[Image not preserved]