65Org16.d Core

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Feb 05, 2013 2:01 pm

I had started the thread about the .c core too early, even before the .b core was completed, and it strayed into dreamland. Now that the .b core has been completed, I have been mulling over a couple of functions I would like to try to add to the .b core in the near future, thereby making it the .d core when completed.

1) is to have 256 opcodes for multiplying any of the 16 accumulators, and add 2 more 16-bit wide registers dedicated for storing the result, 1 reg for the "MSB" & 1 reg for "LSB". More opcodes would be added so that the X,Y & W registers could be multiplied together or with the accumulators as well.

2) is to have the ability to transfer values from the result registers to any accumulator or register.

3) is to be able to swap values between any accumulator or register.

teamtempest · Post by **teamtempest** » Thu Feb 07, 2013 12:10 am

Quote:

1) is to have 256 opcodes for multiplying any of the 16 accumulators, and add 2 more 16-bit wide registers dedicated for storing the result

You're the one who's got to make this work, of course, so it's your call, but isn't this perhaps straying a bit too far? Using the contents of the accumulator(s, in the case of the 65Org16b) as an operand tends to replace that operand with the result for most such instructions. Why not store the result in the two operand accumulators?

Wouldn't work for multiplying an accumulator by itself, naturally. But it's only a couple of cycles to transfer the contents to another accumulator if you really want to square a value.

ElEctric_EyE · Post by **ElEctric_EyE** » Thu Feb 07, 2013 12:58 am

Hi TT,
My intent is to be able to execute a specific multiplication opcode at any time. Whatever values happen to be in the accumulators are what get multiplied, and the full 32-bit result is put into the 2 special result reg's...
The ALU takes two 16-bit wide databus inputs, AI and BI, along with a 4-bit function input. I believe I need to redefine one 'op' value of the ALU's 4-bit function input as multiplication, while putting the appropriate Accumulator/Register data onto the AI & BI buses going into the ALU.

Even though I had said 256 opcodes previously, I figure 136 opcodes to be useful and non-redundant for acc X acc multiplication. Although, all the opcode OPs will wind up being pre-defined in the Verilog.
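The 136 figure follows from multiplication being commutative: with 16 accumulators there are 256 ordered pairs, but only the unordered pairs (squares included) are distinct. A quick check in Python, used here purely as a calculator and not as part of the core:

```python
from itertools import combinations_with_replacement

NUM_ACCUMULATORS = 16  # accumulator count of the .b core, per the first post

# (B, A) duplicates (A, B) because multiplication is commutative;
# squares such as (A, A) are still counted once each.
pairs = list(combinations_with_replacement(range(NUM_ACCUMULATORS), 2))

print(len(pairs))             # unordered pairs, squares included
print(NUM_ACCUMULATORS ** 2)  # ordered pairs, the original 256 figure
```

This is the same as 16 * 17 / 2.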

Dr Jefyll · Post by **Dr Jefyll** » Thu Feb 07, 2013 4:54 am

ElEctric_EyE wrote:

Whatever values happen to be in the accumulators, that's what gets multiplied and the full 32-bit result to be put inside the 2 special result reg's...

Is it possible to avoid adding any special registers? If I'm reading it properly, TT's suggestion is that the multiply instruction would use the two 16-bit accumulators as both source & destination. Ie, the original contents are read as two 16-bit values, and those original contents are replaced by the 32-bit result (stored in 2 pieces).

The 6809 is an example of a chip that uses this sort of approach. One of the advantages of not adding a new, special register is that you avoid the need to create a new instruction to retrieve the result from that register. But maybe the new instruction is made worthwhile by other considerations. The call is yours -- it's your baby!

cheers
Jeff

ElEctric_EyE · Post by **ElEctric_EyE** » Thu Feb 07, 2013 11:24 am

Dr Jefyll wrote:

...One of the advantages of not adding a new, special register is that you avoid the need to create a new instruction to retrieve the result...

I see your points. No need to add more registers; I can use the last 2 accumulators, O & Q, for the result. I'm close to having a test suite completed, written in Verilog, in the PVB Concept thread. Besides ISE, the only other tools needed are an assembler to create the binary, and Arlet's bin2hex to create the file specified in line 11 of SYSROM.v. Then one can run a simulation.

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Feb 12, 2013 3:19 pm

I'm going to take BigEd's lead on the multiply function (I couldn't find the thread here) and try to work with his code shared on Github. He has an OUTHI for the MSB of the product which I would like to use, but implements the multiply with an extra signal. I would like to try to implement it through the already present '[3:0] op' input. It doesn't look too difficult. I think the only tricky part is defining a new 'state'.

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Feb 12, 2013 4:17 pm

ElEctric_EyE wrote:

... It doesn't look too difficult...

Hastily said I think, but here goes.

The concept is for MUL [A..N] X [A..N]. The CPU stores [A..N,X,Y] in AI, [A..N,X,W] in BI. AI & BI go to the ALU. Next the ALU sends back the Result LSB through [15:0] OUT, and MSB through [15:0] OUTHI. Then data from OUT & OUTHI are transferred to the O & Q accumulators.
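The data flow described above can be modelled in a few lines of Python. This is only a behavioral sketch, not the core's Verilog: it assumes an unsigned multiply and assumes OUT lands in O and OUTHI in Q, neither of which the post states outright. The function name `alu_mul` and the `regs` dictionary are made up for illustration.

```python
MASK16 = 0xFFFF

def alu_mul(ai, bi):
    """Return (out, outhi): the low and high 16-bit words of the unsigned product."""
    product = (ai & MASK16) * (bi & MASK16)
    return product & MASK16, (product >> 16) & MASK16

# Hypothetical register file holding just the registers used in this example.
regs = {"A": 0x1234, "B": 0x00FF, "O": 0, "Q": 0}

# MUL A,B: the operands go out on AI and BI, the halves come back on OUT and OUTHI.
out, outhi = alu_mul(regs["A"], regs["B"])
regs["O"], regs["Q"] = out, outhi  # assumed mapping: O = LSB word, Q = MSB word

print(hex(regs["Q"]), hex(regs["O"]))  # 0x12 0x21cc, i.e. 0x1234 * 0xFF = 0x001221CC
```

The source accumulators are left untouched in this model, which is what makes squaring (MUL A,A) straightforward.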

MUL Acc1,Acc2 Opcode definition:

Code: Select all

// from the microcode state machine:
16'bxxxx_xxxx_xxxx_0011:  state <= MUL;	 // column 3, multiply

[15:12] Define Acc1 (or Reg)
0000 A accumulator
0001 B accumulator
....
1101 N accumulator
1110 X register
1111 Y register
-----------------------
[11:8] Define Acc2 (or Reg)
0000 A accumulator
0001 B accumulator
....
1101 N accumulator
1110 X register
1111 W register
-----------------------
[7:4] -future-
xxxx don't care
-----------------------
[3:0] all of column 3 is free in the opcode matrix
0011
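To make the field layout concrete, here is a small Python decoder for the proposed encoding. It is a sketch for checking the bit positions, not part of the core. It assumes the elided codes between B and N run alphabetically (C through M), and the names `decode_mul` and `operand_name` are invented for the example. Note that code 1111 means Y in the first field but W in the second.

```python
ACC_NAMES = "ABCDEFGHIJKLMN"  # codes 0000..1101; C..M assumed to follow alphabetically

def operand_name(code, name_for_1111):
    """Map a 4-bit operand field to a register name."""
    if code < 14:
        return ACC_NAMES[code]
    if code == 14:
        return "X"
    return name_for_1111  # 1111 differs between the two fields

def decode_mul(opcode):
    """Return (acc1, acc2) for a MUL opcode, or None if it is not in column 3."""
    if opcode & 0xF != 0b0011:
        return None
    acc1 = operand_name((opcode >> 12) & 0xF, "Y")  # bits [15:12]
    acc2 = operand_name((opcode >> 8) & 0xF, "W")   # bits [11:8]
    return acc1, acc2                               # bits [7:4] are don't-care

print(decode_mul(0x0103))  # ('A', 'B')
print(decode_mul(0xFF03))  # ('Y', 'W')
```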

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Apr 30, 2013 2:11 am

Going to put the multiplier on the backburner for now, not because it's difficult but because the function is not needed at this point...

I'm going to go off on a tangent here so bear with me please: Instead of an 8 bit I/O data port like what is present on the 6510, I would like to attempt adding a 16-bit flag I/O port. This I/O port would be in addition to the 'P' register which has the N, V, C, Z flags.

The lower 8 bits of this new flag register would be the output bits and would have special set and clear opcodes for each of the 8 bits, much like SEC and CLC opcodes for the Carry flag. This is good for communicating in a fast manner to external devices, using programmable flags. For a 1MHz device, one could just do a LDA/STA IO port, but a 100MHz system demands accountability of every cycle! So the special 1 cycle opcodes become useful, especially in a video environment.

The upper 8 input bits would have 2 branch instructions associated with each bit, i.e. a branch on set and branch on clear.
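A rough Python model of the proposed flag port, just to pin down the intended behavior. The class and method names are invented for the sketch and do not correspond to any opcode mnemonics; each method stands in for one of the single-cycle set, clear, or branch opcodes described above.

```python
class FlagPort:
    """16-bit flag register: bits 7..0 are outputs, bits 15..8 are inputs."""

    def __init__(self):
        self.value = 0

    def set_output(self, bit):
        """Model of a per-bit 'set' opcode (like SEC for carry)."""
        assert 0 <= bit < 8
        self.value |= 1 << bit

    def clear_output(self, bit):
        """Model of a per-bit 'clear' opcode (like CLC for carry)."""
        assert 0 <= bit < 8
        self.value &= ~(1 << bit) & 0xFFFF

    def drive_inputs(self, pins):
        """External hardware drives the upper byte; software cannot write it."""
        self.value = (self.value & 0x00FF) | ((pins & 0xFF) << 8)

    def branch_if_set(self, bit):
        """True when a 'branch on input bit set' would be taken."""
        assert 0 <= bit < 8
        return bool((self.value >> (8 + bit)) & 1)

    def branch_if_clear(self, bit):
        """True when a 'branch on input bit clear' would be taken."""
        return not self.branch_if_set(bit)

port = FlagPort()
port.set_output(3)
port.drive_inputs(0b0000_0100)
print(hex(port.value), port.branch_if_set(2), port.branch_if_clear(0))
```

That comes to 16 set/clear opcodes and 16 branch opcodes for the full 8+8 layout.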

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Apr 30, 2013 5:03 pm

I decided instead to use the upper 8 unused bits in the P processor status register. So far I have the 4 output control bits working in ISim, which I can set or clear with 8 additional opcodes. Next I'll work on the 4 input bits, which can be tested by 8 additional branch opcodes (hopefully!).

@BigEd, how do I correctly go about forking to start the .d core?

BigEd · Post by **BigEd** » Tue Apr 30, 2013 6:02 pm

Hmm, I'm not sure if GitHub allows a single user account to create multiple forks of the same upstream project. The normal way to approach this is a branch, which is like a fork but stays inside the same git project.

ElEctric_EyE · Post by **ElEctric_EyE** » Tue Apr 30, 2013 6:23 pm

Branch works fine, thank you.

BigEd · Post by **BigEd** » Tue Apr 30, 2013 6:35 pm

Great!

ElEctric_EyE · Post by **ElEctric_EyE** » Wed May 01, 2013 3:38 pm

Trying to do the branch opcodes turned out to be too difficult, although I did add the 4-bit input port. Still, I am not satisfied with this.

I'm now thinking of getting rid of the input port in the status register while keeping the 4-bit output port. Instead, there would be a mode bit [8]: when it is clear, the lower 8 bits of the status register act as on a normal 6502. When bit [8] is set, the ALU is bypassed and the C, Z, V & N flag bits come from 4 inputs. That way I can take advantage of the existing branch instructions to react to what's happening outside the CPU.
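In behavioral terms the mode bit acts as a multiplexer in front of the branch logic. A minimal Python sketch follows; which external pin feeds which flag is not specified in the post, so the dictionaries below simply assume one pin per flag, and the function names are made up.

```python
MODE_BIT = 1 << 8  # P[8], the proposed mode bit

def effective_flags(p, alu_flags, ext_inputs):
    """Pick the C, Z, V, N values that the branch logic would see."""
    return dict(ext_inputs) if p & MODE_BIT else dict(alu_flags)

def bcs_taken(p, alu_flags, ext_inputs):
    """BCS branches when the effective carry flag is 1."""
    return effective_flags(p, alu_flags, ext_inputs)["C"] == 1

alu = {"C": 0, "Z": 1, "V": 0, "N": 0}    # flags left by the last ALU operation
pins = {"C": 1, "Z": 0, "V": 0, "N": 0}   # assumed external inputs, one per flag

print(bcs_taken(0x0000, alu, pins))    # False: mode clear, ALU carry is used
print(bcs_taken(MODE_BIT, alu, pins))  # True: mode set, external input is used
```

The attraction is that no new branch opcodes are needed; the cost is that the ALU flags are unavailable while the mode bit is set.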

Arlet · Post by **Arlet** » Wed May 01, 2013 3:44 pm

The branches shouldn't be too difficult. When you see a branch opcode, go to the BRA0 state, and make sure 'cond_true' is set depending on the bit you want to look at.

ElEctric_EyE · Post by **ElEctric_EyE** » Wed May 01, 2013 4:20 pm

Oh I think I understand now! The IR[7:5] was throwing me before, but now I see it. All the work I did is not wasted.

I think I can define the 8 new branch opcodes using IR [8:5].
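For reference, the standard 6502 branches encode the tested flag in IR[7:6] (N, V, C, Z) and the wanted polarity in IR[5], which is why IR[7:5] selects the condition. One way the extra bit could be used is sketched below in Python. The IR[8] extension is purely an assumption for illustration: IR[8] = 1 selects one of the 4 input bits instead of a flag, giving 8 extra conditions. The actual encoding chosen for the .d core is not given in this thread.

```python
FLAG_ORDER = "NVCZ"  # IR[7:6] = 00, 01, 10, 11 on the standard 6502 branches

def cond_true(ir, flags, inputs):
    """Model of the branch condition: True when the branch would be taken.

    flags  : dict with keys N, V, C, Z (values 0 or 1)
    inputs : 4-bit integer for the external input bits (hypothetical IR[8] extension)
    """
    sel = (ir >> 6) & 0b11   # which flag or input bit to test
    want = (ir >> 5) & 1     # 0 = branch if clear, 1 = branch if set
    if (ir >> 8) & 1:        # assumed: IR[8] switches from flags to input bits
        bit = (inputs >> sel) & 1
    else:
        bit = flags[FLAG_ORDER[sel]]
    return bit == want

flags = {"N": 0, "V": 0, "C": 0, "Z": 1}
print(cond_true(0x00F0, flags, 0b0000))  # BEQ with Z set: True
print(cond_true(0x00D0, flags, 0b0000))  # BNE with Z set: False
print(cond_true(0x0130, flags, 0b0001))  # hypothetical 'branch if input 0 set': True
```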
