Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 5:41 am
by BigEd
SEP and REP are just about worthwhile, although if the bits in the status register are changed rarely enough then you don't need a high-performance way to change them, and fiddling with PHP and PLP with some more macros would be adequate. SEP and REP are extra work.

I really wouldn't make these two special registers subject to mode bits though, or take them from the arithmetic/logical register file. The feature can be kept out of play by leaving the registers at default values of 0 and 1. There's no reason at all why one might need high-performance arithmetic or logical operations on these special registers, and by using the general register pool all that's happened is a shrinking of the pool. You'd be compounding that mistake by adding a mode bit to disable it. As Arlet pointed out earlier, these special registers are accessed in parallel to the register file, so to his thinking and my thinking they should be separate.

Finally, if you do it the right way, the idea is applicable to the original core, and to an 8-register version too. It becomes a feature which is not tied in to the 16-register variation. It's independent. That's the best way to add features!

Cheers
Ed

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 6:17 am
by Arlet
ElEctric_EyE wrote:
So I can probably make this work (with Arlet's watchful eye), as I feel I am learning a new language here and I've not heard his condemnation! :lol: :wink: Yet...
That's because I was sleeping. :)

Your design works, but it requires a triple-ported RAM for the register file, while the original design only had 1 read port. Of course, the simulator isn't going to care, but the amount of logic will go up, and speed will probably go down (if not now, it may still cause an extra loss in speed later when you add more stuff). Now, triple-ported RAM has its uses, but since these are kind of special registers, it would be preferable to keep them in special flip-flops. They'll be faster, too, because by cutting them loose from the register file, the router is also free to move them around on the FPGA to optimize performance (probably close to the address bus).

I also agree with Ed that you won't be using these registers as regular accumulators, so they don't need a full set of operations. A special opcode to set them once or twice would be good enough. You could even make them memory mapped, although that would provide another challenge of finding a suitable memory area.
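
As a rough illustration of the "special flip-flops" idea above, here is a minimal sketch of two page registers held outside the register file and loaded by dedicated strobes. The module name, port names and strobes (page_regs, load_zp, load_st, din) are assumptions made for this example and are not taken from the 65Org16 sources; the reset defaults of 0 and 1 follow Ed's suggestion earlier in the thread.

```verilog
// Hypothetical sketch: page registers in dedicated flip-flops,
// kept outside the register file. All names are illustrative.
module page_regs #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire             reset,
    input  wire             RDY,
    input  wire             load_zp,  // assumed strobe from a "set zero page" opcode
    input  wire             load_st,  // assumed strobe from a "set stack page" opcode
    input  wire [WIDTH-1:0] din,      // value to load, e.g. from the data bus
    output reg  [WIDTH-1:0] zp_page,
    output reg  [WIDTH-1:0] st_page
);

always @(posedge clk)
    if (reset) begin
        zp_page <= 0;   // default 0: zero page stays at page 0
        st_page <= 1;   // default 1: stack stays at page 1
    end else if (RDY) begin
        if (load_zp) zp_page <= din;
        if (load_st) st_page <= din;
    end

endmodule
```

Because nothing here touches the register file, the register file can keep its single read port, and the same block could in principle be attached to the original core or to an 8-register variant.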

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 11:37 am
by ElEctric_EyE
Well my intent wasn't to do anything complicated with the PAGE pointers by using a(n) accumulator(s). My thought was that once the bit disabled data going from the accumulator to the PAGE register, it could be used as a regular accumulator again. But I just ran a speed test and the design no longer fits under an 11ns constraint. That is intolerable IMO. So I will work on something simple that runs outside the main register file, and an opcode to load data into the PAGE pointers. Another opcode to read/modify/write data into the PAGE pointers.

EDIT: instead of a read/modify/write like INC, which would mean I'd need DEC, I think maybe a transfer to pointer would be useful.

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 12:35 pm
by BigEd
Can I suggest you consider an exchange instruction? Then you only need one. I did this for my B register which I used for my multiply instruction. It was pretty simple - you can probably just lift the code.
Cheers
Ed
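
For readers who have not seen the B register code, an exchange can be sketched roughly as below. This is not Ed's actual implementation; the names (xchg_reg, xchg, a_in, a_next, a_we) are made up for the example, and it assumes the accumulator can be read and written back in the same cycle.

```verilog
// Hypothetical sketch of an exchange between the accumulator and one
// special register. Names are illustrative, not from the real core.
module xchg_reg #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire             RDY,
    input  wire             xchg,    // assumed one-cycle strobe from the decoder
    input  wire [WIDTH-1:0] a_in,    // current accumulator value
    output reg  [WIDTH-1:0] b_reg,   // the special register
    output wire [WIDTH-1:0] a_next,  // value to write back to the accumulator
    output wire             a_we     // write enable for the accumulator
);

// The accumulator receives the old special-register value...
assign a_next = b_reg;
assign a_we   = xchg & RDY;

// ...while the special register captures the old accumulator value
// on the same clock edge, so the two values swap.
always @(posedge clk)
    if (xchg & RDY)
        b_reg <= a_in;

endmodule
```

A single exchange opcode per special register covers both reading and writing it, which is why only one instruction is needed instead of a load/store pair.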

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 12:47 pm
by BigEd
By the way, very interesting that the speed did drop off! We might have been wrong about how the synthesis would have implemented your approach...

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 4:04 pm
by ElEctric_EyE
BigEd wrote:
Can I suggest you consider an exchange instruction? Then you only need one. I did this for my B register which I used for my multiply instruction. It was pretty simple - you can probably just lift the code.
Cheers
Ed
I'll experiment with your code. Not making much progress today though. Work is busy unfortunately/fortunately! 1 instruction per register, so 2 total.

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 8:31 pm
by ElEctric_EyE
Ok, I got it working, but you have something similar to this in your code BigEd :

Code:

always @*
	case( state )
		DECODE: write_streg <= load_streg;
		default: write_streg <= 0;
	endcase
and then this:

Code:

always @(posedge clk )
	if( write_streg & RDY )
		STACKPAGEReg <= DIMUX;
Using this alone appears to work:

Code:

always @(posedge clk )
	if( load_streg & RDY )
		STACKPAGEReg <= DIMUX;

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 9:19 pm
by MichaelA
EE:

I've been following your discussions on this thread for a while, but I've not delved into any of the code. However, in your last post/reply, I think that you've reduced the code beyond what BigEd appeared to intend. That is, the following code fragment is more in the spirit that BigEd appears to intend:

Code:

assign WE = (state == DECODE) & load_streg & RDY;

always @(posedge clk)
begin
     if(WE)
          STACKPAGEReg <= DIMUX;
end
Your implementation of the always block does not appear to consider that BigEd's write_streg is only asserted when (state == DECODE) and load_streg is asserted. In your implementation, this additional qualification of the write enable of the STACKPAGEReg may not be required, and if so, please ignore this reply. You are correct, however, that BigEd's code may have been more legibly expressed as:

Code:

always @(*) write_streg <= (state == DECODE) & load_streg;
In this manner, no case statement with a default expression is required.
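
To make the difference concrete, the two write-enable forms can be put side by side as in the sketch below. The in_decode input is a stand-in for (state == DECODE) and the two output registers are invented for the comparison; the rest of the names follow the fragments above. The two registers only diverge if load_streg can be high in a cycle where the state is not DECODE.

```verilog
// Hypothetical side-by-side comparison of the two write enables.
module we_compare (
    input  wire        clk,
    input  wire        RDY,
    input  wire        load_streg,
    input  wire        in_decode,    // assumed: 1 when state == DECODE
    input  wire [15:0] DIMUX,
    output reg  [15:0] qualified,    // written only during DECODE
    output reg  [15:0] unqualified   // written whenever load_streg is high
);

// Write enable qualified by the state, as in BigEd's original form.
always @(posedge clk)
    if (in_decode & load_streg & RDY)
        qualified <= DIMUX;

// Reduced form: relies on load_streg being valid only when intended.
always @(posedge clk)
    if (load_streg & RDY)
        unqualified <= DIMUX;

endmodule
```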

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 9:42 pm
by ElEctric_EyE
Hi Michael, you're correct. I said it worked without doing more rigorous testing, and I was about to delete the post... But thanks for replying! In this register, I don't care if it's a read or write. I just need the data following the opcodes that trigger load_streg and load_zpreg to go into the registers. I do this because the opcodes are basically just a transfer instruction. Still working on it.

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 10:21 pm
by BigEd
I confess that my B and D reg code is just copied from the nearby regfile code without much thought - it is possible that it can be simplified. It does look like load_b_reg is in fact only valid at decode and rdy.

Note that the code might need to take the existing form if instead of two or more zero-operand opcodes we had a one-operand opcode as I suggested at one point, picking up the special register address from the data bus in a later cycle. I'm not certain of this. I find it all rather difficult!

Edit: hmm, that might not make sense, in which case I apologise for posting while tired. I've got XBA and TXD which are both zero operand. XSR #reg would be one operand.

Cheers
Ed

Re: 65ORG16.b Core

Posted: Sat Apr 21, 2012 11:44 pm
by ElEctric_EyE
Ed you are onto why I had my problem originally.
BigEd wrote:
...Note that the code might need to take the existing form if instead of two or more zero-operand opcodes we had a one-operand opcode as I suggested at one point, picking up the special register address from the data bus in a later cycle. I'm not certain of this. I find it all rather difficult!
I believe you had the same problem I initially had... I had defined the new opcodes in the state machine for REG only. Now I have those opcodes defined for REG and FETCH, which is used for opcodes in immediate mode, like LDA #$xxxx, etc. Now I have something that works! But a lot more has changed as well. First, for simulation purposes, I had to initialize load_zpreg and load_streg to 0. Some changes to your code (only changes shown):

Code:

always @*
	case( state )
		FETCH: write_zpreg <= load_zpreg;
		default: write_zpreg <= 0;
	endcase

always @*
	case( state )
		FETCH: write_streg <= load_streg;
		default: write_streg <= 0;
	endcase
Then this code to write to the registers:

Code:

always @(posedge clk )
	if( write_zpreg & RDY )
		ZEROPAGEReg <= DIMUX;

always @(posedge clk )
	if( write_streg & RDY )
		STACKPAGEReg <= DIMUX;
I will do more testing, but initial testing is proving OK!

Re: 65ORG16.b Core

Posted: Sun Apr 22, 2012 12:58 am
by MichaelA
Ed:

I appreciate you linking your code for me in your last post. I had this nice reply in the works and then I decided to click through to your code for a second look, and when I got back, my tome was lost. So I will try and reconstruct it as best I can.

Your code is very clean, and clearly follows the structure and style of Arlet Ottens. Like most members of this forum, I am also working on an FPGA implementation of the WDC 65C02. I have currently completed one core and am working on my second. My general objectives are a faithful reproduction of the WDC 65C02 instruction set, but not necessarily the instruction cycle time. Actually, my objective in that regard is to reduce the number of clock cycles required for most instructions without resorting to using a wider data bus. My approach to the implementation of the core also differs from those that others, such as yourself and Arlet Ottens, have posted. That is, I use a microprogrammed instruction sequencer.

My first core was intended to be used with a single cycle asynchronous memory such as can be synthesized from Xilinx LUTs. It assumes that the address is output and the data is returned on the same cycle. At the performance that I was targeting, 100 MHz in a Xilinx XC3S200AN-5, this approach is not particularly practical since any reasonable program would require too many of the free LUTs and would not provide a sufficiently large memory for anything but toy applications. My second core is a more complete implementation with BRAM included in the core. The difference in the access method has caused me to rethink some of the asynchronous logic that I used for single cycle address calculation in the first core. Without considering the delays in the address output path or the input data path delays (as is the case in functional simulation), my first attempt is complete. But it will never work in a practical system. Thus, the second core has been derived from the first core, and it deals with these practical considerations. With respect to the second core, I have completed its re-implementation and reworked the microprogram's control fields, but I've not yet begun to debug the microprogram. Given the amount of time that I have available to pursue this hobby, it's going to be several weeks before that task can be completed.

Back to your core. You and Arlet are using a one-hot state machine methodology for the instruction sequencer. I've not been much of a fan of the one-hot state machine approach for a number of reasons, although I often use one-hot control fields. However, the performance and resource utilization that Arlet achieved (as you detailed in an earlier post comparing various cores), coupled with the cleanliness of the implementation, has convinced me to put some effort to studying the methodology once I've completed my second core. It appears that the base design uses the register file to provide the AI input to the ALU module. I have used the LUT RAMs in the Xilinx FPGA in this manner for many years. The inherent multiplexer of the RAM is the fastest way to improve the operating speed of a Xilinx FPGA.

It also appears that the register file is implemented as a single port RAM and that the core does not use multiple, independent adders to provide parallel computation of addresses and/or ALU results. Therefore, I am going to suggest expanding the register file so that you can also use it for temporary storage instead of wiping out your S storage location. Since for the Xilinx FPGA that you are using, any use of a LUT as a single or dual port RAM will always make available a minimum of 16 "registers", the synthesizer is currently tying off the two (16) or four (64) most significant address lines. (I am sure that you are aware that Xilinx FPGAs older than the Spartan 6 and Virtex 7 families employ 4-input LUTs, while these two FPGA families employ 6-input LUTs.) In this manner, you can place your B register into the register file, and any other registers or temporary values that you may need. I don't think that this suggestion applies to the shift count/direction register, but I've not spent any time exploring the implementation or the instruction set that you are presently implementing with your core.

Once again, thanks for the link to your code.

BTW, you need not modify the present definition of the regfile address/select variable: "regsel". You can simply extend it using a bit vector construction, {extreg, regsel}, where "extreg" is declared as a 2-bit or a 4-bit vector (set by the FPGA family that you are using). If you fail to include the expanded RAM select vector when addressing the register file, then Verilog's default behavior will left fill the RAM address with 0s, and the register file will behave like your current implementation: a four-location RAM. (A synthesizer or PAR warning should be issued, and it will indicate that the RAM address is zero filled because it's not completely specified.)
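
The {extreg, regsel} construction can be sketched roughly as follows. This is only an illustration under assumptions: the module and port names other than regsel and extreg are invented, a 2-bit extreg (16-entry LUT RAM) is assumed, and the real register file in the core may be organized differently.

```verilog
// Hypothetical sketch: a 16-entry register file addressed by
// {extreg, regsel}. With extreg = 2'b00 it behaves like the
// original four-location file.
module regfile_ext #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire             we,
    input  wire [1:0]       extreg,  // assumed bank select for the extra locations
    input  wire [1:0]       regsel,  // existing 2-bit register select
    input  wire [WIDTH-1:0] din,
    output wire [WIDTH-1:0] dout
);

reg  [WIDTH-1:0] regs [0:15];
wire [3:0]       addr = {extreg, regsel};

// Synchronous write
always @(posedge clk)
    if (we)
        regs[addr] <= din;

// Asynchronous read, which lets the synthesizer use LUT (distributed) RAM
assign dout = regs[addr];

endmodule
```

Locations with a nonzero extreg are then available for a B register or temporaries without changing any code that drives regsel.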

Re: 65ORG16.b Core

Posted: Sun Apr 22, 2012 1:50 am
by ElEctric_EyE
What exactly is your contribution to the subject matter at hand? I don't follow? Not 100% OT but it's headed there.

Maybe you should start a new thread here in the Programmable Logic section on your endeavours.

What I mean to say is, if you are some kind of expert, then let's see your projects then, some evidence or proof, some kind of project, which we people always demand.

Re: 65ORG16.b Core

Posted: Sun Apr 22, 2012 1:55 am
by MichaelA
Sorry about that.

Re: 65ORG16.b Core

Posted: Sun Apr 22, 2012 1:58 am
by ElEctric_EyE
Listen, I don't mean to get all negative, but we were trying to achieve a common goal of a relocatable zero page and stack for the 65Org16... We do welcome experts!! Maybe post here one of your Verilog projects?

Or even better, post your intro in the Introduce Yourself, under the General Discussion section, then at least we can all know where you're coming from. BTW welcome Alabama!