65ORG16.b Core

Topics relating to PALs, CPLDs, FPGAs, and other PLDs used for the support or creation of 65-family processors, both hardware and HDL.
Dr Jefyll
Posts: 3526
Joined: 11 Dec 2009
Location: Ontario, Canada

Post by Dr Jefyll »

Removing the possibility of a wasted cycle in the (zp),Y addressing mode seems a worthy goal. After all, the penalty applies not only for page crossings but also on all (zp),Y write cycles -- page crossing or not!
Arlet wrote:
In order to accommodate external SDRAM better, I was thinking about ways to remove the dummy bus accesses from the core.
Whether or not the (zp),Y issue is dealt with, am I right in saying your core (like the 6502) still exhibits wasted cycles from various other causes? I bet many of them would be tough or impossible to eliminate. As an alternative remedy, maybe you could generate something akin to the 65816's VPA and VDA signals. Then at least an "external" device -- your SDRAM logic! -- would know enough to recognize a throw-away bus cycle, and not devote any unnecessary time to it. The wasted cycle would still exist, but it would run at full speed -- no wait states doing a futile fetch from the SDRAM. Does that make sense, or am I missing something?

Just a suggestion... Keep up the great work, fellas! :o

-- Jeff
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands

Post by Arlet »

Dr Jefyll wrote:
Whether or not the (zp),Y issue is dealt with, am I right in saying your core (like the 6502) still exhibits wasted cycles from various other causes? I bet many of them would be tough or impossible to eliminate.
You are absolutely right. There will still be places where the core will have wasted cycles that currently result in a read on the bus. I don't think they can be avoided, at least not without a huge redesign. My plan was to add an 'OE' (output enable) signal to the core to distinguish a true read from a dummy read.

The problem is that in some places, the core doesn't know whether the cycle is valid or not. The "(zp),Y" addressing mode is one example; "abs,X" and relative branches are others. I'm hoping that all those speculative cycles can be removed, either by making a cycle non-optional, or by adding some extra logic in the address calculation so the cycle can be dropped. After that has been done, it should be possible to add the OE output. The SDRAM controller would then ignore the wasted reads, like you said.

This would also benefit peripheral registers that might get confused by dummy reads to the wrong address.
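
A rough sketch of how such a qualifier might be used outside the core, assuming an active-high 'oe' that is 1 only when the core really needs the read data. The module and every signal name other than OE are made up for illustration; this is not the core's or any SDRAM controller's actual interface.

```verilog
// Illustrative only: every name here is hypothetical.
module dummy_read_gate (
    input  wire oe,          // assumed: 1 = true read, 0 = dummy read
    input  wire we,          // core write strobe
    input  wire sdram_sel,   // address decodes to SDRAM
    input  wire io_sel,      // address decodes to a peripheral register
    input  wire sdram_busy,  // SDRAM access still in progress
    output wire sdram_req,   // start (or continue) an SDRAM access
    output wire io_rd,       // read strobe for registers with side effects
    output wire rdy          // 0 = stall the core this cycle
);
    // Forward only writes and true reads to the SDRAM.
    assign sdram_req = sdram_sel & (we | oe);

    // Peripherals with read side effects never see a dummy read.
    assign io_rd = io_sel & oe & ~we;

    // A dummy cycle makes no request, so it can never be stalled.
    assign rdy = ~(sdram_req & sdram_busy);
endmodule
```

With gating along these lines the wasted cycle still exists, but it completes in a single clock instead of waiting on the SDRAM.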
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Post by BigEd »

EEye:
very good to see the 'b' core coming into a final shape. I wonder if it's a good idea to update the head post? The current spec is a bit different from what you set out there. You now have
- the usual 65Org16 basis
- a new set of long-distance shift and rotate opcodes
- a set of 16 accumulators
- new capabilities for transfers and logical/arithmetic operations between accumulators

(I still think it would be better to have X, Y and SP inside the set of 16, rather than outside that set. Then - I think - you can do things like add 8 to SP and put the result in X, for some stack-relative lookup, in a single operation.)

Cheers
Ed

(Arlet: nice work on cutting out the dead cycles - that will help 6502 as well as 65org16 users of course!)
ElEctric_EyE
Posts: 3260
Joined: 02 Mar 2009
Location: OH, USA

Post by ElEctric_EyE »

BigEd wrote:
... (I still think it would be better to have X, Y and SP inside the set of 16, rather than outside that set. Then - I think - you can do things like add 8 to SP and put the result in X, for some stack-relative lookup, in a single operation.)...
Maybe in due time, BigEd. Your insistence on it makes me curious! And your idea would be within the realms of the .b version IMO as well. I'll think on how to implement it. Thanks for your input. :)

The head post does need some tidying up, not too much though...

EDIT: I've got it up to date now, maybe too many changes, but most of it reflects current status of the 65Org16.b core as of 3/28/12....

Post by ElEctric_EyE »

One more addition: INcrementing and DEcrementing an accumulator, so it can act as a simple index register. This should be easy to implement. INA and DEA can reside in column $B, and the rest of the opcodes can follow this rule. For opcodes like LDx, TYx, TXx, INx and DEx, where the destination reg is an accumulator and there is no need for accumulator/accumulator transposition, the small x stands for an accumulator A thru Q, and the encoding is:

Code:

16'b00dd_00dd_xxxx_xxxx 

IR[15:14] = 00 
IR[13:12] = dst_reg (A thru Q) 
IR[11:10] = 00 
IR[9:8] = dst_reg (A thru Q)
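
A small sketch of how the fields above could be picked out of the instruction register, taking the posted bit layout at face value and assuming both dd fields carry the same register number. The module and port names are hypothetical and do not come from the core's source.

```verilog
// Illustrative only: module and port names are hypothetical.
module single_reg_fields (
    input  wire [15:0] ir,           // 16-bit instruction register
    output wire [1:0]  dst_hi,       // IR[13:12], dst_reg as posted
    output wire [1:0]  dst_lo,       // IR[9:8], dst_reg as posted
    output wire        matches_rule  // 1 when IR fits 00dd_00dd_xxxx_xxxx
);
    assign dst_hi = ir[13:12];
    assign dst_lo = ir[9:8];

    // IR[15:14] and IR[11:10] must be zero, and both dd fields must agree.
    assign matches_rule = (ir[15:14] == 2'b00) &&
                          (ir[11:10] == 2'b00) &&
                          (dst_hi == dst_lo);
endmodule
```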

Post by ElEctric_EyE »

BigEd wrote:
... (I still think it would be better to have X, Y and SP inside the set of 16, rather than outside that set...
Now I begin to see what you mean! I think it is starting to dawn on me: to be able to use any of the accumulators as a register like X or Y, with their addressing modes. This would truly make it a powerful CPU! I think I can do it...

Post by ElEctric_EyE »

I forgot about this:

Code:

ABSX0  : regsel = index_y ? SEL_Y : SEL_X;

The X and Y registers are unique to regsel. Not so easy as I was thinking before I looked at the code again.

At this point, I think I can still do my idea of making a simple register out of the Accumulators. I should be able to test it out today...

Post by BigEd »

Hi EEye
I'm not quite sure if we're on the same page, yet.

I was thinking of the relatively simple idea, that by making X, Y and SP part of your 16-way set of accumulators (which is a very small change to the code) you can make these three registers the targets of your new two-operand instructions:
ADC #5 to B // you already have this
ADC #5 to X // you can't presently do this
ASL C by 3 to D // you already have this
ASL C by 3 to Y // you can't presently do this

For this case, there's no impact at all to the instruction encodings or to the decoding you have to do in the Verilog.

It looks like you might be thinking of the more complex idea - which might be even more useful - of allowing some or any of the 16 accumulators to play the part of the X or Y registers in indexed addressing:
JMP (location),C

I'm not thinking of this case. In this case you do have to think up some ideas about how to encode the choice of accumulator in the opcode.

Hope that helps.
Ed

Post by Arlet »

The complex idea isn't really that complex. If you replace this:

Code:

ABSX0  : regsel = index_y ? SEL_Y : SEL_X;

by this:

Code:

ABSX0  : regsel = index_reg;

where 'index_reg' is a suitably defined reg, then it just becomes a matter of assigning the proper value to 'index_reg' during instruction decode.
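
One way the decode-time assignment might look, as a sketch only. The thread does not settle how an opcode would name an accumulator as its index, so 'op_use_acc' and 'op_acc_sel' are placeholder inputs. The 4-bit select width, the SEL_X/SEL_Y codes and the 'decode' strobe are assumptions too, not taken from the core.

```verilog
// Illustrative only: port names, widths and select codes are assumptions.
module index_decode (
    input  wire       clk,
    input  wire       decode,      // assumed: 1 while a new opcode is being decoded
    input  wire       op_index_y,  // opcode uses a ,Y mode (combinational decode)
    input  wire       op_use_acc,  // hypothetical: opcode names an accumulator as index
    input  wire [3:0] op_acc_sel,  // hypothetical: which accumulator that is
    output reg  [3:0] index_reg    // drives regsel in ABSX0 and similar states
);
    localparam SEL_X = 4'd13,      // hypothetical select codes
               SEL_Y = 4'd14;

    // Latch the index choice once per instruction. With op_use_acc held
    // at 0 this reduces to the original X/Y selection.
    always @(posedge clk)
        if (decode)
            index_reg <= op_use_acc ? op_acc_sel
                       : op_index_y ? SEL_Y
                       :              SEL_X;
endmodule
```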

Post by BigEd »

I suppose my main point is to try to establish that there are two ideas kicking around here - one means changing quite a few lines of source and finding some new encodings, and the other doesn't.

I agree that more flexibility with the index registers is desirable - in fact I think EEye has added that to his headline goals. (But I think it's quite an extra step... maybe a .c core? It would be good to see a settled final version of this core and this thread!)

Ed

Post by ElEctric_EyE »

Oh, I see where you're coming from now, BigEd. There is one small problem with doing that though: the upper bits of the opcode, used for the src_reg or dst_reg, are full. For the <shift,rotate> opcodes these would be bits IR[11:8] for Acc's A through D. For the other functions like ADC, SBC, etc. these would be bits IR[15:8] for Acc's A through Q.

EDIT: IR[15:8] not IR[15:18]. typo
Last edited by ElEctric_EyE on Tue Apr 03, 2012 2:53 am, edited 1 time in total.

Post by BigEd »

In my simple plan, you don't need to change any bits, or need any extra bits. Three of the 16 accumulators become the X, Y and SP. Instead of 16+3, as you have now, you just have 16. It's actually simpler (except for the assembler).

Acc15, for example, is the SP. You might choose also to call it Q.
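
As a sketch of the simple plan, here is a 16-entry register file in which three slots double as X, Y and SP. Only "Acc15 is the SP" comes from the post above; the slots chosen for X and Y, the 16-bit width, and all module and port names are assumptions for illustration.

```verilog
// Illustrative only: names, widths and the X/Y slot numbers are assumptions.
module regfile16 (
    input  wire        clk,
    input  wire        we,         // write enable for dst_reg
    input  wire [3:0]  src_reg,    // source operand select
    input  wire [3:0]  dst_reg,    // destination select
    input  wire [15:0] wdata,      // ALU result
    output wire [15:0] src_data,   // operand fed to the ALU
    output wire [15:0] x, y, sp    // fixed taps for addressing and stack
);
    localparam SEL_X = 4'd13,
               SEL_Y = 4'd14,
               SEL_S = 4'd15;      // Acc15 doubles as the stack pointer

    reg [15:0] r [0:15];           // one set of 16, no separate X/Y/SP

    assign src_data = r[src_reg];
    assign x        = r[SEL_X];
    assign y        = r[SEL_Y];
    assign sp       = r[SEL_S];

    // Any two-operand instruction can target any slot, X/Y/SP included.
    always @(posedge clk)
        if (we)
            r[dst_reg] <= wdata;
endmodule
```

In this arrangement, "add 8 to SP and put the result in X" is just src_reg = 15 and dst_reg = 13, with the ALU supplying the sum on wdata.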

Post by ElEctric_EyE »

I thought you might suggest getting rid of 3 Accumulators!
I will keep the possibility in the back of my mind. Maybe when the final is done, which I would like to be after I add in the INcrement/DEcrement opcodes for the Acc's, someone could do this? I am finding myself busy, busy, busy! Work is picking up as well, so there is less free time.

Post by BigEd »

Understood. In principle, it only changes about 4 lines of code.
Cheers
Ed

Post by ElEctric_EyE »

BigEd wrote:
... It would be good to see a settled final version of this core and this thread!)

Ed
Why is this your opinion?