65ORG16.c Core - 6502.org


ElEctric_EyE: Posts: 3260; Joined: 02 Mar 2009; Location: OH, USA

65ORG16.c Core


Post by ElEctric_EyE » Fri Apr 06, 2012 2:21 am

In the end, I would like this .c core to have a 16-bit databus / 32-bit address bus just like the .b core. It will have 16 registers to start with, maybe more if max speed timing permits.

--Consider this post a placeholder-- as there are a lot of things to consider and experiment with.

All registers/accumulators will have the full functionality of the Y register in the original NMOS 6502. Y is the most powerful and longest-reaching register because of the indirect indexed (zp),Y mode, and that mode will apply to all 16 accumulators/registers. Conversely, math/logic operations such as ADC, SBC and EOR, formerly done only on the accumulator, can be performed on any of them. So forget the difference between index registers and accumulators in this machine. Out with the old, in with the new, as they used to say. They are now one and the same. From now on I call them all Registers on this machine.

I alluded to this idea towards the end of the .b core thread. Arlet helped out here...
Also, there will be 16x16 multiplication opcodes.

BigEd: Posts: 11463; Joined: 11 Dec 2008; Location: England

Re: 65ORG16.c Core

Post by BigEd » Wed Apr 11, 2012 6:12 am

Hi EEye
I've been thinking about what I'd do, and I'd like to get my thoughts down. It would be great if the two of us could agree on this, although I know we differ at present.

I can see a way to have a regular instruction encoding and a simple extension of the assembly language syntax - very simple to explain and hopefully not too difficult to implement in an assembler. These are important points if we want a core which people will make use of.

I might take some presentation ideas from John West's 65020 document. Also, for register naming, I like Notch's idea from DCPU-16 of using letters which indicate the intended conventional use - even if all registers function the same way. (In that case, the v1.1 spec names A, B, C, I, J, X, Y, Z.)

So, my idea is:

16 registers, including A, X, Y and S
all can function as accumulators (for logic and arithmetic operations)
all can function as index registers
Perhaps name them A, B, C, D, E, F, G, H, I, J, K, V, W, X, Y, S
Long distance shift

The syntax of most instructions is very familiar, except A, X and Y can be replaced by any register letter

Code: Select all

LDA (zp),Y
LDB (zp),E
LDC (zp),W

For arithmetic we need a syntax extension to name the source and target register (which are always the same register)

Code: Select all

ADC B abs,W
EOR C zp,X
AND J (zp,F)

Long distance shift needs an extension too:

Code: Select all

ASL #5 X
ROR #6 abs
LSR #7 abs,D

The inter-register transfers look like yours:

Code: Select all

TXA
TYB
TVW
TAS

As for the instruction encoding, it's simple: two 4-bit fields in the top of the instruction. One selects the destination register for those opcodes which need it. The other selects the index register for those opcodes which need it. In the case of shifts, the shift distance is in the register field except for register shifts. In the case of inter-register transfers, the source register is in the index field. That's it.
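As a rough illustration of the two 4-bit fields described above, here is a minimal Python sketch. The exact bit positions (destination in bits 15..12, index in bits 11..8, base opcode in the low 8 bits), the register numbering, and the helper names `encode` and `decode` are all assumptions made for the example; the thread does not fix any of them.

```python
# Hypothetical layout of a 16-bit opcode word (an assumption, not from the thread):
#   bits 15..12  destination register (or shift distance)
#   bits 11..8   index register (or transfer source)
#   bits  7..0   base 6502-style opcode

REGS = "ABCDEFGHIJKVWXYS"  # the 16 suggested names; this numbering is assumed


def encode(base_opcode, dest="A", index="A"):
    """Pack a base opcode and two 4-bit register fields into one 16-bit word."""
    if not 0 <= base_opcode <= 0xFF:
        raise ValueError("base opcode must fit in 8 bits")
    return (REGS.index(dest) << 12) | (REGS.index(index) << 8) | base_opcode


def decode(word):
    """Split a 16-bit word back into (base opcode, dest name, index name)."""
    return word & 0xFF, REGS[(word >> 12) & 0xF], REGS[(word >> 8) & 0xF]


# LDB (zp),E: 0xB1 is the NMOS 6502 opcode for LDA (zp),Y, reused as the base.
word = encode(0xB1, dest="B", index="E")
print(hex(word), decode(word))
```

With this layout, an instruction that leaves both fields at zero keeps the plain 8-bit 6502 opcode value, which is one way the familiar A-only forms could stay unchanged.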

If we load all these features onto LDA, STA and TXA then we have the choice of freeing up existing opcodes for LDX, LDY, STX, STY and the other T opcodes. (I haven't thought that detail through!)

For the implementation I think it should be a matter of a 16-way mux directing the appropriate index register into play for each of the existing addressing modes, and a 2-way mux in the decoder. As we know, decode is not the critical path.

We gain a lot in simplicity and regularity. The resultant machine, and assembly code, still looks familiar to a 6502 assembly language programmer, and if they don't need all the registers they can use A, X and Y as before but they gain some more addressing modes. In fact a 4-register or 8-register version would make sense and just gives a few extra registers.

(Ah well, I didn't use John's presentation...)

What do you think?

Cheers
Ed

Last edited by BigEd on Wed Apr 11, 2012 6:34 am, edited 3 times in total.

Arlet: Posts: 2353; Joined: 16 Nov 2010; Location: Gouda, The Netherlands

Re: 65ORG16.c Core

Post by Arlet » Wed Apr 11, 2012 6:23 am

Hi Ed,

Looks good to me.

Quote:

For the implementation I think it should be a matter of a 16-way mux directing the appropriate index register into play for each of the existing addressing modes

You can even avoid this mux by adding a 'reg [3:0] index_reg'. During decode, you fill it with the appropriate value, and when it's time to access the index register, you only have to use that.
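A behavioural sketch of that suggestion, written in Python rather than Verilog: decode latches a 4-bit register number once, and the later address step simply reads the register file through it. The class name `Core`, the field position (bits 11..8) and the 32-bit address mask are assumptions for illustration only.

```python
class Core:
    """Toy model: decode latches a 4-bit register number, later steps reuse it."""

    def __init__(self):
        self.regs = [0] * 16   # 16 general registers, 16 bits each
        self.index_reg = 0     # plays the role of 'reg [3:0] index_reg'

    def decode(self, word):
        # Assumed layout: index register number in bits 11..8 of the opcode word.
        self.index_reg = (word >> 8) & 0xF

    def indexed_address(self, base):
        # Address step: no per-mode selection, just use the latched number.
        return (base + self.regs[self.index_reg]) & 0xFFFFFFFF  # 32-bit address


core = Core()
core.regs[4] = 0x0010      # register number 4 holds the index value
core.decode(0x14B1)        # index field of this word is 4
print(hex(core.indexed_address(0x2000)))
```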

Maybe you'd also want a long distance shift by register, otherwise a variable shift turns into a very slow loop. This could even be a fixed register to save opcode space.

BigEd: Posts: 11463; Joined: 11 Dec 2008; Location: England

Re: 65ORG16.c Core

Post by BigEd » Thu Apr 12, 2012 6:26 am

Arlet wrote:

Maybe you'd also want a long distance shift by register, otherwise a variable shift turns into a very slow loop. This could even be a fixed register to save opcode space.

Yes, it would need to be a fixed register (in almost all cases) because otherwise we have 12 bits of extra information to fit into 8 bits. But I haven't yet seen the use-case for variable shifts.

I realise now that the burden of writing (or updating) an assembler is quite big. There's an explosion of mnemonics which one would want to match by regexp, but probably existing assemblers won't work that way. Also the extra operand constitutes quite a change from 6502. In the case of the baseline 65Org16 we've stayed very close. Even 65Org32 will present a challenge to support the one-byte pointer, but hopefully that's quite an attractive machine for other reasons, so BitWise and teamtempest might be interested in supporting it. (Is this an attractive machine?)

The ideas from this machine can of course be taken forward to a 65Org32 variation: the extra 16 bits could supply a 15-bit signed operand (and a flag to indicate its presence) which would help with complaints about code density and memory bandwidth. Again, needs support from the assembler.

Cheers
Ed

teamtempest: Posts: 443; Joined: 08 Nov 2009; Location: Minnesota

Re: 65ORG16.c Core

Post by teamtempest » Fri Apr 13, 2012 12:20 am

Quote:

But I haven't yet seen the use-case for variable shifts.

Once I wrote a terminal emulator for the C64 which featured a 64 x 32 character matrix. I used a hires bitmap and a charset made up of 5 x 6 pixel characters. Because the bitmap arrangement did not make it easy to plot 5 x 6 characters, I usually had to shift them by 1-7 pixels before plotting. IIRC I did that by a single shift instruction repeated in a loop counted down by the X register. (Another way to do it, which I thought of much later, is to set up an indirect jump into a stack of seven shift instructions; the jump points at one and whatever follows it to the end of the stack.)

So yeah, in general when plotting fixed characters at an arbitrary position in a bitmap a variable shift would be another way:

Code: Select all

LDA charset,X
LDB xpos
AND B, #%0111
LSR B
STA bitmap,Y

...is one thought to plot one row of a character bitmap. I suppose the contents of the dedicated register would be destroyed as part of a countdown of some sort? So maybe something like this:

Code: Select all

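LDY #char_hgt   ; row counter
LDC xpos
AND C, #%0111   ; keep the low three bits of xpos as the shift distance
loop:
LDA charset,X
TCB             ; reload the distance, in case the shift destroys it
SRB ; if it's a dedicated register maybe a dedicated mnemonic ?
STA bitmap,Y
DEX
DEY
BPL loop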

But that's just an off-the-cuff thought.
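To make the use case concrete, here is a small Python sketch of plotting one glyph row at an arbitrary pixel column with a single variable-distance shift. The function name `plot_row`, the left-aligned 8-bit glyph row, and the simple linear row of bytes (not the C64's real bitmap layout) are assumptions for illustration.

```python
def plot_row(bitmap_row, glyph_row, xpos):
    """OR one 8-bit glyph row into a row of bitmap bytes at pixel column xpos."""
    byte_index = xpos >> 3               # which byte the left edge falls in
    shift = xpos & 0b0111                # the mask from the example above
    spread = (glyph_row << 8) >> shift   # 16-bit window, shifted in one step
    bitmap_row[byte_index] |= spread >> 8
    if spread & 0xFF and byte_index + 1 < len(bitmap_row):
        bitmap_row[byte_index + 1] |= spread & 0xFF   # spill into the next byte
    return bitmap_row


# A 5-pixel-wide row, left aligned in a byte, plotted at pixel column 5.
print([bin(b) for b in plot_row([0, 0], 0b11111000, 5)])
```

The single `>> shift` is the step that a shift-by-register instruction would do in constant time, instead of looping one bit at a time.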

This is kind of interesting:

Code: Select all

AND C, #%0111

because I find that fairly readable, but from an assembler standpoint it introduces an irregularity 'cause the first argument could be either a register or an expression. If it's an expression then the implied register is A. Hmm. Could you also write

Code: Select all

AND A, #%0111

if you wanted to be explicit? I don't see why not, and probably if I was implementing this I'd internally convert to this form before generating code (again off the top of my head).

Or you could tack the register onto the mnemonic directly:

Code: Select all

ANDC #%0111

or maybe:

Code: Select all

AND.C #%0111

would still be fairly easy to read, plus make the first argument always an expression (well, an expression with an address mode modifier). Kind of 68000-ish, but register names instead of memory sizes.
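For a sense of how little parsing the dotted form needs, here is a hedged Python sketch that splits a mnemonic such as `AND.C` into operation and register. The function name `split_mnemonic` and the register set (A to K, S, V to Y, taken from the naming proposal earlier in the thread) are assumptions; the undotted `ANDC` form is deliberately not handled.

```python
import re

# Assumed register letters: A..K, S, V, W, X, Y (one proposal from the thread).
MNEMONIC = re.compile(r"^([A-Z]{2,3})(?:\.([A-KSV-Y]))?$", re.IGNORECASE)


def split_mnemonic(text):
    """Return (operation, register) for 'AND.C'; register is None if undotted."""
    match = MNEMONIC.match(text.strip())
    if match is None:
        raise ValueError(f"not a mnemonic: {text!r}")
    op, reg = match.group(1).upper(), match.group(2)
    return op, reg.upper() if reg else None


print(split_mnemonic("AND.C"))
print(split_mnemonic("LDA"))
```

Because the register travels with the mnemonic, everything after it can be handed to the ordinary expression and address-mode parser unchanged.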

BigEd: Posts: 11463; Joined: 11 Dec 2008; Location: England

Re: 65ORG16.c Core

Post by BigEd » Fri Apr 13, 2012 4:12 am

Hi TT
I quite like the dot notation: it also solves the problem of ORA, which becomes

Code: Select all

OR.A
OR.B

and so on. Much like the 6502 case which allows for LSR and LSR A as two alternate forms, we can allow for a dotted or an undotted form in case anyone is a purist for three-letter mnemonics or a purist for always having a dotted register. I think the implicit form where AND means AND.A is too dangerous though. Doing without that removes the ambiguity it would introduce. (On the other hand, we've already solved LDx, STx and Txy so maybe ORx isn't a big deal.)

There's no need for the distance register to be modified by the shift (which is fixed-cost anyway, as we have a barrel shifter). We can standardise on D, and take your suggestion of bundling it into the opcode:

Code: Select all

SRD.x operand
SLD.x operand
RRD.x operand
RLD.x operand

as the four shift-by-distance operations. (This makes D special, which is a blow to compiler writers everywhere: they will have to avoid D, or avoid variable-distance shifts, or apply their ample ingenuity!)
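A minimal Python model of what the four shift-by-distance operations might compute on a 16-bit value. This is a sketch under assumptions: only the low four bits of D are used, and the rotates are plain 16-bit rotates that ignore the carry flag, whereas 6502 ROL/ROR rotate through carry. The lower-case function names simply mirror the proposed mnemonics.

```python
MASK = 0xFFFF  # 16-bit data width


def srd(value, d):
    """Logical shift right by the distance held in D (low four bits used)."""
    return (value & MASK) >> (d & 0xF)


def sld(value, d):
    """Shift left by D, discarding bits that leave the 16-bit word."""
    return (value << (d & 0xF)) & MASK


def rrd(value, d):
    """Rotate right by D within 16 bits (carry ignored in this sketch)."""
    d &= 0xF
    value &= MASK
    return ((value >> d) | (value << (16 - d))) & MASK


def rld(value, d):
    """Rotate left by D within 16 bits (carry ignored in this sketch)."""
    return rrd(value, 16 - (d & 0xF))


print(hex(srd(0x8000, 15)), hex(rld(0x8001, 1)))
```

None of the functions modifies `d`, which matches the point that the distance register does not need to be consumed by the shift.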

Cheers
Ed

ElEctric_EyE: Posts: 3260; Joined: 02 Mar 2009; Location: OH, USA

Re: 65ORG16.c Core


Post by ElEctric_EyE » Fri Apr 13, 2012 9:43 am

teamtempest wrote:

...So maybe something like this:

Code: Select all

LDY #char_hgt
LDC xpos
AND C, #%0111 
loop:
LDA charset,X
TCB
SRB ; if it's a dedicated register maybe a dedicated mnemonic ?
STA bitmap,Y
DEX
DEY
BPL loop

Currently in the .b version you can do something like this:

Code: Select all

LDY #char_hgt
LDC xpos
AND CopB, #%0111     ;AND C store it in B
loop:
LDA charset,X
;TCB
SR7     ; this can be a shift on A,B,C or D and stored in A,B,C or D
STA bitmap,Y
DEX
DEY
BPL loop

One way to get that #%0111 value 'into' a shift opcode would be some self-modifying code. Hard to do in an assembler?
Also, should the .c version support transpositional operations?

65Org16: https://github.com/ElEctric-EyE/verilog-6502

BigEd: Posts: 11463; Joined: 11 Dec 2008; Location: England

Re: 65ORG16.c Core

Post by BigEd » Fri Apr 13, 2012 12:28 pm

Hi EEye
The way my thinking developed, there was no room for the encoding of transpositional operators - it feels more desirable and certainly more 6502-like to me to use the 4 bit field to specify an index register or a shift distance.

Similarly, it feels better to me not to have the restriction of only 4 registers being able to take part in shift operations. Of course, it helps with encoding density, so it is a judgement call. I have really mixed feelings about nominating one register for the shift distance. I suppose we do have enough bits for all cases except for indexed addressing modes.

The biggest bang for buck though, to make an attractive core and get interest and adoption, is probably other things lacking in the base core (multiply, phx and phy, bsr) rather than these 16-register extensions... I should probably be thinking about that.

Cheers
Ed

Arlet: Posts: 2353; Joined: 16 Nov 2010; Location: Gouda, The Netherlands

Re: 65ORG16.c Core

Post by Arlet » Fri Apr 13, 2012 5:29 pm

Without register/register ALU operations, I think that 16 registers is probably already more than most programs will effectively use. Reducing it to 8 will free up a bit (or two) in the opcode space, without sacrificing too much.

BigEd: Posts: 11463; Joined: 11 Dec 2008; Location: England

Re: 65ORG16.c Core

Post by BigEd » Fri Apr 13, 2012 6:07 pm

Yes, the quickest fix might be 8 registers total, 4 of which can be indexes. (I'm not certain that works for all cases.) The other easy place to save bits is restrict shift distances to 4 choices: 1,2,4,8.
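The cost of the 1, 2, 4, 8 restriction can be sketched in Python: a 2-bit field selects the distance, and any distance from 0 to 15 is still reachable with at most four fixed shifts. The field-to-distance mapping and both function names are assumptions for illustration.

```python
def decode_distance(field):
    """Map a 2-bit field 0..3 to a shift distance of 1, 2, 4 or 8."""
    return 1 << (field & 0b11)


def shifts_for(distance):
    """List the 1/2/4/8 shifts whose sum gives any distance 0..15."""
    return [1 << bit for bit in range(4) if distance & (1 << bit)]


print([decode_distance(f) for f in range(4)])
print(shifts_for(5))
```

So a shift by 5 becomes a shift by 1 followed by a shift by 4: two opcode bits are saved per instruction at the price of extra instructions for distances that are not powers of two.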

Cheers
Ed

teamtempest: Posts: 443; Joined: 08 Nov 2009; Location: Minnesota

Re: 65ORG16.c Core

Post by teamtempest » Fri Apr 13, 2012 10:58 pm

Quote:

One way to get that #%0111 value 'into' a shift opcode would be some self-modifying code. Hard to do in an assembler?

Aw, all the best assemblers have a way to do that, unless perhaps the underlying cpu doesn't like that sort of thing.

But the code sample (such as it is) isn't really interested in the particular value '%0111'. That's just a mask to get the low three bits of the current X-position. Presumably that's the left edge of the character position (or more generally one edge of some rectangular bitmap that's going to be plotted somewhere in a larger bitmap). The low three bits will vary if arbitrary pixel positioning is allowed, hence the need for some way to shift the smaller bitmap by a variable amount.

There are lots of software ways to do that, but a hardware shift in constant time is attractive.

teamtempest: Posts: 443; Joined: 08 Nov 2009; Location: Minnesota

Re: 65ORG16.c Core

Post by teamtempest » Fri Apr 13, 2012 11:18 pm

Quote:

Without register/register ALU operations, I think that 16 registers is probably already more than most programs will effectively use. Reducing it to 8 will free up a bit (or two) in the opcode space, without sacrificing too much.

Quote:

Yes, the quickest fix might be 8 registers total, 4 of which can be indexes. (I'm not certain that works for all cases.)

Actually I have been wondering how I'd use all those registers. I'm not used to so many though, so maybe my imagination is limited. Still, that 64K "zero page" is an awful lot of "fast" registers already.

teamtempest: Posts: 443; Joined: 08 Nov 2009; Location: Minnesota

Re: 65ORG16.c Core

Post by teamtempest » Fri Apr 13, 2012 11:21 pm

Quote:

The biggest bang for buck though, to make an attractive core and get interest and adoption, is probably other things lacking in the base core (multiply, phx and phy, bsr) rather than these 16-register extensions... I should probably be thinking about that.

Not to be too picky, but shouldn't there also be PHB and PLB, PHC and PLC, etc? I'm all for BSR (and maybe BRA, although maybe that should be re-named)!

teamtempest: Posts: 443; Joined: 08 Nov 2009; Location: Minnesota

Re: 65ORG16.c Core

Post by teamtempest » Fri Apr 13, 2012 11:58 pm

Quote:

I quite like the dot notation

Thanks! It also has the advantage, at least for the HXA assemblers, of already being supported for macro names. I'd imagine a family would go something like this (assuming all registers can be used as either accumulators or indices):

Code: Select all

.macro ADC.A ?expr=@,?ndx=@
_do_adc $0, "?expr", "?ndx"
.endm

.macro ADC.B ?expr=@,?ndx=@
_do_adc $1, "?expr", "?ndx"
.endm

...more in this family...

.macro ADC.Y ?expr=@,?ndx=@
_do_adc $F, "?expr", "?ndx"
.endm

and then a "do-the-real-work" macro:

Code: Select all

.macro _do_adc ]bits, ]expr$, ]ndx$
.if ]expr$ == "@"
_bad_expr
.endif
...other error checks...
.if ]ndx$=="@"
_check_abs_zpg
.else if ]ndx$ ~ /^[ABCDEFGHIJKLMNXY]$/i
_do_abs_ndx
.else
...other cases...
.endif
.endm

That's just an outline of one way to do it, of course.

The alternative notation

Code: Select all

ADC B expr[,ndx]

is a bit trickier for HXA, I think, not least because HXA splits anything after the mnemonic (or macro name or pseudo op) based on commas. So "B expr" would be passed to any macro as a single item. Hmm. The relevant part of a macro definition might look like:

Code: Select all

.macro ADC ?expr=@, ?ndx=@
...
.if ]expr$ ~ /^[A-NXY][ \t]/
]reg$ = mid$(]expr$, 1, 1)
]expr$ = mid$(]expr$, 3)
.else
]reg$ = "A"
.endif
...
.endm

Ah, maybe it wouldn't be so bad after all as far as implementation went. Actually because that test would be common to many mnemonics I'd probably break it out as a nested macro (the "]name$"s are variable string labels with global scope, so fiddling with any of them in a nested macro affects their subsequent value in any other macro).

Also I assumed the space after the register designator could be a tab.
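The same register-prefix test can be sketched outside HXA. The following Python is only an illustration of the idea in the macro outline above, not of how HXA works internally: the function name `split_register` is hypothetical, and the default of "A" for a missing prefix follows the macro, even though that implicit form was called risky earlier in the thread.

```python
import re

# Register letters A..N, X, Y, as in the macro outline above.
PREFIX = re.compile(r"^([A-NXY])[ \t]+(.*)$", re.IGNORECASE)


def split_register(expr):
    """Split a 'B expr' style first operand into (register, rest); default 'A'."""
    match = PREFIX.match(expr)
    if match:
        return match.group(1).upper(), match.group(2)
    return "A", expr


print(split_register("B table+1"))
print(split_register("table+1"))
```

A label that starts with a register letter, such as `buffer`, is not mistaken for a register because the letter must be followed by a space or tab.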

I still think the ".reg" notation is easier to read. But that's one of the advantages of playing around with various notations via macros. Tells you something about how easy they'd be to read, write, and implement.

ElEctric_EyE: Posts: 3260; Joined: 02 Mar 2009; Location: OH, USA

Re: 65ORG16.c Core


Post by ElEctric_EyE » Sat Apr 14, 2012 12:30 am

teamtempest wrote:

...Still, that 64K "zero page" is an awful lot of "fast" registers already.

Ever since I got into FPGAs I've wondered about this. It was known that programs that could fit in 'zero page' ran faster. I'm not sure this applies to 6502 cores within FPGAs, because the delays in the higher addresses were due to delays in the silicon... Maybe someone more knowledgeable can confirm or deny this?

