6502.org • View topic - 65Org16 Assembler (16-bit bytes, 32-bit address space)

All times are UTC

65Org16 Assembler (16-bit bytes, 32-bit address space)

Page 1 of 9

[ 132 posts ]

BigEd

Post subject: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Sat May 28, 2011 9:47 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

This is a placeholder: I hope to catalogue some approaches and solutions to assembling code for the 65Org16 family.

Update: we have two fully featured assemblers! Thanks to teamtempest and BitWise!

Approaches available today

HXA assembler supports the 65Org16

Andrew Jacobs' experimental pre-release

py65

My 6502js fork: basic assembler

Possible approaches in the foreseeable future

HXA

family

Feel free to add suggestions and link back to previous discussions: I'll update this head post.

Cheers
Ed

Last edited by BigEd on Thu Nov 29, 2012 10:46 am, edited 7 times in total.

teamtempest

Post subject:

Posted: Mon May 30, 2011 2:14 am

Joined: Sun Nov 08, 2009 1:56 am
Posts: 411
Location: Minnesota

Code:

 
 *= $FFFFE000                 ;START   COPY PATTERN $AA55 
                              ;FROM $10000000 TO $FFFEFFFF (($FFFE X 10000) + FFFF) 
                              ; 
FFFFE000   LDA #$0000         ;00A9 0000 
FFFFE002   STA $0000          ;0085 0000 
FFFFE004   LDA #$1000         ;00A9 1000 
FFFFE006   STA $0001          ;0085 0001 
FFFFE008   LDX #$FFFE         ;00A2 FFFE 
FFFFE00A   LDY #$0000         ;00A0 0000 
FFFFE00C   LDA #$AA55         ;00A9 AA55 
FFFFE00E   STA ($0000),Y      ;0091 0000 
FFFFE010   INY                ;00C8 
FFFFE011   BNE FFFFE00E       ;00D0 FFFB 
FFFFE013   INC $0001          ;00E6 0001 
FFFFE014   DEX                ;00CA 
FFFFE015   BNE FFFFE00E       ;00D0 FFF6 
FFFFE017   JMP FFFFE017       ;004C E017 FFFF 

I quoted this from another thread; trying to be "on topic" for this thread.

My immediate question regarding the above code is exactly what the physical arrangement of the bytes is. Let's take the last line:

Code:

FFFFE017 JMP FFFFE017 ;004C E017 FFFF

From the looks of it, the 32-bit address is arranged as low-16/high-16. So far so good. But what about the...let's call them 8-bit nybbles, shall we?...within the 16-bit bytes? In other words, is the actual physical arrangement what is shown above, or is it perhaps:

Code:

4C 00 17 E0 FF FF

Or even something else?
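To make the two candidate layouts concrete, here is a minimal Python sketch. It is not taken from any assembler discussed in this thread, and `words_to_octets` and `hex_octets` are hypothetical helper names. It flattens the three 16-bit words of the JMP into octets, once with the low octet of each word first and once with the high octet first.

```python
def words_to_octets(words, lsb_first=True):
    """Flatten 16-bit words into a list of octets in the chosen order."""
    octets = []
    for w in words:
        lo, hi = w & 0xFF, (w >> 8) & 0xFF
        octets.extend([lo, hi] if lsb_first else [hi, lo])
    return octets

def hex_octets(octets):
    """Format octets as space-separated two-digit hex."""
    return " ".join(f"{b:02X}" for b in octets)

# JMP $FFFFE017 as listed: opcode word, low address word, high address word.
jmp = [0x004C, 0xE017, 0xFFFF]

print(hex_octets(words_to_octets(jmp, lsb_first=True)))   # 4C 00 17 E0 FF FF
print(hex_octets(words_to_octets(jmp, lsb_first=False)))  # 00 4C E0 17 FF FF
```

Either stream describes the same three words; the two only differ once the words have to be split into octets.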

teamtempest

Post subject: Re: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Mon May 30, 2011 2:42 am

Joined: Sun Nov 08, 2009 1:56 am
Posts: 411
Location: Minnesota

BigEd wrote:

Possible approaches in the foreseeable future

Regarding the macro approach, one reason I mentioned the "HXA_T" version of my assembler is that it knows absolutely nothing about any instruction set. All it knows about processors are program counter width (8 to 32 bits) and byte orientation (MSB-first or LSB-first). The only way to get it to assemble an instruction set is via macros (some demos of this are included).

I use it as a test bed for everything HXA can do that doesn't rely on any particular instruction set (in a sense HXA65 is a specialization of HXA_T that fixes program counter width and byte orientation, then adds an instruction set).

However I am beginning to see that there is in HXA a bias that "bytes" are eight bits long. A quick and dirty approach to providing an assembler for a 16-bit "byte" machine would be to simply tell it that a 16-bit value has a size of one as far as the program counter is concerned (and 32-bit values have a size of two). This is fairly easy but modifies a part of the assembler that I had always thought of as fixed.

This might be sufficient, actually, if 16- and 32-bit values can be output strictly as what in 8-bit byte terms are considered LSB values. If 16-bit values are output as MSB and 32-bit values are output as low-16 MSB followed by high-16 MSB, there's a bit more re-arrangement that has to be done which HXA doesn't natively know how to do right now.

But the macro approach in this case would work because the re-arrangement could be done within the macro. For example (disregarding for the moment multiple address modes and how to account for the program counter thinking of "word" as size two instead of one):

Code:

.cpu T_32_M

.macro LDA, ?addr
.word $00A9
.word ?addr
.word ^(?addr)
.endm

The annoying part is that "?addr" has to be evaluated twice. In a strictly LSB-first world the macro would look something like this:

Code:

.cpu T_32_L

.macro LDA, ?addr
.word $00A9
.long ?addr
.endm

GARTHWILSON

Post subject:

Posted: Mon May 30, 2011 4:12 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California

teamtempest wrote:

My immediate question regarding the above code is exactly what the physical arrangement of the bytes is. Let's take the last line:

Code:

FFFFE017 JMP FFFFE017 ;004C E017 FFFF

Code:

4C 00 17 E0 FF FF

Or even something else?

I think the only time it would matter is with 16-bit ROMs (as opposed to pairs of 8-bit ROMs which could go either way as long as you put them in the right sockets). I'd say 004C... instead of 4C00..., just as we have 4C in the 6502 and not C4 for the JMP.

BigEd

Post subject:

Posted: Mon May 30, 2011 7:53 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

Yes: there is no ordering of the bytes within a 16-bit word, until or unless you have to deal with shipping 8-bit bytes into a 16-bit wide memory. That wouldn't happen inside a purely 16-bit wide world. It will happen when loading a program over a serial link - but such formats explicitly come in big or little endian sub-formats. It also happens if using a pair of byte-wide ROMs, as Garth says. I don't think it should arise in the context of an assembler, because those half-words don't even have addresses. We must write them MSB to LSB - it would be perverse not to! The opcode for LDA immediate is $00A9. There is no assigned function for $A900.
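As an illustration of the byte-wide ROM case, the sketch below splits a list of 16-bit words into two 8-bit images, one holding the low octets and one the high octets. The function name and the labels "low" and "high" are assumptions made for the example; which socket each image goes in is a hardware decision.

```python
def split_for_byte_wide_roms(words):
    """Return (low_image, high_image) for a pair of 8-bit ROMs."""
    low_image = bytes(w & 0xFF for w in words)
    high_image = bytes((w >> 8) & 0xFF for w in words)
    return low_image, high_image

# LDA #$AA55 from the listing earlier in the thread: 00A9 AA55
low_image, high_image = split_for_byte_wide_roms([0x00A9, 0xAA55])
print(low_image.hex(" ").upper())   # A9 55
print(high_image.hex(" ").upper())  # 00 AA
```

Each ROM image has one octet per 16-bit address, so neither half-word ever needs an address of its own.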

Cheers
Ed

BigEd

Post subject: Re: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Mon May 30, 2011 7:57 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

teamtempest wrote:

...I am beginning to see that there is in HXA a bias that "bytes" are eight bits long. A quick and dirty approach to providing an assembler for a 16-bit "byte" machine would be to simply tell it that a 16-bit value has a size of one as far as the program counter is concerned (and 32-bit values have a size of two). This is fairly easy but modifies a part of the assembler that I had always thought of as fixed.

This sounds promising to me! It's more or less what I did with Dave B's assembler. I didn't have to modify much at all.

I don't fully follow your examples, but it seems worthwhile to avoid evaluating parameters twice. You mention "strictly LSB-first world" - as we're inheriting strongly from 6502, I think that is the world we're in.

Cheers
Ed

teamtempest

Post subject: Re: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Thu Jun 02, 2011 10:49 pm

Joined: Sun Nov 08, 2009 1:56 am
Posts: 411
Location: Minnesota

BigEd wrote:

teamtempest wrote:

Sorry; phone troubles have left me without an Internet connection for several days.

As far as the example macros go, they're in HXA's language. The HXA_T variant accepts CPU names of the form "T_PC_B", where "PC" is the number of bits in the program counter and "B" is either "M" for most-significant-byte first or "L" for least-significant-byte first. "PC" sets address limits, and "B" determines how multi-byte values are written to output files.

So given, say, "LDA 89ABCDEF", the "T_32_M" version would output this sequence:

00 A9 AB 89 EF CD

and the "T_32_L" version would output this sequence:

A9 00 EF CD AB 89

If the macro was written as:

.cpu T_32_M

.macro LDA, ?addr
.word $00A9
.long ?addr
.endm

the output sequence would be:

00 A9 89 AB CD EF

It's that "mixmaster" thing the very first version does with 32-bit values that HXA doesn't know how to do natively. If that's what the output sequence actually should be, that's one of the things an assembler will need to learn.
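The two `.long` cases can be reproduced with a small Python model of "write a value of N bits in a given octet order". This is only a sketch of the idea, not HXA's code; `emit` and `lda_long` are hypothetical names, and the opcode $00A9 is the one used in the post.

```python
def emit(value, bits, msb_first):
    """Return the octets of a value of the given width in the chosen order."""
    octets = [(value >> (8 * i)) & 0xFF for i in range(bits // 8)]  # LSB first
    return octets[::-1] if msb_first else octets

def lda_long(addr, msb_first):
    """Model of the macro body: .word $00A9 followed by .long addr."""
    return emit(0x00A9, 16, msb_first) + emit(addr, 32, msb_first)

def show(octets):
    return " ".join(f"{b:02X}" for b in octets)

print(show(lda_long(0x89ABCDEF, msb_first=False)))  # A9 00 EF CD AB 89
print(show(lda_long(0x89ABCDEF, msb_first=True)))   # 00 A9 89 AB CD EF
```

In this model the octet order is a property of the output step alone, so a further order would be one more case in `emit` rather than a change to the macro.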

BigEd

Post subject: Re: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Fri Jun 03, 2011 8:56 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

Hi TT

teamtempest wrote:

So given, say, "LDA 89ABCDEF", ... the "T_32_L" version would output this sequence:

Code:

A9 00 EF CD AB 89

OK, and in fact (I now see) that makes good consistent sense.

(Edit: as EE pointed out later, I read this as opcode 00AD, LDA absolute, which does take a 32-bit argument. Please adjust the following text accordingly)

You're loading the 'byte' at $89AB_CDEF, so you need to see 'bytes' in the order

'opcode', 'CDEF', '89AB'

and each of those 'bytes' you are outputting in least-significant-octet-first order, which is all internally consistent. Given the right choice of Intel Hex format that might well be fine, and if it works without extra work that's great!

If we're ever tempted to group the pairs of octets into 16-bit 'bytes' I think they must be presented as

Code:

00A9 CDEF 89AB

but as long as the assembler and the loader agree on what's meant, we're all OK.

(It's not uncommon for serialisation to be LSB-first...)
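On the loader side, the regrouping amounts to pairing octets back into 16-bit 'bytes'. The following Python sketch assumes the stream is least-significant-octet first, as in the T_32_L output quoted above; `octets_to_words` is a hypothetical name, not part of any existing loader.

```python
def octets_to_words(octets, lsb_first=True):
    """Pair up octets into 16-bit words."""
    if len(octets) % 2:
        raise ValueError("octet stream must have an even length")
    words = []
    for i in range(0, len(octets), 2):
        first, second = octets[i], octets[i + 1]
        words.append((second << 8) | first if lsb_first else (first << 8) | second)
    return words

stream = bytes.fromhex("A900EFCDAB89")  # octets as the assembler wrote them
words = octets_to_words(stream)
print(" ".join(f"{w:04X}" for w in words))  # 00A9 CDEF 89AB
```

As long as the assembler and loader use the same `lsb_first` setting, the words that reach memory are the same.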

Cheers
Ed

Edit: oops, I'd swapped the two 16-bit values in the final example!

Last edited by BigEd on Fri Jun 03, 2011 3:28 pm, edited 2 times in total.

ElEctric_EyE

Post subject: Re: 65Org16 Assembler (16-bit bytes, 32-bit address space)

Posted: Fri Jun 03, 2011 11:31 am

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA

teamtempest wrote:

...So given, say, "LDA 89ABCDEF", the "T_32_M" version would output this sequence:

00 A9 AB 89 EF CD
...

Now you've gone to 32 bits? Not sure what you're saying here...
But, if you're trying to say you want to express LDA $89ABCDEF, it would have to look like:

Code:

LDA $89ABCDEF      ;00AD CDEF 89AB

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

BigEd

Post subject:

Posted: Fri Jun 03, 2011 12:02 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

Hi EE
I just had to edit my post because I'd scrambled my final example.

The 32 bits is fine, because it's an address.(*)

Edit: I'm still sure we're going to want to use the L version of HXA, not the M version.

Cheers
Ed

(*)Edit: Because I was thinking of 00AD, LDA absolute, all along.

Last edited by BigEd on Fri Jun 03, 2011 3:29 pm, edited 1 time in total.

ElEctric_EyE

Post subject:

Posted: Fri Jun 03, 2011 12:34 pm

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA

Ah, it's good we're talking about this! It's sometimes hard to wrap one's head around a new idea. I'm learning here too, but this is what I understand:

In immediate LDA, you can only load a 16-bit value, not 32 bits. That opcode is $00A9.

In absolute LDA, you can load a 16-bit value from a 32-bit address. That opcode is $00AD. You can also load a 16-bit value from a 16-bit address (zero page). That opcode is $00A5.
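The three LDA forms can be summarised as a small table of opcode and operand size, counted in 16-bit words. This Python sketch only restates the opcodes given above; the table layout and the name `length_in_words` are assumptions for illustration.

```python
# (opcode, operand words) for the three LDA forms named above.
LDA_FORMS = {
    "immediate": (0x00A9, 1),  # LDA #$AA55     : one 16-bit value
    "zero page": (0x00A5, 1),  # LDA $0001      : one 16-bit address
    "absolute":  (0x00AD, 2),  # LDA $89ABCDEF  : 32-bit address, two words
}

def length_in_words(form):
    """Instruction length: one opcode word plus the operand words."""
    _opcode, operand_words = LDA_FORMS[form]
    return 1 + operand_words

for form, (opcode, _) in LDA_FORMS.items():
    print(f"{form:10s} ${opcode:04X}  {length_in_words(form)} words")
```

Since each word occupies one address, these lengths are also the amounts by which the program counter advances.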

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502

BigEd

Post subject:

Posted: Fri Jun 03, 2011 3:24 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

ah, right. This should be 00AD then, I think! I didn't bother to look up(*) what the opcodes actually are, but there's no # and there's a 32-bit operand, so that seems most likely.

Cheers
Ed

Edit: I've annotated my posts, to reduce confusion without rewriting history.

(*) Edit: Embarrassing - surely I knew that A9 is LDA immediate.

teamtempest

Post subject:

Posted: Fri Jun 03, 2011 11:41 pm

Joined: Sun Nov 08, 2009 1:56 am
Posts: 411
Location: Minnesota

Hmm, screwed up again. I did mean LDA absolute, not LDA immediate. I even looked up an on-line reference to make sure I had the right opcode...apparently I found an unreliable source!

There is of course no possibility I mis-read it :roll:

Anyhow...as has been pointed out, the HXA_T variant of HXA does not understand any instruction set, but several of the demos that come with it implement 65xx instruction sets as macros. I note in passing that writing the macros was instructive enough that I incorporated what I learned back into HXA65, the variant that understands these instructions natively, to make it more efficient at doing that.

One approach to creating a new assembler would be to modify the file 'i6502.a', which contains macros implementing the NMOS 6502 instruction set. It would be simple if it was just a matter of replacing all the 'BIT08' pseudo ops with 'BIT16' and all the 'BIT16' pseudo ops with 'BIT32', but there is still the problem that the program counter should advance only half as fast as it does. So some modification of the operand fields would also be necessary (for those instructions which have operands, anyway).

I thought of retarding the PC within each macro by using the last instruction to set it back half the number of 8-bit bytes generated, but there is a limitation within HXA which currently makes this possible only 1023 times (so programs couldn't be more than 1000 or so instructions long). Also I haven't entirely worked out how this would affect relative branch calculations.
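For the relative branch question, here is one possible calculation in Python when the program counter counts 16-bit words. It assumes, by analogy with the 6502, that the operand is a signed 16-bit offset from the address following the two-word branch instruction; `branch_operand` is a hypothetical name. The first BNE in the listing quoted earlier (at $FFFFE011, target $FFFFE00E, operand $FFFB) is consistent with that assumption.

```python
def branch_operand(branch_addr, target_addr):
    """16-bit relative operand for a two-word branch, counted in words."""
    offset = target_addr - (branch_addr + 2)  # relative to the next instruction
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("branch target out of range")
    return offset & 0xFFFF  # two's complement in 16 bits

print(f"{branch_operand(0xFFFFE011, 0xFFFFE00E):04X}")  # FFFB
print(f"{branch_operand(0xFFFFE000, 0xFFFFE010):04X}")  # 000E
```

If the PC were instead advanced in 8-bit units and then set back inside each macro, the same formula would need both addresses halved first.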

Mmm, and also there are data storage pseudo ops like 'STRING', 'HEX' and so on. Presumably these should be modified somehow, either natively or via macro, to always output some multiple of 16 bits.

The main reason I'm harping so much on 'what bytes in what order?' is that I've largely come to view an assembler as a tool for doing exactly that: specifying what bytes in what order. Internally HXA simply maintains a sequence of [type, value] pairs, where 'type' is usually one of the '-BIT--' pseudo ops and 'value' is a 32-bit integer. It's only at output time that HXA uses the '-BIT--' type to determine what bytes of each value to extract and in what order.

If you look at the 'i6502.a' and related instruction set files closely, you'll see that that's what all the macros amount to. If you look at the file 'a_ins65x.awk' (the only difference between HXA_T and HXA65), you'll see that essentially it's doing the same thing, spitting out ['-BIT--', value] pairs (much faster, of course).

Having just written this, it's finally occurred to me that there's absolutely nothing to stop me from defining new '-BIT--' types that have the desired properties - 16-bit values having size one as far as the PC goes, for instance - while keeping all the current types. A particular native assembler or macro instruction set would use just the types it was interested in.
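The idea of [type, value] pairs with per-type properties can be modelled in a few lines of Python. This is a sketch of the concept only, not HXA's internals: the `BitType` class, its fields, and the type names `W16` and `W32` are hypothetical, while `BIT16` and `BIT32` stand in for the existing 8-bit-oriented types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BitType:
    name: str
    bits: int      # width of the value written to the output
    pc_size: int   # how far the program counter advances

# Existing style: sizes counted in 8-bit bytes.
BIT16 = BitType("BIT16", 16, 2)
BIT32 = BitType("BIT32", 32, 4)
# Hypothetical new types: sizes counted in 16-bit 'bytes'.
W16 = BitType("W16", 16, 1)
W32 = BitType("W32", 32, 2)

def pc_after(start, items):
    """Advance the PC over a sequence of (type, value) pairs."""
    pc = start
    for bit_type, _value in items:
        pc += bit_type.pc_size
    return pc

old_style = [(BIT16, 0x00AD), (BIT32, 0x89ABCDEF)]
new_style = [(W16, 0x00AD), (W32, 0x89ABCDEF)]
print(f"{pc_after(0xFFFFE000, old_style):08X}")  # FFFFE006
print(f"{pc_after(0xFFFFE000, new_style):08X}")  # FFFFE003
```

Both sequences carry the same 48 bits; only the PC accounting differs, so the existing types can stay as they are.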

So...16-bit values, opcodes or operands, are not much trouble. There are only two choices (2!) for outputting each 8-bit 'nybble'. But there are 24 ways (4!) to output a 32-bit value in terms of 'nybbles' (if I counted correctly). HXA is agnostic when it comes to this sort of thing; it currently knows internally only two of the 24 ways but that's only because other orders haven't been made known to it. There's no fundamental difficulty with implementing other orders, it's just a question of exactly which one(s).
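The counts can be checked by enumeration. This short Python sketch lists every ordering of the octets of a 16-bit and a 32-bit value; the sample values are arbitrary and `orderings` is a hypothetical helper.

```python
from itertools import permutations

def orderings(value, bits):
    """All distinct octet orderings of a value of the given width."""
    octets = [(value >> s) & 0xFF for s in range(bits - 8, -8, -8)]  # MSB first
    return sorted({" ".join(f"{b:02X}" for b in p) for p in permutations(octets)})

print(len(orderings(0x00AD, 16)))       # 2
print(len(orderings(0x89ABCDEF, 32)))   # 24
print(orderings(0x89ABCDEF, 32)[:3])    # the first few, in sorted order
```

The orders discussed in this thread, such as `EF CD AB 89`, `89 AB CD EF`, and `CD EF 89 AB`, are all among the 24.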

The answer to that question is what I'm after. What does the proposed CPU expect to see? If it's this, as EE suggests:

Quote:

LDA $89ABCDEF ; 00AD CDEF 89AB

then this macro (also shown earlier) creates it:

Code:

.cpu T_32_M      ; MSB-first order

.macro LDA, ?addr
.word $00AD      ; opcode
.word ?addr      ; lo 16 bits
.word ^(?addr)   ; hi 16 bits
.endm

Though a native version would do this faster, and also not have to evaluate '?addr' twice.
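A native encoder could evaluate the address once and then split it, along these lines. This Python sketch is an assumption about how such a routine might look, not code from HXA; `lda_absolute` is a hypothetical name, and the word order (opcode, low 16 bits, high 16 bits) follows EE's example.

```python
def lda_absolute(addr):
    """Encode LDA absolute as three 16-bit words: opcode, low word, high word."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address does not fit in 32 bits")
    low_word = addr & 0xFFFF
    high_word = addr >> 16
    return [0x00AD, low_word, high_word]

words = lda_absolute(0x89ABCDEF)
print(" ".join(f"{w:04X}" for w in words))  # 00AD CDEF 89AB
```

The operand expression is computed a single time, and the split into words happens on the resulting integer.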

GARTHWILSON

Post subject:

Posted: Sat Jun 04, 2011 2:52 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California

Quote:

Mmm, and also there are data storage pseudo ops like 'STRING', 'HEX' and so on. Presumably these should be modified somehow, either natively or via macro, to always output some multiple of 16 bits.

UTF-8 might be a nice thing to drop in.

BigEd

Post subject:

Posted: Sat Jun 04, 2011 9:46 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England

GARTHWILSON wrote:

Quote:

Mmm, and also there are data storage pseudo ops like 'STRING', 'HEX' and so on. Presumably these should be modified somehow, either natively or via macro, to always output some multiple of 16 bits.

UTF-8 might be a nice thing to drop in.

On this point, I think STRING should output one 'byte' for each octet in the source, on the assumption that the string in question will be output to an 8-bit peripheral and that memory is cheap. If my source contains 'Thanks to André Fachat' then that will be 22 or 23 'bytes' depending on the source encoding: I would not recommend that an assembler writer get distracted by all the arcana of text encodings.
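A sketch of that STRING behaviour in Python: each octet of the source text becomes one 16-bit 'byte' with a zero upper half, and the assembler does no re-encoding of its own. The function name is hypothetical, and the `source_encoding` parameter only stands in for whatever encoding the source file happens to use.

```python
def string_words(text, source_encoding):
    """One 16-bit word per source octet, upper 8 bits zero."""
    return list(text.encode(source_encoding))

name = "Andr\u00e9"  # 'André', written with an escape to keep this file ASCII

latin1 = string_words(name, "latin-1")
utf8 = string_words(name, "utf-8")
print(len(latin1), " ".join(f"{w:04X}" for w in latin1))  # 5 words, ends 00E9
print(len(utf8), " ".join(f"{w:04X}" for w in utf8))      # 6 words, ends 00C3 00A9
```

The word count therefore tracks the octet count of the source file, which is why the same text can assemble to different lengths.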

On the other hand, HEX... well maybe HEX should be the same, and we need a HEX16 to specify a stream of 'byte' sized constants. Or maybe HEX takes a stream of 'byte' sized constants, not a stream of octets. That would be consistent.
