65Org16 Assembler (16-bit bytes, 32-bit address space)

teamtempest · Post by **teamtempest** » Tue Jun 28, 2011 4:29 am

Quote:

If you could post a short example of loader format

Well, I've implemented the "boost initial char" approach for Intel/Motorola 16- and 32-bit formats. If a short standard 8-bit Intel record for a 32-bit wide address space looks like this starting at $FFFFE000:

Code:

:02000004FFFFxx
:08E000000001020304050607xx
:08E0080008090A0B0C0D0E0Fxx
:00000001FF

then it'll look something like this for 16-bit "bytes"

Code:

;02000004FFFFxx
;10E0000000000001000200030004000500060007xx
;10E0080000080009000A000B000C000D000E000Fxx
;00000001FF

Notice:
- the "record start" character changes from ":" to ";"
- the "data byte count" remains octet-correct
- the "address offset" is 16-bit correct
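As a rough illustration of the three points above, here is a minimal Python sketch that builds one of these widened records. The names `checksum` and `wide_record` are hypothetical, not anything from HXA. Two things are assumed rather than stated in the post: the checksum is the usual Intel HEX two's-complement sum (the samples leave it as "xx"), and each 16-bit "byte" is written most-significant octet first, as the sample lines suggest.

```python
def checksum(octets):
    """Usual Intel HEX checksum: two's complement of the octet sum."""
    return (-sum(octets)) & 0xFF


def wide_record(offset, units, rec_type=0x00, unit_octets=2, start=";"):
    """Build one record whose data field holds wide "bytes".

    offset      -- address offset counted in wide "bytes", not octets
    units       -- list of integer values, one per wide "byte"
    unit_octets -- octets per "byte" (2 for the 16-bit case)
    """
    data = []
    for value in units:
        # Assumption: most-significant octet of each "byte" comes first.
        data.extend(value.to_bytes(unit_octets, "big"))
    # The count field stays octet-correct: it is the number of data octets.
    body = [len(data), (offset >> 8) & 0xFF, offset & 0xFF, rec_type] + data
    return start + "".join(f"{o:02X}" for o in body + [checksum(body)])


print(wide_record(0xE000, list(range(8))))
print(wide_record(0xE008, list(range(8, 16))))
```

Note how the second record's offset advances by 8 (the number of 16-bit "bytes") even though the first record carried 16 octets.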

Data lines get longer because I multiply the "chunk size" of a line (8-32, default 16) by the "byte size". That seemed to be the simplest way to avoid the possibility of "breaking" a "byte" across two lines, given that I allow "chunk" sizes to be odd-numbered values. However it does mean in this implementation the maximum number of octets rises to 32*4=128 (with a 32-bit "byte") and that in turn means up to 256 data chars.

Makes a pretty long output line.
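To put numbers on that, a small sketch of the arithmetic. The function name `record_length` is made up for illustration; the field widths are the standard Intel ones (one start character, two count characters, four offset characters, two type characters, two checksum characters).

```python
def record_length(chunk_size, byte_octets):
    """Return (data octets, total characters) for one data record.

    chunk_size  -- "bytes" per line (8-32 in the scheme described above)
    byte_octets -- octets per "byte" (1, 2 or 4)
    """
    octets = chunk_size * byte_octets   # the chunk is counted in "bytes"
    fixed = 1 + 2 + 4 + 2 + 2           # start, count, offset, type, checksum
    return octets, fixed + 2 * octets   # two hex characters per octet


for byte_octets in (1, 2, 4):
    print(byte_octets, record_length(32, byte_octets))
```

With a chunk size of 32 and a 32-bit "byte" this gives 128 octets, so 256 data characters plus the fixed fields.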

I still like the idea of boosting the record type, but if I pursue that I'll probably wait for the next version of HXA. The compiler likes to allocate things in 8K chunks and I'm bumping up against the end of one of those now. On the plus side, it's given me the opportunity to revise some code into shorter versions. More understandable, even.

teamtempest · Post by **teamtempest** » Tue Jun 28, 2011 4:39 am

Quote:

Oh and I changed the expression evaluation to 64-bit to allow .LONG to work properly.

*jealous* I can only manage 32 bits with my current compiler, so I can only do BYTE and WORD

teamtempest · Post by **teamtempest** » Tue Jun 28, 2011 5:02 am

Quote:

Hi Bitwise, that's looking good! Not sure about this bit tho:

Code:
00000014' 00000000 : .BYTE LO ($+2),HI ($+2)

This reminds me of something I've thought of only recently, that being the behavior of the "<" and ">" and "^" bit-extraction operators.

So far just about all the changes I've made have been confined to one source module, the "code generator". It turns out that nothing outside it really cares about how big a "byte" is (including the module that keeps track of the program counter, however non-intuitive that may seem).

Except, perhaps, these operators. I consider them short-hand for "shift and mask" expressions, that is, you can get the same effect by applying a shift followed by masking out unwanted bits (which is how they're implemented internally anyway).

As things stand I haven't changed their behavior. But that means using "^value" instead of ">value" to get the "high byte" (bits 31..16 of a 32-bit value instead of bits 15..08 of that same value).
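A minimal sketch of the unchanged operators written out as shift-and-mask. The function names are hypothetical. The bit ranges for ">" and "^" are the ones given above; bits 7..0 for "<" is assumed.

```python
def op_lt(value):
    # "<" : bits 7..0 (assumed)
    return value & 0xFF


def op_gt(value):
    # ">" : bits 15..8
    return (value >> 8) & 0xFF


def op_caret(value):
    # "^" : bits 31..16
    return (value >> 16) & 0xFFFF


value = 0x12345678
print(hex(op_lt(value)), hex(op_gt(value)), hex(op_caret(value)))
```

So with 16-bit "bytes", `op_caret` is the one that returns the upper "byte" of a 32-bit value, which is the convention nuisance described above.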

If I change their behavior to "expected" then they'll become useless in the case of 32-bit bytes (only "<" would work, and so what?) and there will be no built-in way of accessing any octet of a 32-bit value. If I don't change them there's the nuisance of learning a new convention.

Hmm, if I implemented LO() and HI() functions I could change their behavior to match the "byte size" without changing the existing operators. HI() would presumably always return zero in a 32-bit byte world, at least until I have 64-bit values to play with...

Any thoughts?

=======

Slow learner. Okay, seeing as these "shortcut" operators were implemented internally via function calls anyway, it's really not a whole lot of trouble to change the function that's called. Perhaps to something that is "byte size" aware...

So I did. "<" = least significant "byte", ">" = next-to-least significant "byte", "^" = "most significant word". They all work for 8-, 16- and 32-bit "bytes", but may return zero if what's asked for doesn't exist in this implementation (e.g., in the 32-bit world only "<" can be non-zero, 'cause there isn't anything larger. Kinda pointless, but at least consistent!).

You want individual octets within the "bytes"? Ah, it's shift-and-mask time in the 16- and 32-bit realms.
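A sketch of what "byte size" aware versions might look like, assuming 32-bit expression values. The names `field`, `lo`, `hi` and `octet` are made up for illustration and this is not HXA's internal code. "^" is left out because its exact width per "byte" size isn't spelled out above.

```python
VALUE_BITS = 32  # assumption: expressions are 32 bits wide


def field(value, shift, width):
    """Shift-and-mask; bits beyond the value width read as zero."""
    value &= (1 << VALUE_BITS) - 1
    return (value >> shift) & ((1 << width) - 1)


def lo(value, byte_bits):
    # "<" : least significant "byte"
    return field(value, 0, byte_bits)


def hi(value, byte_bits):
    # ">" : next-to-least significant "byte"
    return field(value, byte_bits, byte_bits)


def octet(value, index):
    # Any single octet, by explicit shift-and-mask (index 0 = lowest)
    return field(value, 8 * index, 8)


value = 0x12345678
for bits in (8, 16, 32):
    print(bits, hex(lo(value, bits)), hex(hi(value, bits)))
print(hex(octet(value, 2)))
```

With 32-bit "bytes" `hi` always comes back zero, which matches the "kinda pointless, but at least consistent" case.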

And...I think that's it. I can't think of any other basic code changes that need to be made. I can cut down the native 65xx version to make a preliminary 65Org16 native version, but that's all that might be left.

BigEd · Post by **BigEd** » Wed Jun 29, 2011 7:06 pm

This is all sounding great - do you have a code drop I could use?

BitWise · Post by **BitWise** » Wed Jun 29, 2011 10:40 pm

Highly experimental draft assembler and linker.

http://www.obelisk.demon.co.uk/files/65016.zip

Unzip and open a command prompt in the directory, then use NMAKE to assemble boot.asm and link it into output files. Linux lovers will need to tweak the Makefile a little. Needs a Java 1.6 JRE/JDK installed on the command path. (Type java -version to check for one.)

The binary and hex file outputs from the linker look OK but need more testing. The S19 output is bugged.

I'm off for some sleep

teamtempest · Post by **teamtempest** » Thu Jun 30, 2011 11:55 pm

Quote:

do you have a code drop I could use?

My plan at this point is to go back and make sure none of the changes has messed up anything else. A test suite is so helpful with that, but I think I'm going to add something new that verifies identical behavior with the existing version where tests haven't changed between versions. Just checking for matching error/non-error behavior has let little things slip through before.

If that doesn't turn up anything horrible then I want to document HXA as it now exists and post it as v0.180. With any luck, should be available this weekend. It's miserably hot in the workroom today (no A/C and a heat index over 100), so there might not be much progress before then.

teamtempest · Post by **teamtempest** » Tue Jul 05, 2011 5:19 am

Quote:

but I think I'm going to add something new that verifies identical behavior with the existing version where tests haven't changed between versions

Well...that turned out to be 36K lines of differences, mainly because I modified file listings. Hardly "identical". I only looked at the first 6K, found a couple of things I didn't like, and changed them.

Anyway, version 0.180 is up at

http://home.earthlink.net/~hxa

The only lapse is that the on-line copies of the test programs are not properly updated due to alphabetic case issues and the sheer number of changes I have yet to make (god I hate case-sensitivity). All the rest of the on-line documentation should be okay, and the various *.ZIP downloads should be fine.

BigEd · Post by **BigEd** » Tue Jul 05, 2011 10:19 pm

TT, Bitwise - thanks for the updated assemblers. I managed very little technical stuff over the weekend, but I do plan to port the hex loader to both syntaxes (unless a single syntax suits both)

I did get a listing from BitWise's assembler, from which I might have tried to extract a hex dump, but of course some of the hex is still zero at that point, so probably not worthwhile. I did that by using cpp to process the #defines, but really I'd want to rewrite them as labels.

Cheers
Ed

teamtempest · Post by **teamtempest** » Wed Jul 06, 2011 4:41 am

It's probably already occurred to you, but it's just occurred to me (slow learner, right?) that the 16/16 address format of an Intel 32-bit address works out really nicely as a 6502-style 2-byte pointer: record type '4's data field is just the high "byte" of the address, and record type '0's offset field is just the low "byte". They're even in the proper order as far as internal octets go, if I understand that correctly. Read 'em and store. Nothing to it! Set the Y-register to zero and you're all set to read and store data octets.

Or, hmm, set the low "byte" of the pointer to zero and set the Y-register to the offset. The only hazards would be the possibility of the offset + data bytes running over a 64K boundary, or the offset "wrapping" at that point. Those should never happen, but you never know when you'll run into a badly formed record.
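A rough sketch of that pointer arithmetic and the boundary hazard, assuming the 16-bit "byte" record format from earlier in the thread (count field in octets, offset in "bytes"). The function names are hypothetical.

```python
def full_address(upper, offset):
    """Combine a type-4 data field (upper) with a type-0 offset field."""
    return ((upper & 0xFFFF) << 16) | (offset & 0xFFFF)


def crosses_64k(offset, octet_count, unit_octets=2):
    """True if the record's data would run past the end of the 64K block."""
    units = octet_count // unit_octets   # count field is in octets
    return offset + units > 0x10000


print(hex(full_address(0xFFFF, 0xE000)))
print(crosses_64k(0xE000, 0x10))   # a well-formed record
print(crosses_64k(0xFFFC, 0x10))   # a badly formed one
```

A loader could run a check like `crosses_64k` on each data record before storing anything, and reject the record rather than let the offset wrap.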

We need not go into how I know about badly-formed records...

BitWise · Post by **BitWise** » Wed Jul 06, 2011 8:18 am

BigEd wrote:

I did get a listing from BitWise's assembler, from which I might have tried to extract a hex dump, but of course some of the hex is still zero at that point, so probably not worthwhile. I did that by using cpp to process the #defines, but really I'd want to rewrite them as labels.

If you use .ORG (or *=) to set an origin within the .CODE section then the assembler will generate absolute addresses (as in the example boot ROM). You still need to link to convert the .OBJ output into a binary or hex file.

You only get zeros in the listing when expressions are not resolvable without linking.

BigEd · Post by **BigEd** » Thu Jul 07, 2011 5:50 am

BitWise wrote:

You only get zeros in the listing when expressions are not resolvable without linking.

Ah, top tip. I shall perform the comparison!

TT: I immediately notice that the

Code:

  .string

directive is packing characters in a big-endian fashion, unlike .byte:

Code:

 0000:F81F  53 00 65 00         .string "Send 6502 code in"
 0000:F853  00 0D               .byte 13,10

Because I'll be talking to an 8-bit peripheral connected to the LSByte, I will be needing my strings to be little-end justified, which is what BitWise does. Is this something I can fix in source, or is this a bug report?!

(BitWise's tool will allow a string argument to a .byte directive:

Code:

0000F81F  00530065006E0064> :         .byte "Send 6502 code in"

but I think HXA does not.
)

Thanks to both of you, of course - I could use either tool as-is, if I had to, and both are a step up from the previous one.

Cheers
Ed

BigEd · Post by **BigEd** » Thu Jul 07, 2011 7:37 pm

Here's a curious one for you Bitwise: the loading of some binary constants isn't coming out as expected:

Code:

0000F97F  00A92B67          : INITSER lda #00011111b ;
0000F981  0085F000          :         sta $F000 ; 
0000F983  00A903F3          :         lda #00001011b ;

(It's very handy having two tools to compare!)
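One possible reading of those operands, offered as a guess rather than a diagnosis: the emitted values equal the digit strings read as decimal instead of binary. A quick check of the arithmetic:

```python
# (digits from the source line, operand value shown in the listing)
cases = (("00011111", 0x2B67), ("00001011", 0x03F3))

for text, emitted in cases:
    as_binary = int(text, 2)     # what the "b" suffix asks for
    as_decimal = int(text, 10)   # what the listing appears to contain
    print(text, hex(as_binary), hex(as_decimal), as_decimal == emitted)
```

If that reading is right, the intended operands would be $1F and $0B.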

teamtempest · Post by **teamtempest** » Thu Jul 07, 2011 11:23 pm

Quote:

TT: I immediately notice that the
Code:
.string

directive is packing characters in a big-endian fashion, unlike .byte:

Code:
0000:F81F 53 00 65 00 .string "Send 6502 code in"
0000:F853 00 0D .byte 13,10

Because I'll be talking to an 8-bit peripheral connected to the LSByte, I will be needing my strings to be little-end justified, which is what BitWise does. Is this something I can fix in source, or is this a bug report?!

".string" is actually doing what I told it to (and as documented), so if there's a bug it's a failure of my imagination. The sequence to me looks little-endian; it's the ".byte" sequence that looks big-endian.

I assume you initialized with a sequence like this:

Code:

.cpu T_32_L16
.assume BIT16=10, BIT32=1032

Try this instead:

Code:

.cpu T_32_M16
.assume BIT32=1032, BIT32R=3210

That should swap the order of octets in string characters. BIT16 (aka ".byte") will "naturally" have octet order 10 in an MSB-first machine, so it doesn't have to be changed. BIT32 (aka ".word") will "naturally" have octet order 3210, so it does have to be changed. If you never plan to use reversed order words you don't need the BIT32R assumption, but it's there for completeness (and won't hurt anything if present).

But it never occurred to me that ".string" should be affected by byte sequence changes. If strings are considered sequences of bytes, perhaps it should have.

Quote:

(BitWise's tool will allow a string argument to a .byte directive:
Code:
0000F81F 00530065006E0064> : .byte "Send 6502 code in"
but I think HXA does not.
)

True. OTOH, ".string" arguments can be either string or numeric expressions (which will get cut down to the least significant octet).

BigEd · Post by **BigEd** » Fri Jul 08, 2011 10:59 am

Hi TT
I tried changing the assume statements, but it changed the output code very significantly. As things stood, your tool and BitWise's are identical apart from the .string behaviour (and the binary constant.)

(This is parenthetical, because I think it's a diversion...) The first divergence is that

Code:

sta $F000

changes from

Code:

^0000:F8A2  00 85               .byte ]opc
^0000:F8A3  F0 00               .ubyte val(]adr$)

to

Code:

^0000:F8A6  00 8D               .byte ]opc+8
^0000:F8A7  F0 00 00 00         .uword val(]adr$)

To revisit the string behaviour: a string constant in the source is a series of octets, and we want it assembled into a series of (padded) 16-bit bytes. BitWise places the input octet into the LSB, and HXA presently places the input octet into the MSB.

That is, HXA produces:

Code:

0000:F81F 53 00 65 00 .string "Send 6502 code in"

where I expect to see

Code:

0000:F81F 00 53 00 65 .string "Send 6502 code in"

because when I load from 0000F81F, I expect to load 0053, an 'S'

That is, I don't think there's any word-order problem here, so no need to change the assume statement.

Cheers
Ed

teamtempest · Post by **teamtempest** » Fri Jul 08, 2011 11:00 pm

Okay, two separate issues.

Quote:

I tried changing the assume statements

Good but not enough. The ".cpu" statement changes as well, from

Code:

.cpu T_32_L16

to

Code:

.cpu T_32_M16

Changing from LSB to MSB is what swaps the octet order in string characters, as HXA will put the character octet in the least-significant position and zero-fill the rest. For a 16-bit "byte" LSB descriptor that becomes "XX 00", and for an MSB that becomes "00 XX".

Tested that again; works as described.
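A minimal sketch of that padding rule, assuming a 16-bit "byte". The function name `pad_char` is made up for illustration and is not part of HXA.

```python
def pad_char(ch, msb_first, unit_octets=2):
    """Return the octets emitted for one string character.

    The character octet sits in the least-significant position and the
    rest is zero-filled; msb_first only changes the emission order.
    """
    code = ord(ch) & 0xFF
    octets = code.to_bytes(unit_octets, "big" if msb_first else "little")
    return " ".join(f"{o:02X}" for o in octets)


print(pad_char("S", msb_first=False))  # LSB descriptor, e.g. T_32_L16
print(pad_char("S", msb_first=True))   # MSB descriptor, e.g. T_32_M16
```

The first line is the "53 00" from the listing in question; the second is the "00 53" being asked for.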

Now for an LSB descriptor HXA by default extracts octets in the order 0->01->012->0123, for 8-, 16-, 24- and 32-bit quantities respectively. For 16-bit values you want the order "10" and for 32-bit quantities the order "1032", which is why the ".assume" is used to specify those. Reversed 32-bit quantities should come out "3210", but they already do by default, so there's no need to change that.

For an MSB descriptor the default extractions are 0->10->210->3210. The 16-bit value is in the proper order, but the 32-bit values, normal and reversed, are not, so they need to be changed via ".assume".
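A small sketch of how such an order string can be read, assuming each digit names an octet of the value with 0 as the least significant. The function name `emit` is hypothetical and this is not HXA's own code.

```python
def emit(value, order):
    """Emit the octets of value in the sequence named by an order string."""
    octets = []
    for digit in order:
        shift = 8 * int(digit)          # octet 0 is the least significant
        octets.append((value >> shift) & 0xFF)
    return " ".join(f"{o:02X}" for o in octets)


print(emit(0x008D, "10"))        # BIT16=10
print(emit(0x0000F000, "1032"))  # BIT32=1032
print(emit(0x12345678, "3210"))  # BIT32R=3210
print(emit(0x12345678, "0123"))  # default 32-bit order, LSB descriptor
```

The first two calls reproduce the "00 8D" and "F0 00 00 00" octet sequences in the listing quoted earlier in the thread.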

That should take care of the first issue.

Quote:

(This is parenthetical, because I think it's a diversion...)

Actually it's not. It's a bug of sorts. In the file "i6502.a" there is this:

Code:

        .if cpu$() ~ /L16/    ; 16-bit "byte" (and 65K "zero page") ?
abs_mask    .equ    $FFFF0000
        .else               ; assume 8-bit byte (and 256-byte "zero page")
abs_mask    .equ    $FFFFFF00
        .endif

Look at the listing. "abs_mask" has the value $FFFFFF00, because changing the descriptor from "_L16" to "_M16" causes the match to be false and so "abs_mask" gets the second value, not the first.

Thus the address $0000FF00 becomes non-zero when masked, and is taken as absolute, rather than zero page.

Changing the code to this:

Code:

        .if cpu$() ~ /[LM]16/    ; 16-bit "byte" (and 65K "zero page") ?
abs_mask    .equ    $FFFF0000
        .else               ; assume 8-bit byte (and 256-byte "zero page")
abs_mask    .equ    $FFFFFF00
        .endif

should be one way to make the match succeed and the proper value given to "abs_mask".
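To see the effect outside the assembler, here is a Python sketch of the same test. It assumes that `cpu$()` returns the descriptor name given to ".cpu" and that the "~" match behaves like an unanchored regular-expression search. The function names are made up for illustration.

```python
import re


def abs_mask(cpu_name, pattern):
    """Mimic the .if test: pick the mask by matching the CPU descriptor."""
    return 0xFFFF0000 if re.search(pattern, cpu_name) else 0xFFFFFF00


def is_zero_page(address, mask):
    """An address is "zero page" when no masked bits are set."""
    return (address & mask) == 0


for pattern in ("L16", "[LM]16"):
    mask = abs_mask("T_32_M16", pattern)
    print(pattern, hex(mask), is_zero_page(0xF000, mask))
```

With the original pattern, $F000 is treated as absolute under T_32_M16, which is consistent with the opcode changing from 85 to 8D in the listing above.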