Yet another assembler...

drogon · Post by **drogon** » Sun Nov 03, 2019 8:38 pm

hmn wrote:

- ca65 (cc65) only has listing output for its relocatables, so you only see placeholder addresses ("rr rr")

In defence of ca65 (mostly because it's what I use right now), it's really part of a "classic" Unix-style compiler/assembler/linker suite rather than a standalone assembler like a "normal" 8-bit implementation might be, so it's intended to assemble separate modules and then link them together to produce the final object file.

You can get fully resolved addresses in the listing when you switch relocation off, i.e. use a .org in your code. If your project is split across separate files, you'll then simply be including the source files rather than linking separate modules.

Some ca65 list output from my first stage bootstrap loader in Ruby:

Code:

000000r 1                       .org bootBase
000400  1               
000400  1               ; The usual reset initialisation
000400  1               ;       This really runs from $FF00
000400  1               
000400  1               reset:
000400  1  D8                   cld                     ; Standard 6502 reset code - We don't strictly need to do this, but ...
000401  1  78                   sei
000402  1  A2 FF                ldx     #$FF
000404  1  9A                   txs
000405  1               
000405  1  8E 30 FE             stx     VIA_DDRA
000408  1  A9 7E                lda     #$7E
00040A  1  8D 10 FE             sta     VIA_ORA
00040D  1               
00040D  1               ; Move ourselves from $FF00 to bootBase ($0400)
00040D  1               ;       We're < 256 bytes, so a single indexed loop is all we need
00040D  1               ;       and we'll just copy all 256 bytes ...
00040D  1               
00040D  1  A0 00                ldy     #$00
00040F  1               reloc:
00040F  1  B9 00 FF             lda     load0,y
000412  1  99 00 04             sta     bootBase,y
000415  1  C8                   iny
000416  1  D0 F7                bne     reloc
000418  1               
000418  1               check:
000418  1  B9 00 FF             lda     load0,y
00041B  1  D9 00 04             cmp     bootBase,y
00041E  1  D0 05                bne     copyErr
000420  1  C8                   iny
000421  1  D0 F5                bne     check
000423  1  80 05                bra     copyOk
000425  1               
000425  1               copyErr:
000425  1  AD 10 FE             lda     VIA_ORA
000428  1  80 FB                bra     copyErr
00042A  1               
00042A  1               copyOk:

Cheers,

-Gordon

dclxvi · Post by **dclxvi** » Mon Nov 04, 2019 5:20 am

Druzyek wrote:

Do you think it happens often in practice that an 8 bit symbol becomes 16 bit but could be shrunk back to 8 bit after other symbols are resolved?

How about this?

Code:

ORG $FD
   LDA $1FF-LABEL
LABEL

Pass 1: at LDA, LABEL isn't defined yet, so choose absolute addressing; at the definition of LABEL (on the next line), LABEL is $0100

Pass 2: absolute addressing was chosen in pass 1, but $1FF-LABEL is $00FF, so let's choose zero page addressing instead, and schedule pass 3; LABEL is $00FF

Pass 3: zero page addressing was chosen in pass 2, but $1FF-LABEL is now $0100, so absolute addressing must be chosen instead; LABEL is $0100 again

Pass 4: same as Pass 2, and around and around we go unless we put a stop to this somehow.
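The passes above can be replayed with a small model. This is only a sketch in Python, not code from any real assembler: it assumes LDA takes 2 bytes in zero page mode and 3 bytes in absolute mode, and the names `run_passes`, `lda_size` and `MAX_PASSES` are made up for illustration.

```python
# Hypothetical model of: ORG $FD / LDA $1FF-LABEL / LABEL
ORG = 0xFD
MAX_PASSES = 8  # arbitrary cap chosen for this sketch


def lda_size(operand):
    """2 bytes if the operand is known and fits in zero page, else 3."""
    if operand is not None and 0 <= operand <= 0xFF:
        return 2
    return 3


def run_passes(max_passes=MAX_PASSES):
    label = None  # LABEL is undefined before pass 1
    history = []
    for n in range(1, max_passes + 1):
        # Evaluate the operand with the value LABEL had after the previous pass.
        operand = None if label is None else 0x1FF - label
        size = lda_size(operand)
        new_label = ORG + size  # LABEL sits right after the LDA
        history.append((n, size, new_label))
        if new_label == label:
            return history, True  # nothing moved, so we are done
        label = new_label
    return history, False  # hit the cap without settling


history, converged = run_passes()
for n, size, label in history:
    mode = "zero page" if size == 2 else "absolute"
    print(f"pass {n}: {mode:9s} LABEL=${label:04X}")
print("converged" if converged else "gave up: sizes never settled")
```

Without the cap the loop would never exit for this input, which is the hang described above.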

Obviously, that is not a piece of code anyone is going to write. This looks pretty normal, though:

Code:

ORG $1000
   LDX #0
   LDA #$FF
.1 STA BUF+$1FF,X
   INX
   BNE .1
   JMP START
   DB "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF" ; padding
   DB "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF"
   DB "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF"
   DB "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF012"
BUF
   DS $300

That assembles without any trouble. Most of us don't usually type things in correctly the first time, though, and if I instead (mis)type ORG $000 and STA -BUF+$1FF,X then this is the same as the earlier situation.
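To check the arithmetic behind that claim, here is a rough Python sketch that adds up the bytes ahead of BUF. It assumes the usual 6502 instruction sizes (2 for the immediate loads and the branch, 1 for INX, 3 for JMP) and treats each DB string as one byte per character; `buf_address` and `mistyped_operand` are illustrative names only.

```python
CHUNK = "0123456789ABCDEF"
# Three full 64-character DB lines plus the shorter last one.
PADDING = len(CHUNK * 4) * 3 + len(CHUNK * 3 + "012")


def buf_address(org, sta_size):
    """Address of BUF given the origin and the size chosen for the STA."""
    # LDX #0, LDA #$FF, STA ...,X, INX, BNE .1, JMP START
    code = 2 + 2 + sta_size + 1 + 2 + 3
    return org + code + PADDING


def mistyped_operand(buf):
    """Value of -BUF+$1FF as a 16-bit quantity."""
    return (-buf + 0x1FF) & 0xFFFF


# As written: BUF lands on a page boundary.
print(hex(buf_address(0x1000, 3)))
# With both typos: the operand flips between zero page and absolute range.
for sta_size in (3, 2):
    buf = buf_address(0x000, sta_size)
    print(sta_size, hex(buf), hex(mistyped_operand(buf)))
```

With the 3-byte STA the operand fits in zero page, and with the 2-byte STA it no longer does, so neither choice is stable.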

Obviously, it's highly unlikely that you'd actually encounter such a situation (making those particular typos, BUF just happening to be on a page boundary, etc.), but if the planets ever do align, the thing to consider here is what the user will experience.

You certainly don't want the assembler to appear to hang (even in an extremely rare situation) merely because there are typos in the user's source code, so limiting the number of passes is wise.

With a traditional 2 pass assembler (use absolute addressing if a label isn't/wasn't yet defined in the first pass), the source code will assemble without errors, but the user (ultimately) will be able to find the typos after running the program and/or examining the list file output.
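For comparison, a minimal Python sketch of that traditional two-pass rule applied to the three-line example from earlier in this post. It is an illustration under stated assumptions, not any particular assembler's code: the size is frozen in pass 1, and `two_pass` is a made-up name. $AD is the absolute LDA opcode.

```python
LDA_ABS = 0xAD  # LDA absolute


def two_pass(org):
    # Pass 1: LABEL is defined after the LDA, so its value is unknown here
    # and the traditional rule reserves the 3-byte absolute form.
    lda_size = 3
    label = org + lda_size

    # Pass 2: the operand can now be evaluated, but the size stays frozen,
    # even though the value would have fitted in zero page.
    operand = (0x1FF - label) & 0xFFFF
    code = bytes([LDA_ABS, operand & 0xFF, operand >> 8])
    return label, code


label, code = two_pass(0xFD)
print(f"LABEL=${label:04X}", code.hex(" ").upper())
```

The result is one byte longer than it could be, but it always terminates, and the list file shows exactly what was emitted.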

What is the user interested in? Getting the program working, and the first step is finding the typos. There are a couple of good approaches. If you can catch the problem at assemble time and point the user to a specific line or lines, great. If that's too hard, it still helps to produce assembly output that visibly differs from what the user expected, which the user can then use to figure out why the unexpected result happened.

Druzyek · Post by **Druzyek** » Wed Nov 13, 2019 9:30 pm

The case where you have code on the border with zero page does seem pretty rare. I was thinking more about where a label is the input into an assembler function or macro which outputs code or data, and the size of that output depends on the label. If the label comes after the function, you could get an endless loop like you describe above. This might be a little harder to imagine, but when any arbitrary calculation is allowed on a label in a macro, the problem seems a lot more likely than code on the edge of zero page.

barnacle · Post by **barnacle** » Tue Nov 19, 2019 12:01 pm

Bah - I added high and low modifiers to my expression evaluator and killed the damn thing. Now it can't do +-*/ which is a pain. Oh well, shouldn't be too much trouble to sort... I didn't notice it at the time as my stress test didn't have any expressions in it.

As an aside, is there a generic N6502 and/or WDC65C02 stress test available anywhere? The obvious candidates are the Microsoft BASIC and fig-FORTH listings, but they're both N6502 only, I think.

Neil

John West · Post by **John West** » Tue Nov 19, 2019 12:59 pm

I'm not aware of a stress test specifically for assemblers. There are several for the instructions, of course. One difficulty would be the wide variety of syntaxes in use, and of the different features supported. For a fully comprehensive test, you'd really need a different one for each assembler. Or write a program that can parametrically generate a test targeting your particular assembler.

I've been using Microsoft BASIC and C64 KERNAL for mine, after some semi-automated massaging to get it into a format that my assembler will accept (the worst problem was their frequent use of leading 0 to mean 'octal', which I refuse to support). But that's far from perfect - this large complicated program, filling nearly a quarter of the 6502's addressable memory, never uses the (zp,X) addressing mode. Guess which addressing mode my assembler failed to parse correctly.
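As a rough idea of what a parametric generator could look like, here is a Python sketch that emits one LDA per NMOS addressing mode together with the bytes it should assemble to, so a mode like (zp,X) can't be missed by accident. The operand templates assume the common `$`-prefixed hex syntax and would need adapting per assembler; `LDA_MODES` and `generate_cases` are names invented for this example.

```python
# (mode name, operand template, LDA opcode, operand size in bytes)
LDA_MODES = [
    ("immediate",   "#${v:02X}",    0xA9, 1),
    ("zero page",   "${v:02X}",     0xA5, 1),
    ("zero page,X", "${v:02X},X",   0xB5, 1),
    ("absolute",    "${v:04X}",     0xAD, 2),
    ("absolute,X",  "${v:04X},X",   0xBD, 2),
    ("absolute,Y",  "${v:04X},Y",   0xB9, 2),
    ("(zp,X)",      "(${v:02X},X)", 0xA1, 1),
    ("(zp),Y",      "(${v:02X}),Y", 0xB1, 1),
]


def generate_cases(zp_value=0x12, abs_value=0x1234):
    """Return (mode, source line, expected bytes) for every LDA mode."""
    cases = []
    for mode, template, opcode, size in LDA_MODES:
        value = zp_value if size == 1 else abs_value
        source = "    LDA " + template.format(v=value)
        # Operands are stored low byte first.
        expected = bytes([opcode]) + value.to_bytes(size, "little")
        cases.append((mode, source, expected))
    return cases


for mode, source, expected in generate_cases():
    print(f"{source:20s} ; {mode}: {expected.hex(' ').upper()}")
```

The same table-driven shape extends to other mnemonics by adding more rows; the source lines go to the assembler under test and the expected bytes are compared against its output.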

johnwbyrd · Post by **johnwbyrd** » Wed May 13, 2020 7:24 am

This question is almost a FAQ for assembler authors. The "correct" answer is non-trivial.

https://eli.thegreenplace.net/2013/01/0 ... relaxation

BitWise · Post by **BitWise** » Wed May 13, 2020 8:55 am

johnwbyrd wrote:

This question is almost a FAQ for assembler authors. The "correct" answer is non-trivial.

https://eli.thegreenplace.net/2013/01/0 ... relaxation

I added JEQ, JNE, JCC, etc. to my assembler a few years ago. They generate a relative branch if possible, otherwise an opposite branch over a JMP. I was playing with a compiler for something and this simplified the code generation. The structured programming commands in my assembler also do this.
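A minimal Python sketch of that encoding choice, assuming the target address is already known. It is not BitWise's implementation: `encode_jcc` is a hypothetical helper, and the mnemonics beyond JEQ, JNE and JCC are guesses at what the "etc." covers. It relies on each 6502 conditional branch opcode differing from its opposite only in bit 5.

```python
# Pseudo-op -> opcode of the matching short branch (BEQ, BNE, BCC, ...)
BRANCH = {
    "JEQ": 0xF0, "JNE": 0xD0, "JCC": 0x90, "JCS": 0xB0,
    "JMI": 0x30, "JPL": 0x10, "JVC": 0x50, "JVS": 0x70,
}
JMP_ABS = 0x4C


def encode_jcc(mnemonic, pc, target):
    """Encode a conditional jump located at pc that goes to target."""
    opcode = BRANCH[mnemonic]
    offset = target - (pc + 2)  # relative to the byte after the branch
    if -128 <= offset <= 127:
        return bytes([opcode, offset & 0xFF])
    # Out of range: opposite branch skips over a 3-byte JMP to the target.
    return bytes([opcode ^ 0x20, 0x03, JMP_ABS, target & 0xFF, target >> 8])


print(encode_jcc("JNE", 0x0416, 0x040F).hex(" ").upper())
print(encode_jcc("JEQ", 0x1000, 0x2000).hex(" ").upper())
```

The first call reproduces the D0 F7 from the ca65 listing earlier in the thread. When the target is a forward reference the size is not known up front, which is where the multi-pass sizing problem comes back in.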

johnwbyrd · Post by **johnwbyrd** » Wed May 13, 2020 7:03 pm

Quote:

I added JEQ, JNE, JCC, etc. to my assembler a few years ago. They generate a relative branch if possible, otherwise an opposite branch over a JMP. I was playing with a compiler for something and this simplified the code generation. The structured programming commands in my assembler also do this.

Yes, this technique is as old as the 6502 itself -- Bill Gates used it.

That said, all instruction sets (not just the 6502) are full of cases where the addressing mode should change, based on the relative offset or the size of the destination pointer. For example, the 6502 has tons of instructions of the form PPP QQQQ,X where the opcode for PPP changes based on whether QQQQ is an 8 bit or a 16 bit address. A good assembler/linker combination will be able to choose the correct form, but will also be able to deal with the cascading effects of changing the length of one instruction, thus requiring other nearby instructions to change addressing modes as well.
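One common way to handle the cascade is a grow-only loop: start every branch in its short form, lay out the addresses, lengthen any branch that cannot reach, and repeat until nothing changes. The Python sketch below illustrates that idea under simplifying assumptions (a short branch is 2 bytes, the long form is the 5-byte inverted branch over a JMP, and the item format is invented here). It is not taken from any assembler mentioned in this thread.

```python
SHORT, LONG = 2, 5  # relative branch vs. inverted branch over a JMP


def relax(items, org=0):
    """items: ('label', name) | ('bytes', n) | ('branch', target_label)."""
    sizes = [SHORT if kind == "branch" else (arg if kind == "bytes" else 0)
             for kind, arg in items]
    passes = 0
    while True:
        passes += 1
        # Lay out addresses using the current size guesses.
        addr, pc, labels = [], org, {}
        for (kind, arg), size in zip(items, sizes):
            addr.append(pc)
            if kind == "label":
                labels[arg] = pc
            pc += size
        # Grow every short branch whose target is out of range.
        changed = False
        for i, (kind, arg) in enumerate(items):
            if kind == "branch" and sizes[i] == SHORT:
                offset = labels[arg] - (addr[i] + 2)
                if not -128 <= offset <= 127:
                    sizes[i] = LONG
                    changed = True
        if not changed:
            return sizes, labels, passes


# The first branch only just reaches 'end' until the second one grows.
items = [
    ("branch", "end"),
    ("branch", "far"),
    ("bytes", 125),
    ("label", "end"),
    ("bytes", 200),
    ("label", "far"),
]
sizes, labels, passes = relax(items)
print(sizes, labels, passes)
```

Because a branch is never shrunk again, every pass either lengthens at least one branch or stops, so the loop cannot oscillate. The price is that the result is not always the smallest possible encoding.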
