6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sun Sep 29, 2024 5:28 pm

All times are UTC




Post new topic Reply to topic  [ 35 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
 Post subject:
PostPosted: Wed Jan 23, 2008 12:39 am 
Offline
User avatar

Joined: Thu Mar 11, 2004 7:42 am
Posts: 362
blargg wrote:
This is assuming you manually allocate zero-page, rather than just use .zeropage and let the assembler do it. For a list of addresses, I do like your idea of an "address list" mode.


I was once among the many 6502 programmers who were in the habit of defining zero page usage with EQUs rather than DS (uninitialized storage, it just advances the ORG address) with an ORG on the zero page. Bad habit or not, the former is pretty common.

However, for zero page locations that are input/output parameters of ROM routines it makes sense to use EQUs, e.g.

Code:
ROMSUB1PARM1 = $FD
ROMSUB1PARM2 = $FE
ROMSUB1      = $F000
ROMSUB2PARM1 = $FB
ROMSUB2      = $F123


blargg wrote:
So for the stack-pointer-in-direct-page-register example, you'd just do LDA <5 to use direct-page addressing mode.


That's less typing, but it still has to be done numerous times. Starting from the "Example Implementing strtok()" section in the wiki (just after the tsc-tcd concept is introducted), by my quick count, there are 9 instances (LDA, CMP, etc. but not STA/STX/STY) where you'd have to do this. All but one block of assembly code (which only contains two JSRs and several .word directives) from that point in the article contains at least one instance. And these are intentionally short and simple examples. Most procedures you write are going to have at least one instance.

Another example is that address-address can be either a constant or an address, e.g.

Code:
     LDX #0
LOOP LDA STR,X
     JSR SUB
     INX
     CPX #STR_END-STR  ; address - address = constant
     BCC LOOP
     RTS
STR  .byte "some string"
STR_END


vs.:

Code:
DP_BASE = $1000     ; address
VAR1    = DP_BASE   ; address
VAR2    = DP_BASE+2 ; address

SUB1
;
; Set up VAR1 for later
; The D register doesn't point at DP_BASE yet
;
     ASL
     STA VAR1          ; uses abs (or long) addressing here
     RTS
SUB2
;
; Set the D register to DP_BASE (e.g. for speed)
;
     LDA #DP_BASE
     TCD
;
; Start by copying VAR1 to VAR2
;
     LDA VAR1-DP_BASE ; address - address = dp address
     STA VAR2-DP_BASE


That's not the only way to do either of those things, but I think both examples are reasonable approaches.

kc5tja wrote:
Therefore, maybe it's high time that the 6502 and 65816, perhaps the most non-orthogonal architectures on the face of the planet today, just drop the idea of "addressing modes" all-together, and instead switch to explicit mnemonics for everything. Truely, this would make writing assemblers a *LOT* easier.


You can make a case for either. I would argue against putting the zp/abs (or zp/abs/long on the 65816) distinction into the instruction mnemonic, even if the ,X ,Y (,X) (),Y etc. are. It's not as big a deal on the 65816, but on the 6502/65C02, if someone's example is written for abs and I want to use zp (or vice versa), I'd prefer to just make the single change of LABEL = $1000 to LABEL = $00 rather than edit numerous instructions. Yeah, this could be handled with a boatload of macros (one for each instruction) that check if the argument (label) is > $FF, for those variables that can be either abs or zp. But that seems kinda one step forward, one step back to me.

Speaking of macros, they can be simpler without the distinction in the instruction, e.g.

Code:
MOV2A MAC    ; 6502/65C02 (.n are the macro parameters)
      LDA .1
      STA .2
      LDA 1+.1
      STA 1+.2
      EOM


This can handle any combination of abs, zp, abs,X, zp,X, and abs,Y. When the distinction is in the instruction, with really good macro capability, the macro could be invoked with something like MOV2A AX VAR1 AX VAR2 and then the macro would append the AX to the LDA and STA to form LDAAX and STAAX, but that's still a more complicated macro.

The assembler only has to be written once. I'm not sure that making the assembler itself easy to write is all that important.

kc5tja wrote:
Code:
    ldxiw 16                        ; LDXIW = LDX #nnnn
nextBit:                            ; LDXIB = LDX #nn
    lsrd multiplier                 ; it'd do away with .a8/.i16 etc.


Some assemblers avoid the .a16/.a8 nonsense by using # to indicate 8-bit immediate data and ## to indicate 16-bit immediate data, e.g.

Code:
LDA #0  ; assembles A9 00
LDA ##0 ; assembles A9 00 00


GARTHWILSON wrote:
That's why my Forth assembler is that way. It made it much easier.

(snip)

The whole assembler is small enough to keep in memory all the time.


Um, 6502 assemblers usually are pretty simple in Forth in any case.

I wrote an ugly 6502 prefix assembler (I should've just made it postfix and I should've implemented labels differently) which is both a simplification and an expansion of the FIG-Forth 6502 assembler. There are colon definitions for JMP and JSR, and only 4 defining words: one for addressing modes, one for implied instructions, one for relative branch instructions, and one for the remaining instructions. It includes branch range checking, and ORG capability, i.e. ORG need not equal HERE (it was intended as a cross assembler for the NMOS 6502). It doesn't include forced zp or abs, though that would be really simple to add (probably only a few dozen bytes or so). It only takes 775 bytes of dictionary space and 690 bytes of head space (heads are separated in this particular indirect-threaded Forth). A significant percentage of it consists of using the defining words to define the instructions (ADC, AND, etc.) It's not as simple as what you're doing, but as poorly designed as it is, it was still pretty easy to do.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 7:58 am 
Offline

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1041
Location: near Heidelberg, Germany
For what ir's worth, the xa64 assembler (shamless self plug :-)
has address segments for labels. So it knows whether a label
is in zeropage, text (code), data or bss (uninitialized).
IIRC it actually issues warnings in certain types of situations
(e.g. adding addresses, but not substracting addresses, which is
useful)

segments are started with pseudo opcodes (e.g. ".text", ".zero")
and assigning addresses to labels works using the "*" as "PC"
in the assignement or by using the label name in front of the
opcode. Not using "*" results in an absolute value.

e.g.

Code:
.text
dataval=15
label1=*
label2 LDA #<dataddr1
       LDX #dataval
.data
dataddr1 .byt 1,2,4



just my .02€
André

[/code]


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:58 am 
Offline
User avatar

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
There have been a number of assemblers that have used this approach in the past. Lee's site has a listing of a KIM-1 program that uses suffixed instructions and the original source of Microchess does as well - although not for absolute instructions.
http://www.themotionstore.com/leeedavison/6502/memoryplus/21.html
http://www.floodgap.com/retrobits/ckb/secret/uchessasm.txt
I did some work for Acorn in the late 1980's (a simple database system squeezed into a 16K ROM) and used their macro assembler which also worked this way. It certainly compiled very quickly compared to other assemblers on the BBC but I can't say I was all that keen on suffixed opcodes. Whilst they make the job of determining the addressing mode easier for the assember (especially for indirect instructions) I think the original notation expresses the addressing action of the more clearly.

At some time or other I probably have missed a # out of a immediate instruction but such mistakes are pretty easy to spot and I'm not sure all the additional baggage needed to allow the assembler to spot it is really worth it. There are plenty of other ways I could make mistakes that break my code that the assembler can't spot, does this one really need special treatment?

I can see the value of using suffixed opcodes in a Forth assembler but why have you opted to have address arguments after the operand rather than before (as in normal Forth code)? I would have thought that something like ...
Code:
: LDAIM, ( n -- ) A9 C, C, ; IMMEDIATE

FF LDAIM,

... would have been more in the spirit of Forth.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 10:31 am 
Offline

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 1041
Location: near Heidelberg, Germany
BitWise wrote:
At some time or other I probably have missed a # out of a immediate instruction but such mistakes are pretty easy to spot and I'm not sure all the additional baggage needed to allow the assembler to spot it is really worth it. There are plenty of other ways I could make mistakes that break my code that the assembler can't spot, does this one really need special treatment?


xa65 uses the pseudo opcodes to put the code into different sections of
the binary "o65" file, which is relocatable - so the assembler needs to
have an idea which labels are addresses and which are not.

The checks are a nice byproduct

André


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 5:27 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
BitWise wrote:
I can't say I was all that keen on suffixed opcodes. Whilst they make the job of determining the addressing mode easier for the assember (especially for indirect instructions) I think the original notation expresses the addressing action of the more clearly.


In my experience, I positively have to disagree with you. I find LDA (dp,X) and LDADX dp to be completely isomorphic.

The only disadvantage is if you use a macro like:

foo .macro
lda .1
sta .2
.endm

then you could conceivably use something like, foo (dp,X) (dp),Y and have it "just work."

However, I have never actually encountered a need for this level of support.

If you really want to discuss what was more descriptive of what was going on, this syntax would have excelled:

a := @dp+X
y+@dp := a

where @ is a monadic operator that means "the contents of" (kind of like * in C). To qualify things like @2 for direct page, absolute, or long, perhaps you would use different operators (e.g., * for absolute) or use type-casts (e.g., long(@2)).

Hmm...I guess I'll have to give that a whirl now to see what that would look like:

Code:
.export unsignedMultiply_by_
.proc unsignedMultiply_by_
    ; Multiply two unsigned 16-bit integers to produce a 32-bit result.
    ;
    ; Inputs:
    multiplicand = multiplier+2     ; One term
    multiplier = rpc+2              ; The other term
    ;
    ; Returns:
    result = multiplicand+2         ; 32-bit result

    push(a, x, y, d)
    rpc = regA+2
    regA = regX+2
    regX = regD+2
    regD = multiplicandHi+2
   
    push(0)
    multiplicandHi = 1

    ; If you use parsing rules similar to the J programming language,
    ; you could write the following like this:  d := a := s
    a := s
    d := a

    @(long(result)) := 0
    x := 16
    loop {
      @multiplier /= 2
      if(carry) {
        @(long(result)) := @(long(multiplicand)) + @(long(result))
      }
      @(long(multiplicand)) *= 2
      x := x - 1
    until(zero)

    pull(a, d, x, a)
    return
.endproc


I will have to play with this concept a bit more.

Quote:
I can see the value of using suffixed opcodes in a Forth assembler but why have you opted to have address arguments after the operand rather than before (as in normal Forth code)?


Laziness to some extent. Writing:

Code:
: opc   create , does> @ c, ;


was simpler than:

Code:
: opc0   create c, does> @ c, ;
: opc1   create c, c, does> dup @ c, 1+ @ c, ;
: opc2   create c, , does> dup @ c, 1+ @ ,abs ;
: opc3   create c, , does> dup @ c, 1+ @ ,long ;
: opcR   create c, does> @ c, ,pc-rel8 ;


then making sure everything lined up nicely.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:44 pm 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
Quote:
but why have you opted to have address arguments after the operand rather than before (as in normal Forth code)?

Because I used a commercially available Forth metacompiler for a couple of years which did the assembly in postfix, and I absolutely hated it. However that might have been because I had already done so much "regular" assembly programming. After all my time on postfix assembly, it still looks like Greek to me. The visual factoring is not there, and making sense of it does not come as naturally. I like Forth a lot, but not postfix assembly.

BTW, instead of using
Code:
: LDAIM, ( n -- ) A9 C, C, ; IMMEDIATE

which takes 22 bytes in IDT Forth including the header, nest and unnest, the address of LITERAL's runtime, and the alignment byte, (or 20 bytes without the second C, ), you can use one of Forth's strengths and make a defining word COMP_OP (compile op code) whose use when you're compiling the assembler would look like
Code:
COMP_OP  LDA#  A9

The defining word scans the input stream for the name (LDA# in this case), lays down the header, followed by the CFA which points to COMP_OP's runtime comp_op, followed by the one-byte op code which is also found by scanning ahead in the input stream. When you're assembling, comp-op lays down the A9. It is similar to a one-byte CONSTANT but compiles the A9 instead of putting it on the stack. It only takes three bytes plus the header. This eliminates nest, unnest, lit, 00A9, and C, (plus a second C, if you still want postfix assembly) making the assembler faster and a couple thousand bytes shorter.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:48 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
I know this is wildly off-topic now, but I couldn't resist. Garth, your approach is fundamentally correct, but way too complicated. Use CREATE and DOES> -- it factors everything out for you.

Code:
: COMP   CREATE , DOES> C@ C, ;

HEX
A9 COMP LDA#


In fact, I took this one step farther in my assembler's opcode table:

Code:
: ~    CREATE DUP , 1+ DOES> C@ C, ;
: ||   ~ ~ ~ ~  ~ ~ ~ ~ ;

0
||  brk,  ora(,x), cop,   ora,s,     tsb,      ora,    asl,    ora[],   ( 0x )
||  php,  ora#,    asla,  phd,       tsb@,     ora@,   asl@,   ora@l,
||  bpl,  ora(),y, ora(), ora(,s),y, trb,      ora,x,  asl,x,  ora[],y, ( 1x )
||  clc,  ora@,y,  inca,  tcs,       trb@,     ora@,x, asl@,x, ora@l,x,
||  jsr@, and(,x), jsr@l, and,s,     bit,      and,    rol,    and[],   ( 2x )
||  plp,  and#,    rola,  pld,       bit@,     and@,   rol@,   and@l,
||  bmi,  and(),y, and(), and(,s),y, bit,x,    and,x,  rol,x,  and[],y, ( 3x )
||  sec,  and@,y,  deca,  tsc,       bit@,x,   and@,x, rol@,x, and@l,x,
|| ...etc...


Makes life MUCH easier. :-)


Last edited by kc5tja on Wed Jan 23, 2008 8:52 pm, edited 1 time in total.

Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:50 pm 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
True, but using the assembly-language runtime makes it run faster than doing it purely in Forth with >DOES. Hmmm... I think there's an assembly equivalent for >DOES , but it has been so long since I've used it that I can't remember. Time for review. My work hasn't taken me down this particular road in a few years.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:53 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
You posted before I could finish my edit. Re-read my post above; I've added something to it.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Wed Jan 23, 2008 8:55 pm 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8521
Location: Southern California
Ah yes, even shorter.


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Thu Jan 24, 2008 6:35 pm 
Offline

Joined: Tue Nov 23, 2004 2:11 pm
Posts: 25
Location: London, UK
kc5tja wrote:
That's true only for the first time someone sees ADDR+ notation. You'll find that as its popularity spreads,

I wouldn't have said anything but I believe in voting for something you want to keep.

FWIW I prefer the old syntax, and not just for inertia's sake. Addressing modes are an 'intellectual convenience' I'd like to keep, as they are a mnemonic in themselves. (All-alphabetic mnemonics won't save you from composing an illegal combination.)

If I had to have something then instead of a paranoid ADDR+ notation I'd rather have an optional @ prefix that can be stripped out for older assemblers and merely suppresses the warning, like if((foo=bar)).


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Thu Jan 24, 2008 7:38 pm 
Offline

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
debounce wrote:
FWIW I prefer the old syntax, and not just for inertia's sake. Addressing modes are an 'intellectual convenience' I'd like to keep, as they are a mnemonic in themselves.


That is their syntactic point -- the idea is to make assembly language a bit more like a high-level language.

Quote:
(All-alphabetic mnemonics won't save you from composing an illegal combination.)


I vehemently disagree. If you know you're accessing direct page, you know a priori that the 'd' appears in the suffix. It becomes utterly automatic. Currently, the instruction LDA 5 is ambiguous in any 6502 or 65816 assembler. Is it direct-page? Well, it's not really an issue on the 6502, because direct page is zero page. But on the 65816, it matters immensely.

From inertia, any absolute address less than 256 is typically assembled as a direct page reference. If you want otherwise, you need to remember to prefix the address with (in ca65's case) A: (e.g., LDA A:5) to force an absolute addressing mode. There is a similar prefix for forcing a long-address too. Conversely, there is also a Z: prefix that forces direct-page as well. Many assemblers won't share this precise syntax, but the fact that they exist at all stands testament to the inherent ambiguity of the assembly syntax.

The RISC approach, where you have one opcode, one mnemonic, eliminates outright this ambiguity. Therefore, there is zero need for such prefixes, and utterly no need for assemblers to "guess" what addressing mode you really wanted to use in their absence. The only disadvantage is that it makes macros a wee bit less flexible.

Quote:
optional @ prefix that can be stripped out for older assemblers and merely suppresses the warning, like if((foo=bar)).


How can an optional @ prefix be stripped out? Can you give an example of how this would work?


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Fri Jan 25, 2008 1:17 am 
Offline
User avatar

Joined: Thu Mar 11, 2004 7:42 am
Posts: 362
GARTHWILSON wrote:
Hmmm... I think there's an assembly equivalent for >DOES , but it has been so long since I've used it that I can't remember.


I believe the word you are thinking of here is ;CODE (note the semicolon).


Top
 Profile  
Reply with quote  
 Post subject:
PostPosted: Fri Jan 25, 2008 6:04 pm 
Offline

Joined: Tue Nov 23, 2004 2:11 pm
Posts: 25
Location: London, UK
kc5tja wrote:
debounce wrote:
FWIW I prefer the old syntax, and not just for inertia's sake. Addressing modes are an 'intellectual convenience' I'd like to keep, as they are a mnemonic in themselves.


That is their syntactic point -- the idea is to make assembly language a bit more like a high-level language.
Not sure what you're saying here. You mean "the idea [of bracket-and-comma] is..." or "The idea [of all-alphabetic] is..."? If the latter, then reducing mnemonics to a simple label for an opcode byte (and a width specifier for the following operand) pushes the assembly language closer to machine code. As evidenced by the Forth examples quoted earlier.

kc5tja wrote:
Quote:
(All-alphabetic mnemonics won't save you from composing an illegal combination.)


I vehemently disagree. If you know you're accessing direct page, you know a priori that the 'd' appears in the suffix. It becomes utterly automatic. Currently, the instruction LDA 5 is ambiguous in any 6502 or 65816 assembler. Is it direct-page? Well, it's not really an issue on the 6502, because direct page is zero page. But on the 65816, it matters immensely.

From inertia, any absolute address less than 256 is typically assembled as a direct page reference. If you want otherwise, you need to remember to prefix the address with (in ca65's case) A: (e.g., LDA A:5) to force an absolute addressing mode. There is a similar prefix for forcing a long-address too. Conversely, there is also a Z: prefix that forces direct-page as well. Many assemblers won't share this precise syntax, but the fact that they exist at all stands testament to the inherent ambiguity of the assembly syntax.
That's ambiguity, and I completely agree. What I meant was non-orthogonality. INC abs,Y is an illegal instruction (the only one that's bitten me.) It's just as easy to assume INCAY abs exists.

Quote:
The RISC approach, where you have one opcode, one mnemonic, eliminates outright this ambiguity. Therefore, there is zero need for such prefixes, and utterly no need for assemblers to "guess" what addressing mode you really wanted to use in their absence. The only disadvantage is that it makes macros a wee bit less flexible.

And the mnemonics aren't such great mnemonics any more. Some people work better with brackets and commas.
kc5tja wrote:
Quote:
optional @ prefix that can be stripped out for older assemblers and merely suppresses the warning, like if((foo=bar)).


How can an optional @ prefix be stripped out? Can you give an example of how this would work?

The @ prefixes an otherwise bare operand of an instruction to indicate it is intended as an absolute address. Any addressing mode notation # () [] , A: Z: is enough to show the intention (or non-intention) and @ is forbidden. Where @ is allowed and the assembler does not generate a warning (depending on the sophistication of its heuristics), @ is optional. @ has nothing to do with ADDR notation, but there's no reason why the two couldn't be used in conjunction (though I'd rather there was a shorter and less shouty symbol than "ADDR+".)

Assuming the assembler only implements @ notation:
Code:
       
        LDY     const   ; give warning
        LDY     @const  ; opcode AC or A4
        LDA     <addr   ; give warning as < is not an addressing mode specifier
        LDA     @<addr  ; opcode A5 as <addr less than 256
        LDA     A:addr  ; opcode AD
        LDA     #<addr  ; opcode A9
        STA     const   ; opcode 8D, unambiguous
        STA     @const  ; opcode 8D

@ notation is appropriate when the problem is framed as "# being left off instructions." ADDR notation is an appropriate response to "constants being used as addresses" (in which case STA const should give a warning.) I prefer @ notation as ADDR doesn't cover the case of an address being used as an address when it should have been used as a constant.

We'll see, I suppose, which syntax wins out.


Top
 Profile  
Reply with quote  
 Post subject: With tongue in cheek...
PostPosted: Fri Jan 25, 2008 6:44 pm 
Offline

Joined: Tue Nov 23, 2004 2:11 pm
Posts: 25
Location: London, UK
In the meantime, if you're stuck with an 'old' assembler, you could always implement the new syntax yourself:
Code:
LDAIM   MAC
        .byte   $A9
        .byte   .1
        EOM

LDAAB   MAC
        .byte   $A5
        .word   .1
        EOM
...

:D


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 35 posts ]  Go to page Previous  1, 2, 3  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 4 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: