Wanted: An 65816 Assembler
Posted: Wed Dec 29, 2010 3:09 pm
by ElEctric_EyE
I'm looking to get into programming the 65816 and am looking for an easy to use program similar to M.Kowalski's 6502 assembler, that can at least create a .bin file from the assembly code, under a Windows environment...
I've searched the threads here @6502.org and found:
HXA by teamtempest and
ACME by Marco Baye
They don't appear to have a GUI like MK's assembler though...
Thanks for your help!
Posted: Wed Dec 29, 2010 5:14 pm
by BigEd
This doesn't help you if you're absolutely set on having some kind of GUI, but these command-line assemblers exist:
- cc65 is available for Windows
- xa (xa65) can easily be built for Windows (using cygwin or, less easily(*), using mingw)
- wla-65816 is available for Windows
That last one is the one used in these tutorials for SNES development, notable because it also gives a pointer to a debugging version of an emulator, which most likely does have a GUI.
I'm guessing that you'd only need the GUI for testing/emulation - or is there some crucial feature in a GUI for doing development? (Perhaps colourising the code? We've discussed that before and not everyone finds it useful.) If you want a GUI which understands 65816, it's most likely to be SNES-related, I would think.
Cheers
Ed
(*) I used the gui installer for mingw, had to adjust my CMD path with
Code: Select all
set path=%path%;c:\mingw\bin
set path=%path%;c:\mingw\msys\1.0\bin
and then add a missing library with
Posted: Wed Dec 29, 2010 6:21 pm
by BigDumbDinosaur
Other than the package sold by WDC for a lot of money, I don't have any suggestions for a windowing assembler. What I did was write macros for the Kowalski simulator that synthesize some of the '816 functions. Code for it follows.
Code: Select all
;================================================================================
;
;W65C816S INSTRUCTION MACROS
;
; —————————————————————————————————————————————————————————
; These macros implement 65C02 & many W65C816S native mode
; instructions that are not recognized by the Kowalski ass-
; embler (e.g., STP & WAI).
; —————————————————————————————————————————————————————————
;
.if !.ref(brl)
;
brl .macro .ad ;long relative branch...
.if .ad ;won't work for forward...
.isize =3 ;branches because of forward...
.os =*+.isize ;address reference
.os =.ad-.os
.if .os > 32767
.error %1 + ": FORWARD BRANCH OUT OF RANGE"
.endif
.if .os < -32768
.error %1 + ": BACKWARD BRANCH OUT OF RANGE"
.endif
.byte $82
.word .os
.else
.error "INSTRUCTION REQUIRES TARGET ADDRESS"
.endif
.endm
;
cop .macro .op ;co-processor
.if .op > $ff
.error "SIGNATURE MUST BE $00 - $FF"
.endif
.byte $02,.op
.endm
;
jml .macro .ad ;JMP $bbahal (long JMP)
.byte $5c,<.ad, >.ad,.ad >> 16
.endm
;
jsl .macro .ad ;JSL $bbahal (long JSR)
.byte $22,<.ad, >.ad,.ad >> 16
.endm
;
jsx .macro .ad ;JSR (<addr>,X)
.byte $fc,<.ad, >.ad
.endm
;
mvn .macro .s,.d ;move next <sbnk>,<dbnk>
.if .s > $ff
.error "SOURCE BANK MUST BE $00 - $FF"
.endif
.if .d > $ff
.error "DESTINATION BANK MUST BE $00 - $FF"
.endif
.byte $54,.d,.s
.endm
;
mvp .macro .s,.d ;move prev <sbnk>,<dbnk>
.if .s > $ff
.error "SOURCE BANK MUST BE $00 - $FF"
.endif
.if .d > $ff
.error "DESTINATION BANK MUST BE $00 - $FF"
.endif
.byte $44,.d,.s
.endm
;
pea .macro .ad ;pea <addr>
.byte $f4,<.ad, >.ad
.endm
;
pei .macro .ad ;pei (<addr>)
.if .ad > $ff
.error "INDIRECT ADDRESS MUST BE $00 - $FF"
.endif
.byte $d4,.ad
.endm
;
phb .macro ;push data bank register
.byte $8b
.endm
;
phd .macro ;push direct page register
.byte $0b
.endm
;
phk .macro ;push program bank register
.byte $4b
.endm
;
plb .macro ;pull data bank register
.byte $ab
.endm
;
pld .macro ;pull direct page register
.byte $2b
.endm
;
rep .macro .op ;clear status register bits
.if .op > $ff
.error "OPERAND MUST BE $00 - $FF"
.endif
.byte $c2,.op
.endm
;
rtl .macro ;return long from subroutine
.byte $6b
.endm
;
sep .macro .op ;set status register bits
.if .op > $ff
.error "OPERAND MUST BE $00 - $FF"
.endif
.byte $e2,.op
.endm
;
stp .macro ;halt MPU
.byte $db
.endm
;
tcd .macro ;transfer .C to direct page register
.byte $5b
.endm
;
tcs .macro ;transfer .C to stack pointer
.byte $1b
.endm
;
tdc .macro ;transfer direct page register to .C
.byte $7b
.endm
;
tsc .macro ;transfer stack pointer to .C
.byte $3b
.endm
;
txy .macro ;transfer .X to .Y
.byte $9b
.endm
;
tyx .macro ;transfer .Y to .X
.byte $bb
.endm
;
wai .macro ;wait for interrupt
.byte $cb
.endm
;
xba .macro ;swap B & A accumulators
.byte $eb
.endm
;
xce .macro ;swap carry & emulation bits
.byte $fb
.endm
;
;
; synthesized stack-based accumulator instructions...
;
; —————————————————————————————————————————————————————————————————————
; Stack-based accumulator instructions take the form ***S or ***SI, the
; latter for indexed indirect operations. *** represents the parent
; instruction. For example, LDAS 1 is equivalent to LDA 1,S & LDASI 1
; is the equivalent of LDA (1,S),Y. The actual macro names are lower
; case. The macro comment indicates the official WDC assembly language
; syntax for the instruction being synthesized.
; —————————————————————————————————————————————————————————————————————
;
adcs .macro .of ;ADC <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $63,.of
.endm
;
adcsi .macro .of ;ADC (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $73,.of
.endm
;
ands .macro .of ;AND <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $23,.of
.endm
;
andsi .macro .of ;AND (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $33,.of
.endm
;
cmps .macro .of ;CMP <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $c3,.of
.endm
;
cmpsi .macro .of ;CMP (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $d3,.of
.endm
;
eors .macro .of ;EOR <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $43,.of
.endm
;
eorsi .macro .of ;EOR (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $53,.of
.endm
;
ldas .macro .of ;LDA <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $a3,.of
.endm
;
ldasi .macro .of ;LDA (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $b3,.of
.endm
;
oras .macro .of ;ORA <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $03,.of
.endm
;
orasi .macro .of ;ORA (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $13,.of
.endm
;
sbcs .macro .of ;SBC <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $e3,.of
.endm
;
sbcsi .macro .of ;SBC (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $f3,.of
.endm
;
stas .macro .of ;STA <offset>,S
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $83,.of
.endm
;
stasi .macro .of ;STA (<offset>,S),Y
.if .of > $ff
.error "OFFSET MUST BE $00 - $FF"
.endif
.byte $93,.of
.endm
;
;
; 16 bit immediate mode instructions...
;
; ————————————————————————————————————————————————————————————————————
; Immediate mode instructions that are able to accept 16 bit operands
; take the form ***W, where *** is the parent 8 bit instruction. For
; example, ADCW is the 16 bit form of ADC. The actual macro names are
; lower case. It is the responsibility of the programmer to assure
; that MPU register sizes have been correctly configured before using
; ***W instructions. For example:
;
; LONGA ;16 bit .A & memory
; LDAW $1234 ;assembles as LDA #$1234
; SHORTA ;8 bit .A & memory
; LDAW $1234 ;won't work as expected!!!
;
; The macro comment indicates the official WDC assembly language syn-
; tax for the instruction being synthesized.
; ————————————————————————————————————————————————————————————————————
;
adcw .macro .op ;ADC #nnnn
adc #<.op
.byte >.op
.endm
;
andw .macro .op ;AND #nnnn
and #<.op
.byte >.op
.endm
;
bitw .macro .op ;BIT #nnnn
bit #<.op
.byte >.op
.endm
;
cmpw .macro .op ;CMP #nnnn
cmp #<.op
.byte >.op
.endm
;
cpxw .macro .op ;CPX #nnnn
cpx #<.op
.byte >.op
.endm
;
cpyw .macro .op ;CPY #nnnn
cpy #<.op
.byte >.op
.endm
;
eorw .macro .op ;EOR #nnnn
eor #<.op
.byte >.op
.endm
;
ldaw .macro .op ;LDA #nnnn
lda #<.op
.byte >.op
.endm
;
ldxw .macro .op ;LDX #nnnn
ldx #<.op
.byte >.op
.endm
;
ldyw .macro .op ;LDY #nnnn
ldy #<.op
.byte >.op
.endm
;
oraw .macro .op ;ORA #nnnn
ora #<.op
.byte >.op
.endm
;
sbcw .macro .op ;SBC #nnnn
sbc #<.op
.byte >.op
.endm
;
;
; register size macros...
;
; ————————————————————————————————————————————————————————————————————
; These macros are a convenient way to change the MPU's register sizes
; without having to remember the correct bit pattern for the REP & SEP
; instructions. The assembler itself has no awareness of whether 8 or
; 16 bit immediate mode operands are to be used. Therefore, it is up
; to the programmer to use the appropriate instructions.
; ————————————————————————————————————————————————————————————————————
;
longa .macro ;16 bit accumulator & memory
rep $20
.endm
;
longr .macro ;16 bit all registers
rep $30
.endm
;
longx .macro ;16 bit index registers
rep $10
.endm
;
shorta .macro ;8 bit accumulator & memory
sep $20
.endm
;
shortr .macro ;8 bit all registers
sep $30
.endm
;
shortx .macro ;8 bit index registers
sep $10
.endm
;
;
; INT pseudoinstruction — assembles as BRK followed by operand...
;
int .macro .op ;INT <intnum>
.if .op > $ff
.error "INTERRUPT NUMBER MUST BE $00 - $FF"
.endif
brk
.byte .op
.endm
;
.endif
.end
BRL forward branches won't work right due to unresolvable forward references. I may some day figure out a workaround.
The JML (JuMP Long) and JSL (Jump to Subroutine Long) instructions process the operand as a 24 bit address, even if it is only 8 or 16 bits. For example, JML $123456 will jump to address $3456 in bank $12. Similarly, JSL $1234 will call a subroutine in bank 0 (which is implied) at address $1234.
Understand that the Kowalski assembler internally resolves all values to 32 bits and provides a number of logical operators that can extract the most significant 16 bits. This is how I was able to synthesize the JML and JSL instructions. The assembler will only display the lower 16 bits in the listing output.
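As a rough illustration of the byte extraction described above, here is a minimal Python sketch (not Kowalski assembler code) of how a 24 bit operand splits into the low, high and bank bytes that follow the opcode. The function name `long_operand` is made up for this example; $22 is the JSL opcode taken from the macro list.

```python
def long_operand(opcode: int, addr: int) -> bytes:
    """Return the opcode followed by a 24 bit address, low byte first."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    lo = addr & 0xFF             # corresponds to <.ad in the macros
    hi = (addr >> 8) & 0xFF      # corresponds to >.ad
    bank = (addr >> 16) & 0xFF   # corresponds to .ad >> 16
    return bytes([opcode, lo, hi, bank])


JSL = 0x22  # opcode taken from the jsl macro

# A 16 bit operand still produces three operand bytes; bank 0 is implied.
print(long_operand(JSL, 0x1234).hex())    # 22341200
print(long_operand(JSL, 0x123456).hex())  # 22563412
```

The point of the sketch is only that the operand is always emitted as three bytes, which is why a short address ends up in bank 0.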
If you call a subroutine using JSL be sure to use RTL at the end of the subroutine to return. Otherwise, be prepared to press the reset button.
EDIT: I updated this macro list with some new ones added since I originally posted this.
EDIT: Macro list revised on 2012/07/01. In the near future this will be posted on my POC website.
Posted: Wed Dec 29, 2010 8:49 pm
by ElEctric_EyE
BigEd wrote: "This doesn't help you if you're absolutely set on having some kind of GUI..."
MK's assembler came so naturally to me, like I was using Micromon-64 all over again. Then I could save my code to a .bin file and burn an EEPROM to test my stand-alone system...
I am unfamiliar with the command line, but am very willing to learn if there is a comparable process for making a .bin file. I am envisioning making a .txt file, maybe renaming it, then having one of the aforementioned programs compile it to a file. Am I close with my guess? Like I said, I'm definitely all ears at this point.
BigDumbDinosaur wrote: "Other than the package sold by WDC for a lot of money, I don't have any suggestions for a windowing assembler. What I did was write macros for the Kowalski simulator that synthesize some of the '816 functions. Code for it follows..."
I saw WDC's SDK license. About $400? Not TOO TOO bad...
But your macro approach to "synthesize" 65816 code using a 6502 assembler is pretty neat. I imagine there is a speed sacrifice though?
Thanks for sharing the info guys!
Posted: Wed Dec 29, 2010 8:56 pm
by BigEd
With xa I think it's just one line to assemble your source file (with whatever extension you find convenient, but *.txt is good) into a binary, and with the cc65 assembler you need two lines.
Put that one or two lines into a *.bat file and it's about as simple as it gets.
Cheers
Ed
Posted: Wed Dec 29, 2010 9:08 pm
by fachat
BigEd wrote: "With xa I think it's just one line to assemble your source file (with whatever extension you find convenient, but *.txt is good) into a binary, and with the cc65 assembler you need two lines. Put that one or two lines into a *.bat file and it's about as simple as it gets."
like
where "XYZ.a65" is your text file and "XYZ.bin" is your target bin file. In the source you'd just do the usual
to set the address of the generated code for ROM binary generation.
There are many more options (like with all other assemblers), but this is basically it to generate code for a ROM.
André
Posted: Thu Dec 30, 2010 2:52 am
by BigDumbDinosaur
ElEctric_EyE wrote: "But your macro approach to "synthesize" 65816 code using a 6502 assembler is pretty neat. I imagine there is a speed sacrifice though?"
It doesn't seem to have too much effect. Assembly time for the POC V1 ROM, which has about 7,000 lines of source code, is 2-3 seconds, and that's running on a relatively old machine with an AMD XP2000 processor from seven years ago.

Posted: Thu Dec 30, 2010 9:37 am
by BitWise
Or there is my 65xx family macro assembler (6502, 65c02, 65816, 65832) which is written in Java so it runs on any platform with a JRE.
http://www.obelisk.demon.co.uk/dev65/ for info
http://www.obelisk.demon.co.uk/6502/6502.zip for the code
Posted: Thu Dec 30, 2010 3:37 pm
by PontusO
For simple projects I tend to use AS, which can be downloaded from http://john.ccac.rwth-aachen.de:8000/as . The best thing about it is that it supports a great many different processors and can generate debug output for the NoICE debugger.
Posted: Fri Dec 31, 2010 1:25 am
by ElEctric_EyE
Thank you Bitwise & PontusO!
Will have time to check them out in a few days, but I'm starting to think I will go the way of macros and stick with M. Kowalski's...
Only because I am used to it: it has the editor, assembler, & debugger and creates the .bin files I need for an EEPROM programmer.
I was not aware that was how macros were used...
Posted: Fri Dec 31, 2010 2:20 am
by ElEctric_EyE
BigEd wrote: "...I'm guessing that you'd only need the GUI for testing/emulation - or is there some crucial feature in a GUI for doing development?..."
I like the fact, now comparing it to other assemblers, that it has an editor built in, so I don't have to create a .txt file with one program and run it through a compiler with another program.
Kowalski's is an all-in-one solution. Much quicker, especially for a poor programmer, like myself, where I have to maybe just change one variable in the entire code in order to "burn" the EEPROM, place it in system and test.
Posted: Fri Dec 31, 2010 3:09 am
by teamtempest
Code: Select all
;
; enhanced & native mode instructions...
;
brl .macro .op ;long relative branch
.if .op
.os =*+3
.os =.op-.os
.if .os > 32767
.error %1 + ": FORWARD BRANCH OUT OF RANGE"
.endif
.if .os < -32768
.error %1 + ": BACKWARD BRANCH OUT OF RANGE"
.endif
.byte $82
.word .os
.else
.error "INSTRUCTION REQUIRES TARGET ADDRESS"
.endif
.endm
BigDumbDinosaur wrote: "BRL forward branches don't work right, although I don't see anything obviously wrong with the macro."
I'm not familiar with the Kowalski simulator or exactly how this macro fails. However, I suspect it has to do with forward referencing '.op'. Since the value of the label is not known when the macro is assembled, the expression '.os = .op - .os' cannot be completely evaluated. Because of that, any following use of '.os' cannot be evaluated either.
I made this an error in HXA - the expression following an '=' (or 'equ') has to resolve to a constant value when it is first encountered. One reason for doing so is illustrated by your macro. The amount of code generated by a macro has to be known so that the program counter can be updated properly.
Suppose this wasn't the case, and that macro expansion could be postponed until all the values used to control its expansion were known. But the value of a forward referenced label may be different depending on how much code a macro generates (ten bytes vs. thirty bytes, say). What would be the correct value to assign to the label when it is finally encountered?
Anyway, still without knowing exactly the error you're experiencing, it may be that if you give up range checking you can assemble this macro:
Code: Select all
brl .macro .op
.if .op
.os = *+3
.byte $82
.word .op - .os
.else
.error "INSTRUCTION REQUIRES TARGET ADDRESS"
.endif
.endm
This should work because although the '.op' in the expression following '.word' is not known, the final size of the code it generates is, so there's no difficulty in postponing its complete evaluation.
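To make the "known size, postponed value" idea concrete, here is a minimal Python sketch of how an assembler might handle it internally. It is not how Kowalski's assembler or HXA is actually implemented; the class `TinyAsm` and its methods are invented for this illustration. The branch always occupies three bytes, so the program counter stays correct, and the offset is patched in (and range checked) once the label is known.

```python
import struct


class TinyAsm:
    """Toy emitter: bytes go into a buffer, unresolved offsets are patched later."""

    def __init__(self, origin: int):
        self.origin = origin
        self.code = bytearray()
        self.labels = {}   # name -> address
        self.fixups = []   # (buffer index of offset word, address after branch, label)

    @property
    def pc(self) -> int:
        return self.origin + len(self.code)

    def label(self, name: str) -> None:
        self.labels[name] = self.pc

    def nop(self, count: int = 1) -> None:
        self.code += b"\xea" * count

    def brl(self, target: str) -> None:
        # The size is always 3 bytes, even though the offset is not known yet.
        self.code.append(0x82)
        self.fixups.append((len(self.code), self.pc + 2, target))
        self.code += b"\x00\x00"   # placeholder for the 16 bit offset

    def finish(self) -> bytes:
        for index, next_pc, target in self.fixups:
            offset = self.labels[target] - next_pc
            if not -32768 <= offset <= 32767:
                raise ValueError(f"{target}: branch out of range")
            struct.pack_into("<h", self.code, index, offset)
        return bytes(self.code)


asm = TinyAsm(0x2000)
asm.brl("done")     # forward reference, resolved in finish()
asm.nop(5)
asm.label("done")
print(asm.finish().hex())   # 820500eaeaeaeaea
```

Because the check happens at patch time, this approach does not have to give up range checking, which is the trade-off the macro version makes.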
Incidentally, even if HXA didn't support the 65816, the macro could still be written this way:
Code: Select all
brl .macro ?op=@
.if "?op" != "@"
.byte $82
.rbit16 ?op
.else
.error "INSTRUCTION REQUIRES TARGET ADDRESS"
.endif
.endm
because the "rbit--" pseudo ops calculate a signed offset relative to the program counter and also range check them (in the case of ".rbit16", to a signed 16-bit value, -32768 to 32767). Just saying.
Posted: Fri Dec 31, 2010 12:30 pm
by fachat
teamtempest wrote: "Suppose this wasn't the case, and that macro expansion could be postponed until all the values used to control its expansion were known. But the value of a forward referenced label may be different depending on how much code a macro generates (ten bytes vs. thirty bytes, say). What would be the correct value to assign to the label when it is finally encountered?"
When my xa assembler encounters (unresolved) forward references, it takes the maximum amount it could possibly have. If you try to
it would not generate a zeropage addressing mode but an absolute addressing mode for the STA - to fit any possible value into it.
But indeed, for pseudo opcodes like .dsb which take a length value as a parameter, unresolved references are not allowed, as the maximum length is not known.
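The "assume the largest size" rule can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not xa's actual code; `encode_sta` and the symbol table layout are assumptions made for the example. The opcodes are the standard 6502 ones for STA zero page ($85) and STA absolute ($8D).

```python
STA_ZP, STA_ABS = 0x85, 0x8D   # 6502 opcodes: STA zero page / STA absolute


def encode_sta(symbols: dict, name: str) -> bytes:
    """Encode STA <name>, picking the addressing mode from what is known so far."""
    value = symbols.get(name)
    if value is None:
        # Forward reference: reserve the larger form so any value will fit.
        # The operand bytes would be filled in once the symbol is defined.
        return bytes([STA_ABS, 0x00, 0x00])
    if value <= 0xFF:
        return bytes([STA_ZP, value])
    return bytes([STA_ABS, value & 0xFF, value >> 8])


known = {"ptr": 0x10}
print(len(encode_sta(known, "ptr")))     # 2
print(len(encode_sta(known, "later")))   # 3
```

The cost of this rule is one wasted byte whenever the forward-referenced symbol later turns out to be in zero page.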
André
Damned Forward References!
Posted: Fri Dec 31, 2010 6:33 pm
by BigDumbDinosaur
Years ago, there was a BASIC compiler for Commodore eight bit computers called PetSpeed, whose output was pure M/L, not byte code like that used by later compilers. If a user tried to LIST a PetSpeed compiled program, all he'd see was:
PetSpeed was a triple pass, optimizing compiler that used passes one and two to resolve forward references to the smallest possible data size. The actual code generation occurred on the third pass. The resulting programs were amazingly compact, which given the small amount of free RAM in the early Commodore PETs, was a necessity. The one real penalty of using PetSpeed (aside from its cost) was the time required for compilation.
Unfortunately, I have never encountered a triple pass assembler able to resolve forward references in that fashion.
Posted: Fri Dec 31, 2010 6:55 pm
by GARTHWILSON
BigDumbDinosaur wrote: "Unfortunately, I have never encountered a triple pass assembler able to resolve forward references in that fashion."
I seem to remember using an assembler that would do as many passes as it took to get rid of all the phase errors.
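For what it's worth, the "keep making passes until nothing moves" approach can be sketched roughly like this in Python. It is a toy model, not any particular assembler: the program is a list of made-up `(kind, argument)` items, and only instruction sizes and label addresses are tracked. Each pass sizes instructions using the label values from the previous pass, and assembly stops when a pass produces the same label values again (no phase errors).

```python
def size_of(item, symbols):
    kind, arg = item
    if kind == "sta":
        # 2 bytes if the target is known to be in zero page, otherwise 3.
        return 2 if symbols.get(arg, 0xFFFF) <= 0xFF else 3
    if kind == "fill":
        return arg
    return 0  # labels occupy no space


def assemble_sizes(program, origin, max_passes=10):
    """Repeat passes until label addresses stop changing."""
    symbols = {}
    for pass_no in range(1, max_passes + 1):
        new_symbols = {}
        pc = origin
        for item in program:
            if item[0] == "label":
                new_symbols[item[1]] = pc
            pc += size_of(item, symbols)
        if new_symbols == symbols:
            return symbols, pass_no
        symbols = new_symbols
    raise RuntimeError("label addresses did not settle")


program = [
    ("sta", "buf"), ("sta", "far"), ("fill", 10),
    ("label", "buf"), ("fill", 0x200), ("label", "far"),
]
symbols, passes = assemble_sizes(program, 0x80)
print({k: hex(v) for k, v in symbols.items()}, passes)
# {'buf': '0x8f', 'far': '0x28f'} 3
```

In this example the first pass assumes the large form everywhere, the second shrinks the STA to `buf`, and the third confirms that nothing moved. The `max_passes` limit is there because shrinking one instruction can in principle push another symbol across a size boundary and back again.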