Yet anther assembler...
Largely for my own amusement, I am writing a 6502 assembler and I'm trying to incorporate as many 'optional' features as possible, so that it will build as large a code base as possible with a minimum of text changes.
Thus, it will accept (e.g.) db and byte as synonyms, and inputs to db/byte include hex, decimal, binary in any of the obvious flavours - 0xnnnn, 0nnnnh, $nnnn, 0bnnnn, 0nnnnb and so on, and also 'c' characters or "strings" (slight bug there regarding commas still to be resolved!).
The intent is truly old school; the initial version will accept only 6502N code, with 65C02 options next on the list, and it produces Intel hex, binary blob, and a list file as default outputs - absolute positions, no relocatable code (that'll get some downvotes!). This means it's easy to use for e.g. single board computers without an OS.
However, what I can't find - and I've read so many documents on this - is how a zero page address is defined. At the moment I have followed what seems to be the default, automatically choosing zero page if the evaluated expression is both less than 0x100 and the evaluation is trusted; otherwise, it's an absolute.
Any thoughts/preferences regarding this behaviour?
Neil
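As a rough illustration of the numeric literal flavours listed in the post above, here is a minimal Python sketch. It is not the assembler's actual code; the function name parse_number and the order in which the rules are tried are assumptions made for the example. The order matters because some spellings are ambiguous: 0b1h is treated here as hex (trailing h wins), while 0x0b is hex because the 0x prefix is checked before the b suffix.

```python
def parse_number(text):
    """Parse one numeric literal in any of the spellings from the post.

    Hypothetical helper: the rule order below is an assumption, chosen so
    that hex markers ($, 0x, trailing h) win over binary markers.
    """
    s = text.strip().lower()
    if s.startswith("$"):          # $nnnn
        return int(s[1:], 16)
    if s.startswith("0x"):         # 0xnnnn
        return int(s[2:], 16)
    if s.endswith("h"):            # 0nnnnh
        return int(s[:-1], 16)
    if s.startswith("0b"):         # 0bnnnn
        return int(s[2:], 2)
    if s.endswith("b"):            # 0nnnnb
        return int(s[:-1], 2)
    return int(s, 10)              # plain decimal
```

Malformed input raises ValueError from int(), which a real assembler would turn into a diagnostic with a line number.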
Re: Yet anther assembler...
It's normal, but not ideal, to detect ZP as you say. Some assemblers have some syntax to help - but of course every assembler is different. It looks like ca65 has some idea of the 'size' of a value, and a byte-sized value is an appropriate ZP address. Other heuristics too:
https://cc65.github.io/doc/ca65.html#ss5.2
Re: Yet anther assembler...
barnacle wrote:
However, what I can't find - and I've read so many documents on this - is how a zero page address is defined. At the moment I have followed what seems to be the default, automatically choosing zero page if the evaluated expression is both less than 0x100 and the evaluation is trusted; otherwise, it's an absolute.
Any thoughts/preferences regarding this behaviour?
The assembler for my 65020 does things the complicated way - it will do as many passes as it needs for the symbol values to become stable. It starts out assuming one byte for every symbol. If that assumption is incorrect, it will expand to two bytes and trigger another pass. It will never shrink a two byte value to one byte, so it's guaranteed to converge eventually. This allows forward branches to use one byte offsets most of the time, and two bytes when necessary. It was a lot easier to implement than I feared it might be.
Re: Yet anther assembler...
This is the approach I've taken. My expression evaluation is looking at both literal values and symbol values, and the symbol value, if not yet defined, returns an 'untrusted' flag as part of its return value. That flag is propagated all the way up the expression chain, so when I eventually get the value for the target address I know whether I can trust it. (I can also set a symbol as untrusted, if it uses a forward reference).
Only if I can trust it do I decide whether it fits in a zero page address; otherwise it's a three-byte absolute instruction. That means no more than two passes, at the possible risk of slightly bigger code than might be required.
As an aside: HIGH vs HI vs > vs LOW vs LO vs < ?
And logic: not yet implemented by the expression parser, but I think &, ^, and | with the first two having the same precedence as * and / and the latter the same precedence as + and - ?
Neil
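The "untrusted" flag described in the post above could be sketched roughly as follows. This is Python rather than the C the real assembler is written in, and every name here (Value, lookup, add, operand_size) is hypothetical; only the idea comes from the post: an undefined symbol yields an untrusted value, the flag propagates through the expression, and zero page is chosen only for a trusted result below 0x100.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """An expression result plus a flag saying whether it can be relied on."""
    number: int
    trusted: bool = True


def lookup(symbols, name):
    """Return a symbol's value; a not-yet-defined symbol is an untrusted 0."""
    if name in symbols:
        return symbols[name]
    return Value(0, trusted=False)


def add(a, b):
    """Combine two operands; the result is trusted only if both inputs are."""
    return Value(a.number + b.number, a.trusted and b.trusted)


def operand_size(value):
    """1 byte (zero page) for a trusted value below 0x100, else 2 (absolute)."""
    return 1 if value.trusted and 0 <= value.number < 0x100 else 2
```

With symbols = {"ptr": Value(0x20)}, the expression ptr+1 sizes as zero page, while later+1 (where later is a forward reference) falls back to absolute even if later eventually turns out to be below 0x100. Other operators would combine the flag the same way add does.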
Re: Yet anther assembler...
John West wrote:
There is one trap to watch out for though - if you have a two pass assembler, and a symbol that's defined after it is used, the first pass will have to assume that it's 16 bit. If it later turns out to fit in 8 bits, you can't change your mind without invalidating every label after that first use...It will never shrink a two byte value to one byte, so it's guaranteed to converge eventually.
Quote:
As an aside: HIGH vs HI vs > vs LOW vs LO vs < ?
What language are you using for the assembler? In Python it was really easy to set up functions like the following and let users add their own functions, which is about as powerful as an assembler could ever get (assuming of course you're not averse to having to include a custom Python file in your project source.)
Code: Select all
def left(arg1,arg2): return arg1[0:int(arg2)]
def right(arg1,arg2): return arg1[-int(arg2):]
def hi(arg1): return (int(arg1)>>8)&0xFF
def lo(arg1): return int(arg1)&0xFF
def concat(arg1,arg2): return arg1+arg2
def substr(arg1,arg2,arg3): return arg1[int(arg2):int(arg3)]
def lower(arg1): return arg1.lower()
def upper(arg1): return arg1.upper()
def to_int(arg1): return int(float(arg1))
#text name of function, number of arguments, function
commandlist=[('left',2,left),
('right',2,right),
('hi',1,hi),
('lo',1,lo),
('concat',2,concat),
('substr',3,substr),
#('lower',1,lower),
#Alternately, define the function inline
('lower',1,lambda arg1: arg1.lower()),
('upper',1,upper),
#Built in functions
('int',1,to_int),
('float',1,float),
#These change type and need to be handled in the main program
('alpha',1,0),
('str',1,0),
('char',1,0),
#ADD CUSTOM FUNCTIONS HERE:
#example(x,y) = x+2*y
('example',2,lambda x,y:int(x)+int(y)*2)]
Re: Yet anther assembler...
Druzyek wrote:
Do you think it happens often in practice that an 8 bit symbol becomes 16 bit but could be shrunk back to 8 bit after other symbols are resolved?
Re: Yet anther assembler...
Hi!
John West wrote:
Druzyek wrote:
Do you think it happens often in practice that an 8 bit symbol becomes 16 bit but could be shrunk back to 8 bit after other symbols are resolved?
In practice I would expect it to never happen. In theory, I'm not sure if it's even possible. I remember having a discussion about it with friends back in the day, but don't remember the conclusion. Most of my reason for doing it this way is so I don't have to think about it
Assuming address rolls:
Code: Select all
    org $FFFD
    lda X
X:
    brk
Or, if your assembler uses more than 16 bits of addresses, try:
Code: Select all
    org $FFFD
    lda X & $FFFF
X:
    brk
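To see why this example is awkward, here is a small Python simulation of sizing the single `lda X` at $FFFD, assuming the address wraps at 64K. The function name assemble and its parameters are invented for the illustration; it models only the operand-width decision, not a real assembler. With free resizing the width flips between one and two bytes forever, while the grow-only rule John West described settles on the absolute form.

```python
def assemble(grow_only, max_passes=10):
    """Repeatedly size `lda X` at $FFFD, where label X directly follows it.

    Returns (converged, history); history holds (operand_size, X) per pass.
    Hypothetical model: 1 operand byte = zero page, 2 = absolute.
    """
    origin = 0xFFFD
    operand_size = 1                      # optimistic start: zero page
    history = []
    for _ in range(max_passes):
        # Opcode byte plus operand bytes, with the address rolling over.
        x = (origin + 1 + operand_size) & 0xFFFF
        needed = 1 if x < 0x100 else 2
        history.append((operand_size, x))
        if needed == operand_size:
            return True, history          # sizes agree: stable
        if grow_only and needed < operand_size:
            return True, history          # never shrink: keep the wide form
        operand_size = needed
    return False, history                 # still changing after max_passes
```

Calling assemble(False) never converges (X alternates between $FFFF and $0000), whereas assemble(True) stops after two passes with a two-byte operand and X at $0000.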
Re: Yet anther assembler...
Druzyek wrote:
Quote:
As an aside: HIGH vs HI vs > vs LOW vs LO vs < ?
What language are you using for the assembler? In Python it was really easy to set up functions like the following and let users add their own functions, which is about as powerful as an assembler could ever get (assuming of course you're not averse to having to include a custom Python file in your project source.)
So old school it's written in C. I can just about find my way through a Python script, but not well enough to write an assembler in it.
Neil
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: Yet anther assembler...
I have a request list for anyone who writes an assembler, at http://wilsonminesco.com/AssyDefense/as ... uests.html .
About defining ZP addresses & addressing modes: This topic is relevant: "Assembler that automatically select what to put in ZP"
Rather than stop at two passes, make it keep going until there are no more phase errors. I had a valid situation 20 years ago that required about 30 passes. The amount of time the assembler takes to do the job is not a problem with modern PCs' speed. I don't remember the situation, but it was not 8- versus 16-bit addresses, but rather that there were many forward chained references that depended on each other. Variables should normally be declared before they're encountered anyway, meaning it should be known the first time whether they're in ZP or not.
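A "keep going until nothing moves" loop might look something like the Python sketch below. It is a toy under stated assumptions: the program is just a list of label definitions and symbol references, a reference takes 2 bytes when its symbol is already known to be below 0x100 and 3 bytes otherwise, and the names assemble_until_stable, program, origin and max_passes are all made up for the example. Operand width is used as the source of phase errors only because it is easy to model; the 30-pass case described above involved chained forward references instead.

```python
def assemble_until_stable(program, origin=0, max_passes=50):
    """Re-run the sizing pass until no label changes value (no phase errors).

    program is a list of ("label", name) or ("ref", name) items.
    Returns (symbols, number_of_passes_used).
    """
    symbols = {}
    for pass_no in range(1, max_passes + 1):
        pc = origin
        new_symbols = {}
        for kind, name in program:
            if kind == "label":
                new_symbols[name] = pc
            else:
                # Use the value from the previous pass; unknown means absolute.
                value = symbols.get(name)
                pc += 2 if value is not None and value < 0x100 else 3
        if new_symbols == symbols:
            return symbols, pass_no       # every label kept its value
        symbols = new_symbols
    raise RuntimeError("phase errors did not settle")


program = [("ref", "a"), ("ref", "b"), ("label", "a"),
           ("ref", "b"), ("label", "b")]
symbols, passes = assemble_until_stable(program, origin=0x00F8)
```

In this run the first pass places a at $FE and b at $101, the second pass shrinks the reference to a and moves both labels, and the third pass confirms that nothing changed. Because this version allows shrinking, the max_passes limit is what guards against an input that never settles.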
Quote:
Macros? I wasn't, to be honest, even thinking about including macros. Like I said, old school...
I don't know what you're thinking 'old school' is, but I was introduced to macros in about 1987 by a neighbor who was into digital more than I was, and who had been using them for years. I quickly became a macro junkie.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: Yet anther assembler...
One of those YMMV things, I guess - I've used macros in C, but as far as I can recall *never* in assembler, for any processor. I can see the appeal of a straightforward define/replace text option but I do like to see on the list file exactly what I'm getting. Then again, I'm not intending to create relocatable code.
Old school for me is mid seventies, loading assembler and source code from cassette tape.
Neil
p.s. your 'if you write an assembler' page is open in another window. Those points with which I agree I am implementing
Re: Yet anther assembler...
The first macro assembler I used was PMA - Prime Macro Assembler in 1980. I don't recall what macro assembler I used on the Apple II, but on the BBC Micro with its 2-pass assembler built into BASIC you just called a BASIC function/procedure to implement a macro...
Macros, even at the simplest level, are very useful, especially if you're in-lining repetitive code. So for example in a VM I'm playing with, many of the instructions I'm interpreting require a copy to take place, so I have a macro:
Code: Select all
.macro pushAB
    lda regA+0
    sta regB+0
    lda regA+2
    sta regB+2
.endmacro
This simply copies a value from "regA" to "regB" which could be done via subroutine, but I care more for speed than code density.
Another example - more for the 65816 is switching modes:
Code: Select all
; n816: e6502
; Enter Native 65816, or emulated 6502 modes.
;********************************************************************************
.macro n816
    clc
    xce
.endmacro
.macro e6502
    sec
    xce
.endmacro
Parametrised macros can be very powerful indeed.
block move (negative) in the 65816:
Code: Select all
; bmn
; Block move macro
.macro bmn len,from,to
    lda #len-1
    ldx #(from & $FFFF)
    ldy #(to & $FFFF)
    mvn (from & $FF0000),(to & $FF0000)
.endmacro
and so on. I don't think I'd be without a macro assembler these days.
Of course writing the assembler is best left to those who know
Especially when it comes to temporary or local labels that you might need at times...
Cheers,
-Gordon
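For anyone wondering how much work a basic parametrised macro facility is, the core substitution step can be sketched in a few lines of Python. This is an assumption-laden illustration, not how ca65 or any other assembler mentioned here does it: expand_macro and its arguments are invented names, and it performs plain whole-word text replacement with no local labels, nesting or conditionals.

```python
import re


def expand_macro(params, body, args):
    """Return the macro body with each formal parameter replaced by its argument.

    params: formal names, body: list of source lines, args: actual arguments.
    Only whole-word matches are replaced, so `to` does not touch `sta`.
    """
    if len(args) != len(params):
        raise ValueError("macro expects %d arguments, got %d"
                         % (len(params), len(args)))
    if not params:
        return list(body)                  # nothing to substitute
    mapping = dict(zip(params, args))
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return [pattern.sub(lambda m: mapping[m.group(1)], line) for line in body]


bmn_body = [
    "lda #len-1",
    "ldx #(from & $FFFF)",
    "ldy #(to & $FFFF)",
    "mvn (from & $FF0000),(to & $FF0000)",
]
expanded = expand_macro(["len", "from", "to"], bmn_body,
                        ["256", "$012000", "$7E0000"])
```

Feeding the expanded lines back through the normal assembly path is one way to have the list file show the substituted source, which was the concern raised earlier in the thread.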
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/
Re: Yet anther assembler...
barnacle wrote:
but I do like to see on the list file exactly what I'm getting.
Quote:
Old school for me is mid seventies, loading assembler and source code from cassette tape.
Quote:
your 'if you write an assembler' page is open in another window. Those points with which I agree I am implementing 
Re: Yet anther assembler...
barnacle wrote:
... but I do like to see on the list file exactly what I'm getting.
GARTHWILSON wrote:
Every macro assembler I've seen shows exactly what you're getting from the macros unless you tell it not to.
- xa (xa65) does not do list files at all
- crasm prints only the assembled bytes of an expanded macro in hex (no inlined source or disassembly)
- acme does the same but only shows the first couple of bytes followed by an ellipsis
- ca65 (cc65) only has listing output for its relocatables, so you only see placeholder addresses ("rr rr")
Re: Yet anther assembler...
hmn wrote:
barnacle wrote:
... but I do like to see on the list file exactly what I'm getting.
GARTHWILSON wrote:
Every macro assembler I've seen shows exactly what you're getting from the macros unless you tell it not to.
- xa (xa65) does not do list files at all
- crasm prints only the assembled bytes of an expanded macro in hex (no inlined source or disassembly)
- acme does the same but only shows the first couple of bytes followed by an ellipsis
- ca65 (cc65) only has listing output for its relocatables, so you only see placeholder addresses ("rr rr")
I've used three commercial ones, and they all show, in the .lst file, exactly what the macro expansion produces:
- 2500AD
- Cross-32 (C32) originally from Universal Cross Assemblers
- MPASM from Microchip (for PIC microcontrollers, not 65xx)
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: Yet anther assembler...
GARTHWILSON wrote:
I've used three commercial ones, and they all show, in the .lst file, exactly what the macro expansion produces:
- 2500AD
- Cross-32 (C32) originally from Universal Cross Assemblers
- MPASM from Microchip (for PIC microcontrollers, not 65xx)
The assembler in the Kowalski simulator also shows the results of macro expansion in the listing file. In fact, all of the 6502-family assemblers I've used do that. In my opinion, a symbolic assembler that doesn't give you the option of displaying all the gory details in the listing is not a real assembler.
Last edited by BigDumbDinosaur on Wed Aug 21, 2024 2:14 am, edited 1 time in total.
x86? We ain't got no x86. We don't NEED no stinking x86!