assembler mechanics question
Posted: Thu Sep 09, 2010 7:36 pm
by tano6502
Hello,
my question is about how 6510 assemblers resolve symbol
names and choose the right opcode when in doubt
between page zero and absolute addressing modes.
Look at this dummy example:
Code:
        org $0000
ep1:    lda label4
        rts
label1: resb 1
ep2:    lda label3
        rts
label2: resb 1
ep3:    lda label2
        rts
label3: resb 1
ep4:    lda label1
        rts
label4: resb 1
Because code and data locations are interleaved, to assemble the lda instructions
the assembler needs to know whether the labels are in page zero so it can
choose the best opcode, and to place the labels it needs
to know whether the instructions before them are 2 or 3 bytes long.
How can assemblers do this in two passes?
Re: assembler mechanics question
Posted: Thu Sep 09, 2010 8:25 pm
by fachat
How can assemblers do this in two passes?
It can't.
If a label is defined _after_ the opcode, the assembler can only assume - in the first pass - that it is an absolute (two-byte) address and use an appropriate opcode to calculate opcode length.
xa, for example (which I wrote :-), only uses zeropage addressing mode when a) the label is defined before the opcode, and b) the label actually is in zeropage.
You could, though, enforce zeropage by using the "<" prefix to use only the lower byte of the address. That does not give an error, though, when the label is (maybe accidentally) not in zeropage.
André
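A small illustration of why the "<" prefix gives no protection: it only keeps the low byte of the address. This is a Python sketch, not xa's code, and the function names are made up for the example.

```python
def low_byte(address: int) -> int:
    """Return the low 8 bits of an address, as a '<' prefix would."""
    return address & 0xFF


def force_zero_page_operand(address: int) -> tuple[int, bool]:
    """Return the one-byte operand and whether truncation changed the address."""
    operand = low_byte(address)
    return operand, operand != address


# A label at $0042 survives unchanged; a label at $1234 silently becomes $34.
for label_address in (0x0042, 0x1234):
    operand, truncated = force_zero_page_operand(label_address)
    note = "truncated, wrong location!" if truncated else "ok"
    print(f"${label_address:04X} -> ${operand:02X} ({note})")
```

An assembler that wanted to be strict could raise an error when the second value is true; per the post above, xa does not.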
Re: assembler mechanics question
Posted: Thu Sep 09, 2010 10:07 pm
by Dr Jefyll
How can assemblers do this in two passes?
Some assemblers will make more than two passes, although I don't know how common this is.
-- Jeff
Posted: Thu Sep 09, 2010 10:48 pm
by GARTHWILSON
Some assemblers will make more than two passes, although I don't know how common this is.
I don't remember which assembler I was using, but I had one unusual project where it took quite a few passes to get rid of all the phase errors. It didn't really matter, as it took another 15 seconds or some tolerable amount of time.
Re: assembler mechanics question
Posted: Fri Sep 10, 2010 7:24 am
by fachat
How can assemblers do this in two passes?
I don't know of assemblers that do that (but looking at the other comments it seems some do), but TeX (the typesetting program) does it like that:
It lays out a document, calculates the page numbers etc., inserts them, lays it out again, recalculates, lays it out again, recalculates, ... and so on until the layout (or the calculated data) does not change anymore.
IIRC sometimes it would even go into a loop, where one change would, say, increase the page number, and after the next layout a change caused by that would decrease the page number again, and it would start anew. (I'm using the page number only as an example; TeX computes tables of contents, reference tables, and a lot more, which may take much more space than a page number.)
An assembler could actually do the same. Assemble, calculate the address labels, re-assemble with better optimization, re-calculate address labels (that could have changed again), re-assemble ... and so on until the output does not change anymore.
But that is not 2-pass.
André
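The iterate-until-nothing-changes idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the program is the original poster's example encoded as tuples, each pass sizes `lda` using the label values from the previous pass, and `max_passes` is an arbitrary guard against the oscillation described above. None of these names come from a real assembler.

```python
LDA_ZP, LDA_ABS, RTS = 2, 3, 1  # instruction sizes in bytes

# The example from the first post, as (kind, argument) tuples.
PROGRAM = [
    ("label", "ep1"), ("lda", "label4"), ("rts",),
    ("label", "label1"), ("resb", 1),
    ("label", "ep2"), ("lda", "label3"), ("rts",),
    ("label", "label2"), ("resb", 1),
    ("label", "ep3"), ("lda", "label2"), ("rts",),
    ("label", "label3"), ("resb", 1),
    ("label", "ep4"), ("lda", "label1"), ("rts",),
    ("label", "label4"), ("resb", 1),
]


def layout(program, origin, known):
    """One pass: place labels, sizing each lda from the previous pass's values."""
    addresses = {}
    pc = origin
    for item in program:
        kind = item[0]
        if kind == "label":
            addresses[item[1]] = pc
        elif kind == "lda":
            target = known.get(item[1])
            # Unknown labels are assumed absolute, as in a plain first pass.
            in_zero_page = target is not None and target <= 0xFF
            pc += LDA_ZP if in_zero_page else LDA_ABS
        elif kind == "rts":
            pc += RTS
        elif kind == "resb":
            pc += item[1]
    return addresses


def assemble_until_stable(program, origin=0x0000, max_passes=10):
    """Repeat layout passes until the label addresses stop changing."""
    known = {}
    for passes in range(1, max_passes + 1):
        addresses = layout(program, origin, known)
        if addresses == known:
            return addresses, passes
        known = addresses
    raise RuntimeError("label addresses did not settle")


addresses, passes = assemble_until_stable(PROGRAM)
print(passes, {name: hex(value) for name, value in addresses.items()})
```

For this example the second pass shrinks every `lda` to zero page and a third pass is needed just to confirm that nothing moved, which is why this scheme is not 2-pass.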
Posted: Fri Sep 10, 2010 10:05 am
by BitWise
In my experience two passes works most of the time provided zero page memory data areas are defined at the start of the source.
WDC's assembler allows you to explicitly define whether an address will be zero page (e.g. LDA <ZPVAR) or absolute (e.g. LDA |ABSVAR) if you have a phasing error. Mine also allows this.
My assembler is normally two passes unless it detects the structured programming directives being used (e.g. IF/ELSE/ENDIF) then it switches to three. This is because it tries to generate the shortest code sequence for a conditional jump (e.g. BEQ addr if in range otherwise BNE *+5;JMP addr) but can't estimate the size of the code to be skipped until the second pass.
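The short-versus-long choice for a conditional jump can be sketched as follows. This is a Python illustration, not BitWise's implementation: the function name is hypothetical, and it assumes 16-bit addresses and the standard 6502 opcodes ($F0 for BEQ, $D0 for BNE, $4C for JMP absolute).

```python
def encode_jump_if_equal(pc: int, target: int) -> bytes:
    """Encode 'jump to target if Z is set' for an instruction located at pc."""
    # A relative branch is measured from the address after the 2-byte branch.
    offset = target - (pc + 2)
    if -128 <= offset <= 127:
        return bytes([0xF0, offset & 0xFF])          # BEQ target
    # Out of range: invert the test and hop over a 3-byte JMP.
    return bytes([0xD0, 0x03,                        # BNE *+5
                  0x4C, target & 0xFF, (target >> 8) & 0xFF])  # JMP target


print(encode_jump_if_equal(0x1000, 0x1010).hex())  # short form, 2 bytes
print(encode_jump_if_equal(0x1000, 0x1200).hex())  # long form, 5 bytes
```

The result is 2 or 5 bytes depending on a distance that is only known once the code in between has been sized, which is the reason given above for the extra pass.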
Re: assembler mechanics question
Posted: Fri Sep 10, 2010 10:45 am
by Ruud
How can assemblers do this in two passes?
Some assemblers will make more than two passes, although I don't know how common this is.
Mine simply loops until all labels are resolved. An unresolved label means there is an error.
Re: assembler mechanics question
Posted: Fri Sep 10, 2010 12:48 pm
by OwenS
The typical way is to assemble assuming the best case (so, all variables in zero page), then have the linker (or equivalent) enlarge the instructions that don't fit, then try again. Because you never shrink instructions, you can't get stuck in an endless loop.
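A rough Python sketch of this grow-only approach, applied to the original poster's example. To make something actually grow, the origin is moved to $00F8 so the code straddles the page boundary; that origin, the tuple encoding, and the function names are assumptions for the demo, not part of any real toolchain.

```python
PROGRAM = [
    ("label", "ep1"), ("lda", "label4"), ("rts",),
    ("label", "label1"), ("resb", 1),
    ("label", "ep2"), ("lda", "label3"), ("rts",),
    ("label", "label2"), ("resb", 1),
    ("label", "ep3"), ("lda", "label2"), ("rts",),
    ("label", "label3"), ("resb", 1),
    ("label", "ep4"), ("lda", "label1"), ("rts",),
    ("label", "label4"), ("resb", 1),
]


def layout(program, origin, absolute):
    """Place labels; an lda is 3 bytes if its index is in `absolute`, else 2."""
    addresses = {}
    pc = origin
    for index, item in enumerate(program):
        kind = item[0]
        if kind == "label":
            addresses[item[1]] = pc
        elif kind == "lda":
            pc += 3 if index in absolute else 2
        elif kind == "rts":
            pc += 1
        elif kind == "resb":
            pc += item[1]
    return addresses


def relax(program, origin):
    """Start with every lda as zero page and only ever enlarge."""
    absolute = set()
    while True:
        addresses = layout(program, origin, absolute)
        grown = {
            index
            for index, item in enumerate(program)
            if item[0] == "lda"
            and index not in absolute
            and addresses[item[1]] > 0xFF
        }
        if not grown:
            return addresses, absolute
        absolute |= grown  # never shrinks, so the loop must terminate


addresses, absolute = relax(PROGRAM, origin=0x00F8)
print(sorted(absolute), {name: hex(value) for name, value in addresses.items()})
```

Each round can only add entries to `absolute`, and there are finitely many instructions, so the loop ends after at most one round per `lda` plus one. The price is that the result may not be the smallest possible in unusual cases, since an instruction that was enlarged is never reconsidered.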
Posted: Sat Sep 11, 2010 12:56 am
by tano6502
Yes, I know how TeX does it.

Thanks,
tano
Posted: Sun Sep 12, 2010 4:39 am
by teamtempest
This is only a concern if an instruction is used in a context where more than one address mode is valid. My assembler normally chooses direct page if it can completely evaluate the operand field expression on the first pass AND the high byte of the result matches the direct page. This value is zero unless the programmer has informed the assembler to use some other value (e.g., on the 65816). Otherwise it chooses absolute mode (even if long mode is legal).
The assembler's default choice can be overridden by the programmer on an instruction-by-instruction basis, i.e., use direct page instead of absolute or vice versa. This is my alternative to multiple passes attempting to create the shortest code sequence.
In your example my assembler by default would use absolute addressing for the instructions at ep1: and ep2: because it would not know the values of label4 or label3 when it reached ep1: and ep2:. It would use zero-page addressing for the instructions at ep3: and ep4: because it would recognize that label2 and label1 referenced zero page locations.
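The default rule described here (zero page only when the label is already known on the first pass and lies in page zero) fits in a short two-pass sketch. This is Python written for illustration, not teamtempest's assembler; it assumes the tuple encoding below, a direct page fixed at zero, and the standard opcodes $A5 (LDA zp), $AD (LDA abs) and $60 (RTS).

```python
PROGRAM = [
    ("label", "ep1"), ("lda", "label4"), ("rts",),
    ("label", "label1"), ("resb", 1),
    ("label", "ep2"), ("lda", "label3"), ("rts",),
    ("label", "label2"), ("resb", 1),
    ("label", "ep3"), ("lda", "label2"), ("rts",),
    ("label", "label3"), ("resb", 1),
    ("label", "ep4"), ("lda", "label1"), ("rts",),
    ("label", "label4"), ("resb", 1),
]


def assemble_two_pass(program, origin=0x0000):
    symbols, modes = {}, {}
    pc = origin
    # Pass 1: place labels and fix each lda's mode from what is known so far.
    for index, item in enumerate(program):
        kind = item[0]
        if kind == "label":
            symbols[item[1]] = pc
        elif kind == "lda":
            target = symbols.get(item[1])  # None for a forward reference
            known_zp = target is not None and target <= 0xFF
            modes[index] = "zp" if known_zp else "abs"
            pc += 2 if known_zp else 3
        elif kind == "rts":
            pc += 1
        elif kind == "resb":
            pc += item[1]
    # Pass 2: emit bytes, reusing the modes chosen in pass 1 unchanged.
    code = bytearray()
    for index, item in enumerate(program):
        kind = item[0]
        if kind == "lda":
            target = symbols[item[1]]
            if modes[index] == "zp":
                code += bytes([0xA5, target])
            else:
                code += bytes([0xAD, target & 0xFF, target >> 8])
        elif kind == "rts":
            code.append(0x60)
        elif kind == "resb":
            code += bytes(item[1])  # reserved space, emitted as zero bytes
    return symbols, modes, bytes(code)


symbols, modes, code = assemble_two_pass(PROGRAM)
print([modes[index] for index in sorted(modes)], code.hex())
```

Run on the example, the sketch picks absolute for the loads at ep1 and ep2 and zero page for those at ep3 and ep4, matching the description above.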
Posted: Wed Sep 15, 2010 2:16 am
by dclxvi
In some two-pass assemblers the "use absolute if any label(s) in the operand is not (yet) defined" rule is used to provide a way of forcing absolute addressing. For example:
Code:
      LDX #3
      LDA LFF,X ; this is abs,X which accesses location $0102
LFF   EQU $FF
      EOR LFF,X ; this is zp,X and accesses location $0002
For this to work, the assembler must use the same width (zp or abs) for a given line in both passes, but it is surprisingly common for assemblers to fail to get this right, and subsequent addresses wind up getting calculated incorrectly. Thus, it's wise to carefully check the assembler output if you ever assemble any code (rather than data) at an address on the zero page.
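The two addresses in the example follow from how the 6502 computes indexed addresses: zp,X wraps within page zero, while abs,X carries into the high byte. A quick Python check of the arithmetic (the function names are invented for this sketch):

```python
def effective_zp_x(base: int, x: int) -> int:
    """zp,X: the sum wraps around inside page zero."""
    return (base + x) & 0xFF


def effective_abs_x(base: int, x: int) -> int:
    """abs,X: the sum is a full 16-bit address."""
    return (base + x) & 0xFFFF


print(f"abs,X: ${effective_abs_x(0x00FF, 3):04X}")
print(f"zp,X:  ${effective_zp_x(0xFF, 3):04X}")
```

So the same source text `LFF,X` reaches two different locations depending only on which width the assembler picked for that line.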