
All times are UTC




Posted: Wed Jun 15, 2016 5:04 pm
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
I've yet to encounter phase errors. How do they occur? The only thing I can think of right now is something like:
Code:
* = $FD
LDA label
label .DW 1,2,3


The problem here being that we don't know yet whether label is ZP or not, and especially in this case, the answer depends on how the LDA is assembled.

I account for this right now by basically using ZP if I know at the time of the first pass that it's ZP. Since the majority of ZP is used for fixed locations that can be pre-defined in the code, this has not been a real problem. I could see later on simply adding a declaration to force the instruction to ZP if the developer "knows" this is going to be ZP, even if the assembler doesn't yet.
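
A minimal sketch of that first-pass rule in Python. It is not taken from any assembler in this thread; `pick_lda_mode` and the dictionary used as a symbol table are made up for illustration. The zero-page form is chosen only when the symbol is already known and fits in one byte; anything else gets the three-byte absolute form.

```python
# Illustrative only: names and table layout are assumptions, not any poster's code.
LDA_ZP = 0xA5    # opcode for LDA zero page (2 bytes total)
LDA_ABS = 0xAD   # opcode for LDA absolute (3 bytes total)

def pick_lda_mode(symbols, name):
    """Return (opcode, size) for 'LDA name' using only what pass 1 knows so far."""
    value = symbols.get(name)              # None: label not defined yet
    if value is not None and 0 <= value <= 0xFF:
        return LDA_ZP, 2
    return LDA_ABS, 3                      # unknown or above $FF: reserve absolute

symbols = {"zp_ptr": 0x20}                 # ZP equates declared before the code
print(pick_lda_mode(symbols, "zp_ptr"))    # already known to be in zero page
print(pick_lda_mode(symbols, "label"))     # forward reference
```

Because the size never changes after the first pass, later passes cannot move any label, which is why this rule avoids phase errors at the cost of an occasional wasted byte.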

The way I do things now pretty much relies on happenstance to get by. Simply put, as the instructions are assembled, if I know the answer at initial assembly time, I use it and mark the reference as resolved. After initial assembly, my second phase goes through and resolves the remaining references.

What I may be missing is detection of a nested self-referential expression:
Code:
a = 1
b = c - a
c = b - a


I may endlessly loop here (until the stack implodes) right now, rather than having a first class detection of the reference. But that should be easy enough to fix.
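
One common way to get that first-class detection is to remember which symbols are currently being evaluated. The Python sketch below is only an illustration under simplifying assumptions: definitions are either an integer or a `(left, right)` pair meaning `left - right`, and `evaluate` and `CircularReference` are hypothetical names.

```python
class CircularReference(Exception):
    """Raised when a symbol's definition leads back to itself."""

def evaluate(name, defs, in_progress=None, values=None):
    """Evaluate a symbol; defs maps a name to an int or a (left, right) pair."""
    in_progress = set() if in_progress is None else in_progress
    values = {} if values is None else values
    if name in values:                      # already resolved earlier
        return values[name]
    if name in in_progress:                 # we came back to a symbol still being worked on
        raise CircularReference(name)
    in_progress.add(name)
    definition = defs[name]
    if isinstance(definition, int):
        result = definition
    else:
        left, right = definition            # (left, right) means left - right
        result = (evaluate(left, defs, in_progress, values)
                  - evaluate(right, defs, in_progress, values))
    in_progress.discard(name)
    values[name] = result
    return result

# The example from above: a = 1, b = c - a, c = b - a
defs = {"a": 1, "b": ("c", "a"), "c": ("b", "a")}
try:
    evaluate("b", defs)
except CircularReference as err:
    print("circular reference through", err)
```

The recursion depth is bounded by the number of symbols, so the stack no longer grows without limit on a cycle.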

The original goal of the assembler was to assemble the original FIG-Forth 6502 listing, which it does fine, and that is why I haven't run into these other issues yet.

Do you have other examples of how I might get a phase error?


Posted: Wed Jun 15, 2016 8:54 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
In simple assembly, most phase errors will, as you say, come from variables not being assigned before they're encountered in the code, so the assembler cannot yet know whether to use, for example, LDA ZP or LDA absolute. There are other possible causes though.

A simple example might be that you load an address as a literal and store it in a variable, something like:
Code:
        LDA  #<StringAddr    ; low byte
        STA  Foobar
        LDA  #>StringAddr    ; high byte
        STA  Foobar+1

but you decide that if the low byte is 00, you'll omit the LDA# and replace STA with STZ to be more efficient; so you use conditional assembly, test the address, and possibly shorten it up. The first time through, if the string is further down, the assembler has not encountered it yet, so it defaults to 0, the LDA# is omitted, and STZ is assembled. Later, the string is encountered, and its low address byte happens to be $FE. On the second pass, the LDA# gets put back in, which moves everything after it two bytes further along, so now the string starts at an address whose low byte is 00. As you can see, no number of passes will ever resolve it.

(I put this in '02 code. On the '816 you'd probably do LDA#<16-bit_addr>, STA, and assuming the address is not 0000, there's no point in testing it.)

If you're using strings in the assembly like the above, you'll probably have a bunch of them, and put the code in a macro to make invocations of it more concise, perhaps like:

Code:
        PUT  StringAddr, in, Foobar

I think I've only had the situation once in 6502 assembly where two or three passes weren't enough. It was 25 years ago and I can't remember anymore what special thing I was doing in a macro, but it made the source code that invoked it easier to handle and maintain. I used it in over 30 places in the code, and it took that many passes to resolve, because the invocations referenced each other in a chain, values in each one depending on ones in the next, which depended on ones in the next, etc. The assembler was fast enough that the large number of passes was not really a problem.
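
To make the STZ example concrete, here is a small Python simulation of repeated passes. It is a sketch under stated assumptions, not a real assembler: Foobar is assumed to be an absolute address (so STA and STZ are both 3 bytes), the string is assumed to follow the four instructions directly, and `layout` and `assemble` are hypothetical names.

```python
def layout(origin, assumed_addr):
    """One pass: return where the string lands, given the address assumed so far."""
    size = 0
    if assumed_addr & 0xFF != 0:
        size += 2          # LDA #<StringAddr is only assembled when the low byte is non-zero
    size += 3              # STA or STZ Foobar (absolute, 3 bytes either way)
    size += 2              # LDA #>StringAddr
    size += 3              # STA Foobar+1
    return origin + size

def assemble(origin, max_passes=10):
    """Repeat passes until the address stops changing; return (address, passes)."""
    addr = 0                                  # unknown forward reference defaults to 0
    for n in range(1, max_passes + 1):
        new_addr = layout(origin, addr)
        if new_addr == addr:                  # nothing moved: layout is stable
            return addr, n
        addr = new_addr
    raise RuntimeError("phase error: no stable layout after %d passes" % max_passes)

print(assemble(0x2000))      # settles after a few passes
try:
    assemble(0x10F6)         # string flips between $10FE and $1100 forever
except RuntimeError as err:
    print(err)
```

With the code at $10F6 the string alternates between $10FE (LDA# omitted) and $1100 (LDA# restored), so the pass limit is the only thing that ends the loop.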

I've run into the phase-error problem a lot more in PIC16 work since it has mickeymousities like jump tables that cannot be allowed to straddle a 256-byte page boundary unless you use more instructions to make it work.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Wed Jun 15, 2016 9:24 pm
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
Okay. Short term, I was simply not going to allow conditional assembly to work on expressions that could not be resolved. Dunno how far that will take me. My major motivator for conditional assembly and macros was to write Forth word headers using a macro.

Off the top of my head, I can't really picture how I would organize the code generation process to handle that situation. First thoughts would be to break the code stream into blocks of uncertainty. In the most extreme case, each instruction depends on the previous instruction to get its address. For example, a macro expansion would be a separate block of uncertainty. It depends on the previous block, and the next block would depend on the macro expansion. You then use the expressions to create a dependency tree for these blocks, do a quick dependency sort, and start resolving things.
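
The "dependency sort" step can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The block names and the `resolution_order` helper are invented for this illustration; how an assembler would actually split the stream into blocks is left open.

```python
from graphlib import TopologicalSorter, CycleError

def resolution_order(depends_on):
    """depends_on maps a block name to the set of blocks it needs resolved first."""
    return list(TopologicalSorter(depends_on).static_order())

# Hypothetical blocks: the macro expansion needs the code before it,
# and the code after the macro needs the expansion.
blocks = {
    "before_macro": set(),
    "macro_expansion": {"before_macro"},
    "after_macro": {"macro_expansion"},
}
print(resolution_order(blocks))

try:
    resolution_order({"x": {"y"}, "y": {"x"}})   # two blocks that need each other
except CycleError:
    print("blocks depend on each other; no single order resolves them")
```

A cycle between blocks is exactly the case that a plain sort cannot handle and that needs repeated passes or an error message.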

As I said, I already sort of do this. But maybe I should defer macro expansion until resolution, rather than inject it straight into the stream like I'm thinking right now. The only potential issue here is that a macro expansion can introduce a global label, but maybe that won't be a problem. I guess the trick will be distinguishing between a simple label resolution and a code expansion.

I dunno, have to play with it I guess. It almost sounds similar to a linking problem to be honest.


Posted: Wed Jun 15, 2016 11:18 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
Quote:
My major motivator for conditional assembly and macros was to write Forth word headers using a macro.

Here's mine, in my '816 ITC Forth:
Code:
HEADER:         MACRO NAME, precedence          ; Lay down name and link fields.

 IF HEADERS? && ! OMIT_HEADERS  ; If HEADERS is true and OMIT_HEADERS is false,
                                ; then go ahead and lay down the header.
    last_NFA:   SETL    new_NFA
     new_NFA:   SETL    $
                DFB     precedence | {npc-$-1}, NAME
         npc:                           ; Use this to calc name length above.
        IF      $ & 1                   ; If next addr is odd,
                DFB     0               ; add a 0 byte before you
        ENDI
                DWL     last_NFA        ; lay down the link field.
 ELSE
        IF      $ & 1
                DFB     0               ; Even if headers are not allowed,
        ENDI                            ; you should still align.
 ENDI
                ENDM
 ;-------------------

SETL is "SET Label," like EQU but you can change it as many times as you like.
DFB is "DeFine Byte," like .DB in some assemblers.
DWL is "Define Word, Low byte first."
$ by itself gives the program counter value at that point.

(Cells are aligned in this kernel. The ANS decompiling word SEE works out simpler when the first byte of a cell has an even-numbered address. This is my only real reason for the alignment. DOES> compiles a null alignment byte after the JSR _does in the parent word so the CFA following it will be aligned. Interestingly, this also makes _does simpler.)
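
For readers who don't use this assembler, the bytes the macro lays down can be modelled in a few lines of Python. This is only a reading of the macro above, with assumptions: `lay_header` is a made-up name, the precedence value for NOT_IMMEDIATE is taken to be 0, and names are plain ASCII.

```python
def lay_header(out, origin, name, precedence, last_nfa):
    """Append one header to bytearray `out`; return the new header's NFA."""
    nfa = origin + len(out)                   # new_NFA: SETL $
    encoded = name.encode("ascii")
    out.append(precedence | len(encoded))     # precedence bits OR'ed with the name length
    out += encoded                            # the name itself
    if (origin + len(out)) & 1:               # next address odd: add a 0 byte to align
        out.append(0)
    out += last_nfa.to_bytes(2, "little")     # link field, low byte first (DWL)
    return nfa

NOT_IMMEDIATE = 0x00                          # assumed value, for illustration only
image = bytearray()
first = lay_header(image, 0x8000, "R>", NOT_IMMEDIATE, 0)
second = lay_header(image, 0x8000, "OVER", NOT_IMMEDIATE, first)
print(hex(first), hex(second), image.hex(" "))
```

Each call returns the address that the next header's link field should point to, which is the role `last_NFA` plays in the macro.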

Here are some typical usages:
Code:
        HEADER "R>", NOT_IMMEDIATE      ; ( -- n )
R_FR:   PRIMITIVE
        PLA
        DEX_DEX                         ; like JMP PUSH
        PUT_TOS
 ;-------------------
        HEADER "R@", NOT_IMMEDIATE      ; ( -- n )
Rfetch: PRIMITIVE                       ; I uses this code too.
        LDA     1,S                     ; Stack-relative addressing.
        DEX_DEX                         ; These two lines are like JMP PUSH.
        PUT_TOS
 ;-------------------
        HEADER "OVER", NOT_IMMEDIATE    ; ( n1 n2 -- n1 n2 n1 )
OVER:   PRIMITIVE
        DEX_DEX
        LDA     4,X
        PUT_TOS
 ;-------------------
        HEADER "DROP", NOT_IMMEDIATE    ; ( n -- )
DROP:   DWL     POP                     ; CFA points to code at POP.
 ;-------------------
        HEADER "DUP", NOT_IMMEDIATE     ; ( n1 -- n1 n1 )
DUP:    PRIMITIVE
        DEX_DEX
        LDA     2,X
        PUT_TOS
 ;-------------------
        HEADER "?DUP", NOT_IMMEDIATE    ; ( 0 -- 0   |   n -- n n )
?DUP:   PRIMITIVE
        LDA     0,X
        BNE     DUP+2           ; Go to the body of DUP if cell was non-0.
        GO_NEXT                 ; Otherwise, move on without doing anything.
 ;-------------------



Posted: Thu Jun 16, 2016 1:01 am
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
I'm not going to be much help here, because the assembler can't do conditional assembly yet, and macros are really, really simple. Sorry.


Posted: Fri Jul 01, 2016 9:24 pm
Joined: Mon Jan 26, 2015 6:19 am
Posts: 85
scotws wrote:
I'm not going to be much help here, because the assembler can't do conditional assembly yet, and macros are really, really simple. Sorry.

A FORTH assembler can readily do conditional assembly.
Code:
: PUSHREGS, PHA,
65C02 @ IF
   PHX, PHY,
ELSE
   TXA, PHA, TYA, PHA,
THEN ;

(This assumes that the accumulator doesn't need to be preserved.)

Being a one pass assembler, the FORTH assembler has difficulty dealing with unknown variables (although it can be made to resolve the variables afterwards provided you know if it is a one or two byte variable). Being label-less helps in this regard.
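
The "resolve afterwards, provided the size is known" idea is usually called backpatching. The Python sketch below shows it generically; it uses named labels for readability (unlike a label-less Forth assembler), handles only absolute JMP, and `assemble_one_pass` is a hypothetical name.

```python
def assemble_one_pass(program, origin=0):
    """program: list of ('label', name) or ('jmp', name). Returns the machine code."""
    out = bytearray()
    labels = {}
    fixups = []                               # (offset in out, label name) still to be filled in
    for kind, name in program:
        if kind == "label":
            labels[name] = origin + len(out)
        elif kind == "jmp":
            out.append(0x4C)                  # JMP absolute
            fixups.append((len(out), name))
            out += b"\x00\x00"                # size is known (two bytes), value is not
        else:
            raise ValueError("unknown item: %r" % kind)
    for offset, name in fixups:               # patch the operands once every label is known
        out[offset:offset + 2] = labels[name].to_bytes(2, "little")
    return bytes(out)

code = assemble_one_pass(
    [("jmp", "end"), ("label", "start"), ("jmp", "start"), ("label", "end")],
    origin=0x1000,
)
print(code.hex(" "))
```

The source is read once; the only extra work is the short fixup loop at the end.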


Posted: Sun Aug 28, 2016 1:03 pm
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
After dogfooding Tinkasm with Liara Forth snippets some more, I've streamlined a bunch of stuff. The string directives have been folded into the normal data directives, so we have
Code:
.byte "Kaylee", 0, .msb 8000, { var * 7 + 1 }
for instance (getting rid of a lot of messy code in the process). Single characters are now accepted as 'a', which was overdue. Obviously found a few more bugs. Probably not faster because it uses more regex, but if you're looking for speed, this is not the assembler for you anyway.

Also, learning basic Go (golang) has taught me the value of one of its "minor" features: It includes a formatting program (gofmt) that is run automatically by most IDEs (and of course the vim plugin) that makes sure that every Go program is formatted the exact same way, all the way down to the spaces (see https://blog.golang.org/go-fmt-your-code). It's hard to explain what a difference this makes - your mind begins to expect things to be in certain places, and reading code becomes so much easier. It's even better than Python. So at some point there will be a formatting tool included with Tinkasm. I'm still working out the indentation details, though.

More importantly, now that the opcode notation and directive selection are pretty stable, I'll be adding a conversion tool to generate other formats, at least semi-automatically. The first target obviously will be the official WDC syntax from the Manual, and probably As65. This should be pretty straightforward (he said, tempting the gods of coding), and will make it easier for people to use stuff written in Tinkasm.

I'll be sticking with the Python version for a while to come, at least until I've figured out how to include conditional assembly. When it's pretty feature complete, I'll see about a Go version for speed.


Posted: Sun Aug 28, 2016 10:27 pm
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
So the basic formatter was quicker to write than expected. Even better, my secret hope has been realized: It seems to be good enough that you don't have to give a damn about indentation while typing, but can just let the machine take care of it afterwards (yes, I'm really that lazy). At the moment, we can go from
Code:
.mpu 65816
.origin 8000

.equ athena 01
.equ zeus 02
.equ poseidon 03

.native
loop lda.# 00 ; naught for all!
sta.x 1000
bra loop

.byte 01,    02 ; bad spaces!
.end
to
Code:
        .mpu 65816
        .origin 8000

        .equ athena 01
        .equ zeus 02
        .equ poseidon 03

        .native
loop            lda.# 00  ; naught for all!
                sta.x 1000
                bra loop

        .byte 01, 02  ; bad spaces!
        .end
automatically. Of course, that's not what it looks like in vim, because of the Tinkerer's Assembler plugin:
[Attachment: Screenshot_2016-08-29_00-17-04.png]

What is still missing (apart from some edge cases I've probably overlooked) is the block formatting for .equ and .byte directives. So by the time I'm done, the Greek gods should look like this:
Code:
        .equ athena   01
        .equ zeus     02
        .equ poseidon 03
But that's enough for today.
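
The rules visible in the before/after example can be sketched in Python as follows. This is not Tinkasm's formatter: the column numbers are read off the example, `MNEMONICS` is a tiny stand-in for a real opcode table, strings containing semicolons are not handled, and `format_line` and `align_equ` are hypothetical names.

```python
import re

MNEMONICS = {"lda.#", "sta.x", "bra"}   # stand-in for the real opcode table

def format_line(line):
    """Directives at column 8, instructions at column 16, labels at column 0."""
    code, sep, comment = line.partition(";")
    words = code.split()
    if not words:
        text = ""
    elif words[0].startswith("."):                   # directive
        text = " " * 8 + " ".join(words)
    elif words[0] in MNEMONICS:                      # instruction without a label
        text = " " * 16 + " ".join(words)
    else:                                            # label, then optional instruction
        text = words[0].ljust(16) + " ".join(words[1:])
    text = re.sub(r",\s*", ", ", text).rstrip()      # exactly one space after commas
    if sep:                                          # two spaces before a trailing comment
        text = (text + "  " if text else "") + "; " + comment.strip()
    return text

def align_equ(lines, indent=8):
    """Pad the names in a run of '.equ name value' lines so the values line up."""
    parts = [line.split() for line in lines]         # ['.equ', name, value]
    width = max(len(p[1]) for p in parts)
    return [" " * indent + f"{p[0]} {p[1].ljust(width)} {p[2]}" for p in parts]

print(format_line("loop lda.# 00 ; naught for all!"))
print(format_line(".byte 01,    02 ; bad spaces!"))
print("\n".join(align_equ([".equ athena 01", ".equ zeus 02", ".equ poseidon 03"])))
```

Block alignment needs to see the whole run of `.equ` lines before it can print any of them, which is why it is a separate step from the line-by-line indentation.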


Posted: Wed Aug 31, 2016 8:10 am
Joined: Sat Dec 07, 2013 4:32 pm
Posts: 246
Location: The Kettle Moraine
whartung wrote:
I've yet to encounter phase errors. How do they occur? The only thing I can think of right now is something like:
Code:
* = $FD
LDA label
label .DW 1,2,3


The problem here being that we don't know yet whether label is ZP or not, and especially in this case, the answer depends on how the LDA is assembled.

I account for this right now by basically using ZP if I know at the time of the first pass that it's ZP. Since the majority of ZP is used for fixed locations that can be pre-defined in the code, this has not been a real problem. I could see later on simply adding a declaration to force the instruction to ZP if the developer "knows" this is going to be ZP, even if the assembler doesn't yet.

I do exactly what you do. If the label has already been determined to be in zero page, then it's zero page. Else, it's assumed to not be.

My reasoning is that if you're going to put code in zero page, you can deal with your own optimisation, anyway. But if it's ever really an issue, my assembler allows you to do the equivalent of a Pascal FORWARD. That is, you can pre-define a label, so long as you know its final address.

I do have a mechanism to use when you don't know the final address. I have .PH placeholders. They look just like data but they only increment the assembler's program counter, they don't output any bytes. If you don't issue an explicit address after them, all following addresses will be incorrect. Basically it allows me to set aside space for variables (especially in zero page) with calculated addresses without writing anything to the file.
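
In Python terms, a placeholder of this kind might behave like the sketch below. It assumes only what is described above (the program counter advances, no bytes are written); the tuple format and the `assemble` helper are invented for the example.

```python
def assemble(items, origin):
    """items: ('bytes', label, data) emits data; ('ph', label, count) only reserves space."""
    pc, out, labels = origin, bytearray(), {}
    for kind, label, arg in items:
        if label:
            labels[label] = pc
        if kind == "bytes":
            out += bytes(arg)
            pc += len(arg)
        elif kind == "ph":
            pc += arg                 # program counter moves, nothing is written
        else:
            raise ValueError("unknown item: %r" % kind)
    return labels, bytes(out)

labels, image = assemble(
    [("ph", "ptr", 2), ("ph", "count", 1), ("bytes", "table", [1, 2, 3])],
    origin=0x10,
)
print(labels, image.hex(" "))
```

Here `table` gets address $13 but is the first byte in the output, which illustrates why an address has to be set again after the placeholders before emitting real data.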

As for Garth's situation, I simply don't substitute. If the programmer wants optimal code, he has to write it optimally. The only time I've ever made changes to the code is to insert an NOP to keep a JMP vector off of a page boundary. But since reducing to a two-pass assembler, I don't even do that anymore. The assembler will issue a warning if a case like that arises.
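
Such a warning could be as simple as the check below. This is a guess at the condition, assuming the concern is a two-byte vector whose low byte sits at $xxFF (the case where the NMOS 6502's indirect JMP fetches the high byte from the wrong page); the function names are hypothetical.

```python
def vector_straddles_page(vector_addr):
    """True if a 2-byte vector starting here has its bytes in different 256-byte pages."""
    return (vector_addr & 0xFF) == 0xFF

def check_vector(name, vector_addr):
    """Return a warning string, or None if the vector is safely inside one page."""
    if vector_straddles_page(vector_addr):
        return "warning: vector %s at $%04X crosses a page boundary" % (name, vector_addr)
    return None

print(check_vector("irq_vec", 0x12FF))
print(check_vector("nmi_vec", 0x1300))
```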


Posted: Wed Aug 31, 2016 8:56 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
KC9UDX wrote:
I do exactly what you do. If the label has already been determined to be in zero page, then it's zero page. Else, it's assumed to not be.

Yep. I can't think of any time I did not get all the ZP declarations out of the way at the beginning of the main file, before any of the code starts. That way, in the code, any reference to something that has not yet been defined is automatically known to be non-ZP.



Posted: Wed Aug 31, 2016 11:04 am
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
Formatter works, though needs more testing. Would have gotten it done a day sooner, except I tried to be clever and handle normal labels and labels before data blocks in the same pass. Note to self: You're not that clever. :shock: Just take it step by step.

Next step (apart from more dogfooding) is the converter. But I'm afraid real life is making demands again ...


Posted: Wed Aug 31, 2016 1:37 pm
Joined: Sat Dec 07, 2013 4:32 pm
Posts: 246
Location: The Kettle Moraine
I handle all labels in the first pass of two. All addresses are resolved in the first pass, and labels are addressed accordingly. That's really the only reason for that pass. If it weren't for labels, an assembler could work in one pass. Or, if labels were all defined in advance, everything could be done in one pass, like Pascal.
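
A bare-bones Python illustration of that division of labour, assuming every instruction's size is known from its mnemonic alone. Only NOP and absolute JMP are modelled, and `two_pass` and the tuple format are invented for the example.

```python
SIZES = {"nop": 1, "jmp": 3}
OPCODES = {"nop": 0xEA, "jmp": 0x4C}

def two_pass(source, origin):
    """source: list of (label or None, mnemonic, operand label or None)."""
    # Pass 1: only walk the sizes, so every label gets its final address.
    labels, pc = {}, origin
    for label, op, _operand in source:
        if label:
            labels[label] = pc
        pc += SIZES[op]
    # Pass 2: emit bytes; every label is known by now, forward or backward.
    out = bytearray()
    for _label, op, operand in source:
        out.append(OPCODES[op])
        if op == "jmp":
            out += labels[operand].to_bytes(2, "little")
    return bytes(out)

source = [
    (None, "jmp", "end"),      # forward reference
    ("loop", "nop", None),
    (None, "jmp", "loop"),     # backward reference
    ("end", "nop", None),
]
print(two_pass(source, 0x0200).hex(" "))
```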


Posted: Thu Jan 19, 2017 1:17 pm
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
After dogfooding the assembler with enough Liara Forth code to produce about 2k of binary output, I've extensively rewritten it. The main new feature is that the code listing now tells you what it thinks the register width is for any given instruction of the 65816. For example, let's say that a coder named Silly Scot forgot to put a size switch between an 8-bit character routine and a 16-bit number routine:
[Attachment: silly_test.png]
The assembler has no idea that it was supposed to go to 16-bit for the number, and creates an 8-bit immediate assignment. Strange effects ensue, much to the confusion of Silly Scot. He checks the listing:
Code:
  10:000 | DONE lbl | na  8  8 | 00e002 |             | stuff_with_strings                   
  11:000 | DONE ins | na  8  8 | 00e002 | e2 20       |                 sep 20               
  11:001 | DONE ctl | na  8  8 |        |             |         .!a8
  12:000 | DONE ins | na  8  8 | 00e004 | ad 00 ff    |                 lda 00ff00           
  13:000 | DONE ins | na  8  8 | 00e007 | 8d 00 d0    |                 sta 00d000           
  14:000 | DONE ins | na  8  8 | 00e00a | 60          |                 rts                 
  15:000 | DONE wsp | na  8  8 |        |             |
  16:000 | DONE lbl | na  8  8 | 00e00b |             | stuff_with_numbers                   
  17:000 | DONE ins | na  8  8 | 00e00b | a9 11       |                 lda.# 02211         
  18:000 | DONE ins | na  8  8 | 00e00d | 8d 00 c0    |                 sta 00c000           
  19:000 | DONE ins | na  8  8 | 00e010 | 60          |                 rts
Now, even Silly Scot can see what he did wrong! After inserting a .a16 directive, we get:
Code:
  10:000 | DONE lbl | na  8  8 | 00e002 |             | stuff_with_strings                   
  11:000 | DONE ins | na  8  8 | 00e002 | e2 20       |                 sep 20               
  11:001 | DONE ctl | na  8  8 |        |             |         .!a8
  12:000 | DONE ins | na  8  8 | 00e004 | ad 00 ff    |                 lda 00ff00           
  13:000 | DONE ins | na  8  8 | 00e007 | 8d 00 d0    |                 sta 00d000           
  14:000 | DONE ins | na  8  8 | 00e00a | 60          |                 rts                 
  15:000 | DONE wsp | na  8  8 |        |             |
  16:000 | DONE lbl | na  8  8 | 00e00b |             | stuff_with_numbers                   
  17:000 | DONE ins | na  8  8 | 00e00b | c2 20       |                 rep 20               
  17:001 | DONE ctl | na 16  8 |        |             |         .!a16
  18:000 | DONE ins | na 16  8 | 00e00d | a9 11 22    |                 lda.# 02211         
  19:000 | DONE ins | na 16  8 | 00e010 | 8d 00 c0    |                 sta 00c000           
  20:000 | DONE ins | na 16  8 | 00e013 | 60          |                 rts
And all is well, and Silly Scot rejoices.
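
The width tracking shown in the listings can be mimicked in a few lines of Python. This is a simplified model, not Tinkasm's code: it only tracks the accumulator width, starts in 8-bit mode by assumption, ignores the REP/SEP instructions that the real directives also emit, and `assemble_immediates` is a hypothetical name.

```python
def assemble_immediates(lines):
    """lines: ('.a8',), ('.a16',) or ('lda.#', value). Returns (width, bytes) per lda.#."""
    a_width = 8                                   # assumed starting width
    listing = []
    for line in lines:
        if line[0] == ".a8":
            a_width = 8
        elif line[0] == ".a16":
            a_width = 16
        elif line[0] == "lda.#":
            nbytes = a_width // 8                 # operand size follows the current A width
            mask = 0xFFFF if a_width == 16 else 0xFF
            operand = (line[1] & mask).to_bytes(nbytes, "little")
            listing.append((a_width, bytes([0xA9]) + operand))
        else:
            raise ValueError("unknown line: %r" % (line,))
    return listing

for width, code in assemble_immediates([("lda.#", 0x2211), (".a16",), ("lda.#", 0x2211)]):
    print(width, code.hex(" "))
```

The first `lda.#` is silently truncated to one operand byte, which is the mistake that the width column in the listing makes visible.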

Actually, under the hood, I've pretty much rewritten the whole thing. The original version was based on the assumption that Python would be slow, and tried to get rid of as much information as possible from pass to pass. But thanks to that nice Mr. Moore, assembling what I have so far in Liara Forth is pretty speedy, at least on my machine. So instead, this version tries to keep as much information as possible. Still speedy:
Code:
Code listing for file liaraforth.tasm
Generated on Thu Jan 19 12:12:22 2017
Target MPU: 65816
External files loaded: 4
Number of passes executed: 35
Number of steps executed: 10
Assembly time: 0.04607 seconds
Code origin: 006000
Bytes of machine code: 1998
I think I can live with 0.046 seconds for 2k of binary. The added information makes the listing a lot more informative (and easier to write). Still missing is the S28 code, that's next, but Reality (TM) is intruding again, so that may take a while.


Posted: Thu Jan 19, 2017 1:23 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Hmm, 20 assembly runs per second... should be enough.

It's a nice realisation that code doesn't have to be super-clever, and can instead be simpler or more clear. I tend towards the premature optimisation - perhaps it's a bad habit which comes from the days of much slower systems. Or just a bad habit which comes from liking puzzles, rather than finishing projects!


Posted: Thu Jan 19, 2017 1:32 pm
Joined: Sat Dec 07, 2013 4:32 pm
Posts: 246
Location: The Kettle Moraine
In my line of work, code must always be as easy to diagnose as possible, even at the cost of being very inefficient in speed and memory. It bugs me, but at least I don't normally have to write it.

