6502.org Forum

All times are UTC




Posted: Tue Jan 06, 2015 12:38 am
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
So since everybody keeps talking about how trivial it is to write an assembler in Forth, I tried it myself. Introducing "A Typist's 65c02 Assembler in Forth" (https://github.com/scotws/tasm65c02), a cross-assembler in gforth with labels and a modified syntax aimed at ten-finger typists. This is the first BETA version (and my first time on GitHub).

Background
I need the practice in more complex Forth, and I'm probably going to have to write my own 65816 assembler at some point anyway, so I thought I'd start out with something I know. The program is brute force in its approach: Whereas Forth assemblers traditionally aim for the smallest possible memory footprint, this one assumes lots of RAM and processing power, but should make adapting it to a different processor easy. Also, the syntax tries to be far more user-friendly.

Syntax
Because of the focus on using as little memory as possible, Forth assemblers tend to have strange syntaxes. Take this well-known version from William Ragsdale (1982, http://www.forth.org/fd/FD-V03N5.pdf):
Code:
               .A ROL,
              1 # LDY,
          DATA ,X STA,
          DATA ,Y CMP,
             6 X) ADC,
         POINT )Y STA,
         VECTOR ) JMP,
That's bizarre with brackets closing that were never opened and commas on the wrong side of letters. I always felt that this was too alien and too hard to read. Actually, I've never liked conventional assembler syntax anyway because all those "$" and "(" require shift keys and make it hard (or at least slower) for ten-finger typists. So if I was going to have to roll my own syntax, I decided I might as well aim for a "typist friendly" variant. Hence the name.

Everything (well, almost everything) in the Typist's Assembler is lower-case and the addressing modes are added to the opcode after a dot (the "tail"), with Absolute Mode being the "tailless" version. This gives us:
Code:
    mode                      conventional      typist's
    implied                   dex                    dex
    accumulator               inc                    inc.a
    absolute                  lda $1000         1000 lda
    immediate                 lda #$00            00 lda.#
    absolute x indexed        lda $1000,x       1000 lda.x
    absolute y indexed        lda $1000,y       1000 lda.y
    absolute indirect         jmp ($1000)       1000 jmp.i
    indexed indirect          jmp ($1000,x)     1000 jmp.xi
    zero page                 lda $10             10 lda.z
    zero page x indexed       lda $10,x           10 lda.zx
    zero page y indexed       lda $10,y           10 lda.zy
    zero page indirect        lda ($10)           10 lda.zi
    zp indirect x indexed     lda ($10,x)         10 lda.zxi
    zp indirect y indexed     lda ($10),y         10 lda.ziy
    relative                  bne $2000         2000 bne
There is one special case: Because AND is also a Forth word, its Absolute Addressing opcode gets a dot as "and." We don't need dollar signs because Forth uses HEX and DECIMAL and whatnot. Note the "i" for indirect mode mirrors the placement of the bracket in the conventional syntax. A small loop example:
Code:
              lda #$00                        00 lda.#
              tax                                tax
    loop1:                        -> loop1
              sta $1000,x                   1000 sta.x
              dex                                dex
              bne loop1                    loop1 bne
Operand comes before opcode as usual in Forth; formatting aligns the opcode body in a "column". The alignment is the part that takes the most getting used to so far -- at some point I'll bite the bullet and set up the correct vi functions to automate this. Note that "lda.#" doesn't violate the "no uppercase" rule because it is lowercase on a German keyboard. YKMV.
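
To make the encoding of the loop example concrete, here is a minimal sketch in Python (not the assembler's own Forth source) that turns the typist's-syntax lines above into bytes. The opcode values are the standard 65C02 encodings; the names OPCODES, BRANCHES, assemble and SOURCE are hypothetical, and only the five mnemonics used in the loop are covered. It handles backward references only, so a label must be defined with "->" before it is used.

```python
# Sketch only: a tiny Python model of the loop example, not the author's Forth code.
# Opcode bytes are standard 65C02 encodings; all names here are hypothetical.

OPCODES = {
    "lda.#": (0xA9, 1),  # immediate: one operand byte
    "tax":   (0xAA, 0),  # implied: no operand
    "sta.x": (0x9D, 2),  # absolute x indexed: two operand bytes, little-endian
    "dex":   (0xCA, 0),  # implied: no operand
    "bne":   (0xD0, 1),  # relative: one signed offset byte
}
BRANCHES = {"bne"}


def assemble(lines, origin=0):
    """Assemble typist's-syntax lines; labels must be defined before use."""
    code = bytearray()
    labels = {}
    for line in lines:
        words = line.split()
        if not words:
            continue                      # skip blank lines
        if words[0] == "->":              # "-> name" records the current address
            labels[words[1]] = origin + len(code)
            continue
        mnemonic = words[-1]              # operand first, opcode last
        opcode, size = OPCODES[mnemonic]
        code.append(opcode)
        if size == 0:
            continue
        operand = words[0]
        # A known label wins; anything else is read as a hex number.
        value = labels[operand] if operand in labels else int(operand, 16)
        if mnemonic in BRANCHES:
            # Offset is relative to the address after the offset byte.
            offset = value - (origin + len(code) + 1)
            if not -128 <= offset <= 127:
                raise ValueError("branch out of range")
            code.append(offset & 0xFF)
        else:
            code += value.to_bytes(size, "little")
    return bytes(code)


SOURCE = """
          00 lda.#
             tax
-> loop1
        1000 sta.x
             dex
       loop1 bne
"""

print(assemble(SOURCE.splitlines()).hex())  # a900aa9d0010cad0fa
```

The last byte, FA, is the branch offset -6: the label sits at address 3 and the instruction after the bne starts at address 9.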

Labels
Forth assemblers have historically tried to avoid labels because of the space thing. As Brad Rodriguez puts it in his (absolutely invaluable, don't-try-this-at-home-without-it) articles on Forth assemblers (http://www.bradrodriguez.com/papers/tcjassem.txt):
Quote:
(F)orth assemblers favor label-free, structured assembly code for a pragmatic reason: in Forth, it's simpler to create assembler structures than labels! The structures commonly included in Forth assemblers are intended to resemble the programming structures of high-level Forth.
Except that I really like labels. It turns out that "backward references" (like the one in the loop above) are trivial, but that for "single-pass assemblers" like this one, "forward references" are a major pain. I actually ended up reading a book on this (Assemblers and Loaders, David Salomon 1993, http://www.davidsalomon.name/assem.adve ... semAd.html) and then figuring out how to do singly linked lists in Forth. That part was not quite as trivial.

The result is not optimal, but this is as far as I can go with my current skill level. Backward references just get the label introduced with "->" as above. Forward references have to distinguish between jumps and branches. They are prefixed with a special command, either "j>" or "b>" (yes, those are upper case, but they are the easiest to see).
Code:
             j>  frog jsr
                      nop
             b>  dogs bra
                      nop
        -> dogs
                      brk
        -> frog
                      inc
                 dogs bra
(Yes, I know that code won't run.) Note that once the label is defined, we revert to "normal" use of labels. What happens under the hood is that each j> and b> adds an entry in that label's list and a dummy value in the assembled machine code. When the label is reached, that list is unwound, the dummy values are replaced, and the definition is replaced by a simple, new one. I tried to write the code in such a way that expanding the system for 65816 references will hopefully be fairly easy.
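
The patching scheme described above can be sketched in Python as follows. This is an illustration of the general fixup-list idea, not the author's Forth implementation: the class OnePass and its methods are hypothetical, a plain Python list per label stands in for the linked list, and the sketch decides by itself whether a label is already known, whereas the real assembler asks for an explicit "j>" or "b>". The "inc" line is assumed to be the accumulator form (0x1A).

```python
# Sketch only: models the fixup-list idea in Python; not the author's Forth code.
# All class and method names are hypothetical.

class OnePass:
    def __init__(self, origin=0):
        self.origin = origin
        self.code = bytearray()
        self.labels = {}    # name -> address, known once "-> name" is reached
        self.pending = {}   # name -> list of (position, kind) waiting for it

    def emit(self, *byte_values):
        self.code.extend(byte_values)

    def _offset(self, target, position):
        # Branch offsets are relative to the address after the offset byte.
        offset = target - (self.origin + position + 1)
        if not -128 <= offset <= 127:
            raise ValueError("branch out of range")
        return offset & 0xFF

    def jump(self, opcode, name):
        """Instruction with a 16-bit absolute operand, e.g. jsr."""
        self.emit(opcode)
        if name in self.labels:
            self.emit(*self.labels[name].to_bytes(2, "little"))
        else:
            # Forward reference: remember where to patch, emit a dummy.
            self.pending.setdefault(name, []).append((len(self.code), "abs"))
            self.emit(0x00, 0x00)

    def branch(self, opcode, name):
        """Instruction with an 8-bit relative operand, e.g. bra."""
        self.emit(opcode)
        if name in self.labels:
            self.emit(self._offset(self.labels[name], len(self.code)))
        else:
            self.pending.setdefault(name, []).append((len(self.code), "rel"))
            self.emit(0x00)

    def label(self, name):
        """Define a label and unwind its list of pending references."""
        address = self.origin + len(self.code)
        self.labels[name] = address
        for position, kind in self.pending.pop(name, []):
            if kind == "abs":
                self.code[position:position + 2] = address.to_bytes(2, "little")
            else:
                self.code[position] = self._offset(address, position)


asm = OnePass()
asm.jump(0x20, "frog")    # j> frog jsr
asm.emit(0xEA)            #         nop
asm.branch(0x80, "dogs")  # b> dogs bra
asm.emit(0xEA)            #         nop
asm.label("dogs")         # -> dogs
asm.emit(0x00)            #         brk
asm.label("frog")         # -> frog
asm.emit(0x1A)            #         inc (accumulator form assumed)
asm.branch(0x80, "dogs")  #    dogs bra
print(asm.code.hex())     # 200800ea8001ea001a80fc
```

Once "dogs" is defined, the final bra resolves immediately as a backward reference (offset FC, i.e. -4), while the two earlier dummies were patched to 08 00 and 01 when their labels were reached.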

Usage
In gforth, you load the program and then the file you want to assemble.
Code:
        include tasm65c02.fs
        include example.fs
Note the .fs file type -- as far as the machine is concerned, this is a Forth file, not assembler. If everything goes right, you end up with ( addr u ) as the location and size of the machine code on the stack. DUMP will print this on the screen, and there is a SAVE <filename> assembler command you can use. There are a few more commands; see MANUAL.txt and the files example.fs and rom.fs for details.

Problems
The syntax breaks the standard for just about every assembler ever, and the "column" alignment can be fussy without editor support (pending). The forward reference code in its current state is inelegant. There is currently no way to include external assembler files. The program is fairly large for what it does (13 KB with comments), and relies on the hardware for speed. The program uses some gforth-specific Forth words such as NEXTNAME that are not available in ANSI Forth.

Conclusion
Except for the part with the forward references and list structures (which I assume real computer people learn in college), writing the assembler with this brute-force approach was in fact downright trivial, just as advertised. I can see how somebody who actually knows what he or she is doing can write one in a few hours. I would strongly recommend that anybody who is interested in Forth try this themselves, even if it is only for the "wow" effect. Though I've just started dogfooding the Typist's Assembler, I can already say that programming 65c02 assembler in Forth (which is what this amounts to) is a whole different game. Normal "macro" functions seem primitive, and having to put all the commands on separate lines feels restricting. My appreciation for Forth has grown enormously.

(Thanks again to Brad for the great article which gave me the push to try this, even if I ended up doing things differently.)


Posted: Tue Jan 06, 2015 5:03 am
Joined: Sun Apr 10, 2011 8:29 am
Posts: 597
Location: Norway/Japan
I like the idea of a typist's assembler, being a typist myself. Great idea! Of course, it's difficult to avoid the 'shift' key for everyone - on my keyboard layout, for example, the # character is shift-3. But I'll keep this idea in mind when I write my own assembler, which I've been wanting for quite a while.

-Tor


Posted: Wed Jan 07, 2015 2:05 am
Joined: Wed Jan 08, 2014 3:31 pm
Posts: 573
Interesting idea. It's a shame programming languages aren't designed with typing in mind. The C/C++ programming languages ruined my touch typing. All those brackets, braces, and parentheses coupled with Ctrl key combos for a typical programmer's editor are impossible to touch type.


Posted: Wed Jan 07, 2015 4:33 am
Joined: Sun Apr 10, 2011 8:29 am
Posts: 597
Location: Norway/Japan
You're right there - and it's even worse, I think, with my national keyboard.. {[]} characters need AltGR! I have a US keymap in my head too so I can use a keyboard like that also, but then I'll miss out on other characters for other kinds of writing so I prefer to stick with the national layout. A bit off-topic, but the operating system interface of the Norsk Data minicomputers I used to work with back in the day was optimal for a Nordic keyboard layout.. the '-' char is where a slash is on a US keyboard, and all commands were of the type 'LIST-FILE' and 'LIST-OPEN-FILES', 'LOOK-AT-MEMORY' and so on (no need to actually write upper case though). Everything could be shortened as long as it was unambiguous, so after you got proficient you would simply write 'LI-FI', or 'LI-O-F', or 'LO-A-M' (or even 'LO--M' if unambiguous) and so on. No shift keys, and very quick to enter. Particularly helpful before the days of tab-completion, not to mention when having to use a mechanical (paper) TTY.

-Tor


Posted: Wed Jan 07, 2015 5:41 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8429
Location: Southern California
I'm having trouble latching onto what you mean by a "typist's assembler;" but I might comment (to give ideas) that my HP-71 (and to a lesser extent, my HP-41) hand-held computer offers different kinds of key assignments, with key-assignment files that could be swapped out. The equivalent on a PC keyboard might be for example that <ALT-V> might give you "VARIABLE" and <ALT-C> might give you "CONSTANT" but you can make each key anything you want, whether typing aid (like my examples), direct-execution, or immediate execution. These might have been more valuable on the 60%-size QWERTY keyboard on the HP-71 though, which in my experience is good for not more than about 30wpm. (I type about twice that fast on a standard-sized keyboard, and I know people who can type close to 100wpm.)

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Wed Jan 07, 2015 9:02 pm
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3350
Location: Ontario, Canada
Nice work, Scot. Although I don't find the Ragsdale syntax objectionable, what you propose is also highly workable.
scotws wrote:
There is one special case: Because AND is also a Forth word, its Absolute Addressing opcode gets a dot as "and."
As a matter of personal preference, I would require all mnemonics (even implied, such as dex) to end with a dot. Yes, that means extra typing, but in my view avoiding special cases justifies the price. That's a personal preference, as I say. (And, as you may've guessed, I'm not a touch-typist.)

Also, it's conceivable that other name-collisions (besides AND) may arise, for example due to structured macros or due to additions to the Forth dictionary unrelated to the assembler.

Quote:
writing the assembler with this brute-force approach was in fact downright trivial, just as advertised. I can see how somebody who actually knows what he or she is doing can write them in a few hours. [...] My appreciation for Forth has grown enormously.
Always nice to see Forth praised, I say! BTW, it sounds as if your skills have also grown! :)

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Mon Feb 09, 2015 10:41 pm
Joined: Mon Jan 07, 2013 2:42 pm
Posts: 576
Location: Just outside Berlin, Germany
So this is nice: Samuel Falvo II got wind of my little assembler and was inspired to adapt it as a one-pass assembler for his Kestrel project based on the RISC-V CPU: https://github.com/sam-falvo/kestrel/tr ... 3/src/vasm

When I say "adapt", of course, I mean he rewrote the whole thing. But it is still nice to know that it was useful for somebody :D .

