Introducing a Tinkerer's Assembler for the 6502/65c02/65816
Posted: Fri Dec 04, 2015 1:32 am
It has very quickly become apparent that the little Forth single-pass assembler I wrote, as nice as it is, is not going to be enough for writing test cases for my new 65816 emulator. To save my sanity, I'll need something more powerful, like a classical two-pass assembler. Since I've gotten used to the Typist's Assembler Notation, it became clear that I'd either have to adapt an existing assembler or write my own.
But here's the problem: Classical two-pass assemblers are basically little compilers, which involve stuff like parsers and tokens and lexers and abstract syntax trees and whatnot. I'm sure all you real computer science types just whip these up off the top of your heads, but for us hobbyists, even the dragon book sounds scary. Adapting other people's code means understanding their parsers, etc., which sucks. It's just not easy to tinker with them.
At this point, Samuel A. Falvo II came to the rescue with a reference to "nanopass compilers" (http://kestrelcomputer.github.io/kestre ... -compiler/), based on a paper by Sarkar et al. (http://www.cs.indiana.edu/~dyb/pubs/nano-jfp.pdf). I didn't even get past the abstract before I was writing an assembler based on lots and lots of very small, easy-to-understand passes.
So here is the BETA of such an assembler for our favorite MPUs. The design goal - apart from the actual assembly thing - was to create something that would be very easy for a hobbyist to mess around with, which is why it is a "Tinkerer's Assembler" (https://github.com/scotws/tinkasm). It currently consists of more than 20 passes (more or less, depending on the processor), each aiming to be simple. The code itself is written not only in Python, a very widespread language with famously easy to understand code, but in "primitive" Python - no objects, no functional programming, no maps or filters, few list comprehensions. It uses IF/ELSE and TRY/EXCEPT to show the logic even where it is horribly inefficient. The code starts at the beginning, goes to the end, and then quits (which all really annoys pylint, by the way). It loads no external files and only standard "batteries included" external libraries.
[Image: Dummy code example with vim syntax highlighting for Typist's Assembler]
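To give a feel for the "lots of tiny passes" idea, here is a minimal sketch in the same sort of primitive, top-to-bottom Python. It is not code from tinkasm: the pass names, the variable names and the toy source syntax are all made up for illustration, and the source lines are not meant to be valid Typist's Assembler Notation. Each pass reads the list the previous pass produced and builds a new one.

```python
# Hypothetical sketch of a few "nano" passes. Nothing here is taken from
# the real tinkasm source; the toy syntax is an assumption as well.

raw_source = [
    "; toy program",
    "        .origin 8000",
    "",
    "loop    nop        ; do nothing",
    "        jmp loop",
]

# PASS 1: remember the original line numbers for later error messages
numbered = []
line_number = 0

for line in raw_source:
    line_number = line_number + 1
    numbered.append((line_number, line))

# PASS 2: strip comments (assumes ";" starts a comment)
no_comments = []

for number, line in numbered:
    position = line.find(";")

    if position == -1:
        no_comments.append((number, line))
    else:
        no_comments.append((number, line[:position]))

# PASS 3: drop lines that are now empty, trim trailing whitespace
no_empty = []

for number, line in no_comments:
    if line.strip() == "":
        continue
    else:
        no_empty.append((number, line.rstrip()))

# PASS 4: split off labels (assumes a label starts in the first column)
labels_split = []

for number, line in no_empty:
    words = line.split()

    if line[0] == " ":
        labels_split.append((number, "", words))
    else:
        labels_split.append((number, words[0], words[1:]))

for entry in labels_split:
    print(entry)
```

None of the passes is clever, and copying the whole list four times is wasteful, but each one can be read, changed or deleted without having to understand the others.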
The downside is that as a pure assembler, for obvious reasons, it sort of sucks. With our small file sizes, speed doesn't matter too much, but still. This is not the program you want to use for raw speed. Also, the current version is still pretty basic. For example, there are macros, but they don't have parameters yet (that comes next). It's also still missing the more advanced macro functions such as IF/THEN/ELSE. The lack of a real parser means that the math functions are rather primitive, pretty much limited to one operator and two operands, but for most assembler stuff, that might be enough.
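What "one operator and two operands" can look like without a real parser is sketched below. This is an assumption about the general approach, not the assembler's actual math code: the operator set, the decimal-only numbers and the requirement that everything is separated by spaces are all simplifications chosen for the example.

```python
# Hypothetical sketch: evaluate "<number>" or "<number> <operator> <number>"
# by splitting on whitespace. Not taken from the tinkasm source.

expressions = ["4096 + 2", "10 * 3", "255", "1 + 2 + 3", "7 / 0"]
results = []

for expression in expressions:
    words = expression.split()
    value = None    # None marks an expression we could not handle

    if len(words) == 1:
        # Just a number
        try:
            value = int(words[0])
        except ValueError:
            value = None
    elif len(words) == 3:
        # Exactly one operator between two operands
        try:
            left = int(words[0])
            right = int(words[2])
            operator = words[1]

            if operator == "+":
                value = left + right
            elif operator == "-":
                value = left - right
            elif operator == "*":
                value = left * right
            elif operator == "/":
                value = left // right    # integer division
            else:
                value = None
        except (ValueError, ZeroDivisionError):
            value = None
    else:
        # Anything longer would need a real parser
        value = None

    results.append((expression, value))

for expression, value in results:
    print(expression, "=>", value)
```

The first three expressions evaluate to 4098, 30 and 255. "1 + 2 + 3" is rejected because it has two operators, and "7 / 0" is caught by the TRY/EXCEPT.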
As an aid to tinkering, to see what happens step by step, the assembler can be told to produce human-readable snapshots after every pass. I've attached the stdout for the little test program above, which amounts to almost 1,000 lines. (The "frog" file name has no meaning; it's my version of foobar, and I just realized I forgot to change it.) It can also be told to print a listing file (which I'm still experimenting with) and a hexdump in ASCII.
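The snapshot idea is simple enough to sketch: after each pass, dump the current state of the line list in a readable form. The helper name `snapshot`, the `SNAPSHOTS` switch and the output layout below are hypothetical and only show the principle; the real assembler's option and format may well differ.

```python
# Hypothetical sketch of per-pass snapshots. Names and output format are
# made up for illustration and are not taken from tinkasm.

SNAPSHOTS = True    # assumed on/off switch

snapshot_lines = []

def snapshot(title, lines):
    """Record a human-readable dump of the list after a pass."""
    if SNAPSHOTS:
        snapshot_lines.append("--- after " + title + " ---")

        for number, text in lines:
            snapshot_lines.append("{0:4d}: {1}".format(number, text))

source = [(1, "nop ; first"), (2, ""), (3, "rts")]
snapshot("load", source)

# PASS: strip comments
stripped = []

for number, text in source:
    stripped.append((number, text.split(";")[0].rstrip()))

snapshot("strip comments", stripped)

# PASS: remove empty lines
no_empty = []

for number, text in stripped:
    if text != "":
        no_empty.append((number, text))

snapshot("remove empty lines", no_empty)

for line in snapshot_lines:
    print(line)
```

Even this three-line toy program produces eleven lines of snapshot output over three dumps, so it is easy to see how a small test file run through 20-odd passes gets to nearly 1,000 lines.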
So now I have a little-tested assembler to test my little-tested emulator with. This, ah, might not be the most traditional procedure.
I'll probably be playing around with the assembler first for obvious reasons (a rewrite of Tali Forth for the 65c02 might be in order). The good news is that I'm rapidly running out of ways to procrastinate and might have to actually get some real stuff done now ...