SWIFT - a FORTH-like language written for BBC Micro 1980s

davidmjc · Post by **davidmjc** » Sat May 31, 2025 1:27 am

I wonder if anyone here might be interested in a FORTH-like language that I wrote for the BBC Micro as a teenager in the 1980s. I called it "SWIFT" because it was faster than any other language available for the BBC Micro at that time (as measured by the Ackermann function test, which was popular then).

It was a language that compiled the source into a dense byte-code, which was then interpreted by a machine code program with a very tight loop to execute each bytecode instruction. This is what made it fast. What I didn't realise at the time, was that the interpreted byte-code approach would make the language very portable (as Java subsequently showed!).

The interpreter used a FORTH-like stack to execute a program, and so the source code was written in RPN, with some standard FORTH stack manipulation commands as basic operations.

To implement the compiler, I used bootstrapping: I wrote the compiler in SWIFT itself, and then wrote a simple BASIC program that was sufficient to compile this into executable byte code. Once I had this SWIFT compiler in byte code, I used it to compile its own source code to obtain an optimal version. I finally wrote an interface in BASIC to allow one to interpret or compile SWIFT code, and then tested it on the Ackermann function.

I had ambitions to extend the compiler (written in SWIFT) to convert infix notation to RPN, but I got distracted by other things such as A Levels and university.

GARTHWILSON · Post by **GARTHWILSON** » Sat May 31, 2025 2:43 am

It sounds like token-threaded Forth, and also like the user language of the HP-41 calculator/computer which made the most-used instructions (or the ones the designers thought would be most-used) a single byte each, but allowed for an unlimited number of other functions by their two-byte XROM numbers (not addresses). With that, and RPN, it made extremely efficient use of program memory.

I would be interested to hear how your Swift is different from Forth, and whether it could be molded (perhaps by the user himself) into a token-threaded Forth now that you probably know more about Forth.

BTW, Forth, Inc. has the names "SwiftForth" and "SwiftX" as registered trade marks.

BigEd · Post by **BigEd** » Sat May 31, 2025 6:29 am

Sounds very interesting to me, especially the bootstrapping! (But then I'm already in an Acorn frame of mind.)

davidmjc · Post by **davidmjc** » Mon Jun 02, 2025 1:00 am

Thank you for your interest. I think I knew more about FORTH in the 1980s than I do now (!) - DeGrandis-Harrison was almost certainly an inspiration at the time. However, I wanted to use the compilation step (from source into byte code) to make SWIFT programs more like other languages, so that the FORTH paradigm was more under the hood.

The bootstrapping was inspired by my parent's work on LSD (Language for Systems Development, no drugs involved!) in the late 60s and early 70s - a similar idea to C. I figured that if I could write the compiler in the language, then I could add a lot of syntactic sugar: for example, the compiler could convert infix notation into RPN. I also wanted the analogue of a disassembler, which would provide a readable and compilable version of any bytecode program (does FORTH provide such?). Almost none of these goals were achieved before I was distracted by A levels and University life.

The term "token-threaded" sounds like an accurate description of SWIFT - the latter was not an acronym for anything, just a random name to give to a fast language. If I rejuvenated it now, I would also emphasise the benefits of compact portable code: the bytecode interpreter was very small and simple, and so could be implemented on any microprocessor in almost no time: on the BBC micro, it had to fit into a just a few pages (one benefit of working with so little memory is efficiency!). It had a small read loop for the byte code, and a table of addresses for loop to act on each instruction it found.

I think I only needed 128 primitive instructions, and probably used the other 128 possibilities for small data. But I would have to dig though what I have left now, some of which is on floppy drives from 1980s, and the latter may not be recoverable. I am close to retirement, so this could be an interesting project, and I would welcome comments, suggestions, advice, contributions...

GARTHWILSON · Post by **GARTHWILSON** » Mon Jun 02, 2025 3:33 am

davidmjc wrote:

I also wanted the analogue of a disassembler, which would provide a readable and compilable version of any bytecode program (does FORTH provide such?).

The Forth word SEE de-compiles things, but of course won't give you source code that has the original indentation and other white space. It typically gives you not just the names of the words, but addresses and perhaps other things that would get in the way of recompiling without some manual editing. SEE has no idea what you want to do with various data; so for your convenience, I made mine present it four different ways, in four columns, both hex and decimal, both signed and unsigned.

Quote:

The term "token-threaded" sounds like an accurate description of SWIFT [...] I think I only needed 128 primitive instructions, and probably used the other 128 possibilities for small data. But I would have to dig though what I have left now, some of which is on floppy drives from 1980s, and the latter may not be recoverable.

The only problem I've found is my disc drive controllers going south, not the discs themselves.

I can give a few things that were done on the HP-41c/cv/cx calculator/computer which might give ideas to incorporate if you're not already past that point.

Entire rows in the 256-byte table are dedicated to certain things. The first 15 numeric labels (which are local to whatever program they're in, and you can have lots of programs in memory at once) are one byte each, for example LBL 03. When the program runs, a search for the label always goes forward, and if it's not found by the time the search reaches the end of the program, the search continues from the beginning. That way, these can be re-used if branches to them are always forward. (Numeric labels beyond LBL 14 will take two bytes each. You can also have local alpha labels, like a, b, c, and global alpha labels, like ENCRYPT, JTEMP, whatever.)
Similarly, the first 15 GTOs (go-to instructions) to numeric labels, like GTO 03, are one byte each, plus the address as described here: In the GTO <label>, the first time it's found, the address is compiled with the GTO, so the next time the same GTO is encountered, it just looks at that address and makes sure it's still the intended label, so it doesn't have to search again, so it's faster. If what's found there is different, because things have been scooted around in memory, for any of several possible reasons (mostly program editing or deletion), it will search again, and then compile the new address it found.
An entire row is dedicated to STO 00 to STO 15, to store the top-of-stack register into these data registers, and another row to RCL 00 to RCL 15, to recall any of the first 15 data registers to the top-of-stack register X. (The 41 does not allow named variables, but sometimes I write programs in a text editor with names and then make a copy and search-and-replace, names for register numbers. Files can have names though.) Stores and recalls involving registers other than those in the 00-15 range will take another byte. (With synthetic programming, you can also access registers the manufacturer never intended for users to even know existed! )
Indirects of course cannot have an address compiled, since it will not be known ahead of time what will be in the register that will point to the desired label or register.
Unless you use the text editor in the 41cx, you cannot enter text strings of more than 15 characters in one program instruction. If you want them longer, you have to concatenate. One row in the table is all instructions to tell that it's text, and how many characters; so for example, "Vcc" takes four bytes total.
The XROM numbers I mentioned earlier have only half a row. The whole XROM thing is for nearly limitless expansion, via plug-in modules, and on the newer hardware versions, ROM images that are in flash sections can be switched in and out. There's XR 0-3, XR 4-7...XR 28-31. (There are over ten times that many ROM images available; but ones that share numbers are ones that are not expected to be active in the machine at the same time. Also, in ROM address space, which is separate from data and user address space, there's a 64K limit, not including banking.) Addressing an XROM function takes only two bytes. After the XR byte mentioned here, there's one more byte; and two bits of it tell which number (for example in the 0-3 range, or the 4-7 range, etc.), and the six remaining bits tell which of 64 functions which are allowed in one module. I don't know how the OS looks up addresses of these functions quickly. There's more to it than that, with modern modules having sub-FATs and other tricks to get lots more there than Hewlett-Packard's designers ever intended. There are some amazing engineers and programmers still extending the 41 with new hardware and software. They are way, way beyond my own understanding of the insides of this thing. When you enter or view functions with XROM numbers, the functions' names (not XROM numbers) appear in the display.

davidmjc · Post by **davidmjc** » Tue Jun 03, 2025 11:49 pm

For the moment, I do not understand what you write, but once I recover what I did, I will let you know.

davidmjc · Post by **davidmjc** » Sat Jun 14, 2025 1:26 am

I was finally reunited with my notes and disks on Swift this week. I can't read the disks yet, but my notes begin with the assertion that "Swift is a threaded interpreted language", then include info on the general approach (e.g. a parameter stack and a return stack), a list of primitive commands, and handwritten source code for the interpreter (compiler/executer), where I jokingly note that the Swift compiler is "alas written in Swift"

Alas also, the notes do not include the 6502 assembly to execute compiled bytecode, but I now understand why it was so fast. In short, Swift exploited the fact that all program code would be below 8000 hex (ofc the rest is the BASIC interpreter and OS), so a program could just be a list of 16 bit addresses in the range 0-7FFF, together with primitive functions in the range 80-FF; the latter were machine coded at addresses linked from a single page. So all the main loop had to do was test the next byte for less or geq 80 hex, then, in one case get the next byte as an address, and in the other case do an indirect JSR to a primitive function - this barely 10 lines of machine code overhead compared with complete compilation into machine code.

BigEd may be interested to know a bit more about the bootstrapping. The interpreter (compiler/executor) did not need all the functionality of Swift, so I first wrote it in a kind of "Pig-Swift", with some added (low-level) shortcuts.. This was then easy to compile into bytecode using a program written in BASIC. Then with a fully functioning compiler, I could rewrite the compiler in Swift proper, then compile this. Of course a last check iwa to compile the compiler using itself and check the result was identical.,

GARTHWILSON · Post by **GARTHWILSON** » Sat Jun 14, 2025 2:40 am

That sounds like an excellent way to do token threading. I hadn't thought of that.

BigEd · Post by **BigEd** » Sat Jun 14, 2025 1:23 pm

Yes indeed, thanks for the note on the bootstrapping!

There is the historical idea of writing a compiler in its own language then hand-running it on itself to produce a binary. (Of course, minimalising the language and the compiler at this stage makes good sense - and indeed, your pig-swift sounds very much like that.)

BruceRMcF · Post by **BruceRMcF** » Mon Jun 16, 2025 3:55 am

davidmjc wrote:

... Of course a last check was to compile the compiler using itself and check the result was identical.

For those fortunate enough to have the less daunting task of porting a bytecode language that can be the first check on the bytecode interpreter ... since you already have the bytecode compiler as compiled on the original working source interpreter.

For the 65C02, an option for that kind of "half addresses, half byte code" token threaded system is to pad the start of all routines on an odd address, so that addresses can be stored in natural 65xx "little endian" order, and "BIT #1" can test whether it is the first byte of an address or one of 128 byte tokens.

Now, that might seem like it costs two clocks, but only for the routine addresses ... since it means the byte tokens are spaced two bytes apart, a table of primitive vectors can be set up in a page, say "primitive", and dispatched by "TAX : JUMP (primitive,X)":

Code: Select all

        INC IP
        BNE +
        INC IP+1
+       LDA (IP)
        BIT #1
        BNE ENTER
        TAX
        JMP (primitives,X)
...

(... or you can LDA (IP),Y for a faster increment of the IP but more complexity of ENTER.)

It still can work in NMOS6502 ... preload the $01 in some location in zero page (called ONE so you can do "BIT ONE"), and have a "DISPATCH: JMP (PRIMITIVE)" code in RAM somewhere, so you "STA DISPATCH+1 : JMP DISPATCH" ... but I suspect the extra 5 clock cycles overhead may leave the high bit as the flag on the inside track, speedwise.

If you have 32 byte words and a 5bit length entry in the prefix byte, 1 bit for an Immediate flag, and 1 bit for a smudge bit, you can have 1 bit for a primitive, and the outer interpreter knows the XT is not an address (bumped up by one if the address after the name field is even) but is the byte stored after the name, with a zero high byte, and does a C@ with the address following the name field, rather than doing a >ODD-PARITY operation.

And then XT, knows to only compile one byte if the high byte of the XT is 0.

If you have fully allocated the primitives, you can also tell between dictionary entries of primitives and compiled routines by putting all of the primitives in the dictionary first and storing a guard rail address where below the guard rail, all of the names are primitives ... but the bit in the prefix byte allows you to leave some primitives unallocated for the user to be able to define their own "fast primitives", which don't need "the following code is binary" primitive as a prefix to start the user-defined primitive.

It also allows you to organize the base dictionary with in an easier to port "eForth" style with few primitives and most of the system in compiled routines built on top ... to incrementally faster systems which implement more actions as primitives after profiling which compiled routine are taking up the most execution time, without having to reorganize the source every time a routine is rewritten as a system primitive.

davidmjc · Post by **davidmjc** » Wed Jun 18, 2025 3:59 pm

Many thanks for the interesting comments. Fortunately I had BASIC, so did not need to compile the compiler by hand - in fact I am now struggling to understand the compiler source code! I took the big-endian approach for convenience, but appreciate how a little-endian approach could be implemented.

One thing that I learned from this 1980s endeavour is that if you want to make an interpreted language fast, then you must optimise the main loop: there should be no redundant code therein.

BruceRMcF · Post by **BruceRMcF** » Wed Jun 18, 2025 9:29 pm

Another little endian approach which might work better for an NMOS 6502 may be where the routines are bumped if need be to start on an even address, and the lead little endian byte is shifted down. Then the shifted low byte always has the high bit clear, and primitive tokens are the bytes with the top bit set

Code: Select all

        LDA (IP) ; 20 clocks from this point to jump to Primitive
        ASL
        BCC ENTER
        STA DISPATCH+1
        JMP DISPATCH
...
; down in ZP:
DISPATCH:
        JMP (PRIMTABLE) ; "PRIMTABLE" is page aligned

Of course, doing this with a 65C02 will still be more efficient:

Code: Select all

        LDA (IP) ; 17 clocks from this point to jump to Primitive
        ASL
        BCC ENTER
        TAX
        JMP (PRIMTABLE,X)
ENTER:
        ...

SWIFT - a FORTH-like language written for BBC Micro 1980s

SWIFT - a FORTH-like language written for BBC Micro 1980s

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980

Re: SWIFT - a FORTH-like language written for BBC Micro 1980