daniMolina wrote:
Are there any better options?
Cheers!
Hi,
Better? I don't know. Let me describe an alternative...
While I use cc65 on the 65C02, I've not tried any C on the '816. The WDC compiler isn't an option for me - and ok, I've not actually tried it, but when investigating it, it's MS Windows only and while it can and does run under WINE, that's really not my thing.
So when I moved from the 65C02 to the '816, after porting my 65C02 OS which has ideas taken from the Acorn MOS, and capable of running some Acorn software in 65C02 emulation mode - e.g. BBC BASIC, I thought about all this 16-bit goodness and what to do with it.
I moved to the '816 because other projects seemed to be using it - it was mooted for the Commander X16 and the Foenix but they turned out to not happen or just be too expensive and taking too much time.
On a personal note, I'm a little disappointed with the '816. Yes, there were some good projects made with it - the Acorn Communicator then the Apple //gs and latterly the SNES but it's not trivial to code for. Going 65C02 to 65C816 is actually a big step. Issues you have to think about are switching between 8 and 16-bit register and memory modes, indexing beyond 64KB, extra stack addressing modes, movable direct page and many others. It has some nice features - maybe multi-tasking with each task having its own 64KB of RAM? Maybe some high level language to make use of the extra stack addressing modes? Or maybe even a set of good macros for assembler - but then you end up with something like Sweet16 (32?)/Acheron/etc ... But you still need to write code in assembler, even if it is a good macro assembler.
A good C compiler ought to be able to hide this from you - give you the ability to write programs > 64KB, address data structures > 64KB, but yes, there are trade-offs - manipulating what is effectively a variable pointer size - how much hand-holding do you want to give the compiler and how much do you want it to do for you? Anyway ...
So I made my system anyway - kept it simple, used GALs and an ATmega "host" processor - just because I could and it was actually relatively easy to adapt my old 65C02 board.
So here I was, with a 65816 system with 512KB of RAM on my desk - what to do? My intention had been a sort of "what if" ... as in what can I do with today's knowledge that I might not have thought of back then.
And the answer was along the lines of "forget it, there are better CPUs". In 1985 the 68K was real, ARM was on the horizon as was the 80386, and here in the UK we were looking at the 32-bit Transputer and the INS32016 (which turned out to be a bit of a dud, but that's another story).
So here I was, with a 65816 system with 512KB of RAM on my desk - what to do? I went back to my ideas of a self-hosting system - what language? Not C, so ... BCPL.
I've been an on/off fan of BCPL since the early 80's. It's capable of self-hosting and the compiler is fast enough on old hardware. The tricky thing was making the current 32-bit BCPL system run on a hybrid 16-bit system with an 8-bit memory bus. The compiler can output a few different code formats - one is a format designed to be translated to native code by some external program, another is a bytecode designed to be interpreted on anything you care to write the bytecode interpreter for. That's the default and is the route I took.
So I wrote my bytecode interpreter (the bytecode is called CINTCODE - Compact Intermediate Code) and I wrote it in such a way to make the memory of the '816 appear as a contiguous 32-bit address space. That way BCPL programs could allocate > 64KB of RAM and iterate over it without any issues. Yes, this can be done in assembler and I think (hope) using the WDC C compiler too, but I had BCPL and wanted to use it.
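To make the "contiguous 32-bit address space" idea a bit more concrete, here is a minimal sketch in C. It is not the actual VM (which is '816 assembler). It only shows the general trick of splitting a flat address into an '816-style bank byte plus a 16-bit offset, and building 32-bit accesses out of byte accesses so that a value can straddle a 64KB bank boundary. All names (ram, bank_of, offset_of, load32 and so on) are made up for illustration, and the little-endian byte order is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_SIZE 0x10000u          /* one '816 bank: 64KB */
#define MEM_SIZE  (8u * BANK_SIZE)  /* 512KB, as on the board described above */

static uint8_t ram[MEM_SIZE];       /* stand-in for the byte-wide RAM */

/* Split a flat VM address into the bank byte and the 16-bit offset
   that the CPU would actually put on the bus. */
static uint8_t  bank_of(uint32_t addr)   { return (uint8_t)(addr >> 16); }
static uint16_t offset_of(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }

/* Byte access through the bank:offset pair. */
static uint8_t load8(uint32_t addr) {
    assert(addr < MEM_SIZE);
    return ram[(uint32_t)bank_of(addr) * BANK_SIZE + offset_of(addr)];
}

static void store8(uint32_t addr, uint8_t value) {
    assert(addr < MEM_SIZE);
    ram[(uint32_t)bank_of(addr) * BANK_SIZE + offset_of(addr)] = value;
}

/* 32-bit access built from four byte accesses (little-endian is assumed here).
   Because each byte is addressed separately, a word may cross a bank boundary. */
static uint32_t load32(uint32_t addr) {
    return  (uint32_t)load8(addr)
         | ((uint32_t)load8(addr + 1) << 8)
         | ((uint32_t)load8(addr + 2) << 16)
         | ((uint32_t)load8(addr + 3) << 24);
}

static void store32(uint32_t addr, uint32_t value) {
    store8(addr,     (uint8_t)(value));
    store8(addr + 1, (uint8_t)(value >> 8));
    store8(addr + 2, (uint8_t)(value >> 16));
    store8(addr + 3, (uint8_t)(value >> 24));
}
```

With something like this underneath, a program running on the VM only ever sees flat addresses. For example, store32(0xFFFE, x) followed by load32(0xFFFE) round-trips even though the four bytes sit in two different banks.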
My bytecode VM is 16KB of '816 macro assembler. It's as fast as I can make it without spending 10x more time on it - loops are unrolled, the execution environment is essentially threaded, there are no JSRs in the run-time of the interpreter - the bytecode dispatcher is built into every single decoded byte-wide instruction (all 255 of them). This takes 29 cycles, or 37 cycles when it has to increment the high word of the VM's program counter. That in itself is the speed limitation - at 16MHz it's not too bad. Cintcode is very efficient and the compiler makes a good job of optimising it.
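For anyone who hasn't met the "dispatcher built into every instruction" style, here is a rough sketch of the idea in C. It is not Cintcode and not the real VM: the four opcodes are invented for the example, there is no checking of invalid opcodes, and it relies on the GNU C "labels as values" extension (GCC and Clang), so it is not standard C. The point is only that there is no central loop - every handler ends by fetching the next byte and jumping straight to its handler.

```c
#include <assert.h>
#include <stdint.h>

/* Toy opcodes, invented for this sketch; they are not real Cintcode. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

#define STACK_MAX 16

/* Run a toy byte-coded program and return the value left on top of the stack.
   Opcodes outside the table are not checked for. */
static int32_t run(const uint8_t *code) {
    /* One handler address per opcode (GNU C "labels as values"). */
    static void *const dispatch[] = { &&op_halt, &&op_push, &&op_add, &&op_mul };
    int32_t stack[STACK_MAX];
    int sp = 0;
    uint32_t pc = 0;

/* Fetch the next opcode and jump to its handler. This is repeated at the
   end of every handler instead of returning to a central loop. */
#define NEXT() goto *dispatch[code[pc++]]

    NEXT();

op_push:                         /* operand is the byte after the opcode */
    assert(sp < STACK_MAX);
    stack[sp++] = code[pc++];
    NEXT();
op_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
op_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
op_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Feeding it the bytes PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT computes (2 + 3) * 4. As a rough ceiling on the real thing: if the dispatch alone costs 29 cycles, then 16,000,000 / 29 gives about 550,000 dispatches per second at 16MHz, before counting the work each instruction actually does.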
So my Ruby '816 boots into the old RubyOS written in '816 code (really 65C02) and can load programs from disk (SD card) via the ATmega - that lets it run BBC BASIC with a RAM size of about 28KB, or it can load and run the BCPL OS. This is the 16KB bytecode interpreter/VM and that then boots directly into BCPL which loads up the rest of the BCPL OS. There is a shared library of routines and a command line interpreter. It's also multi-tasking, but bearing in mind there is no memory protection ...
Here is a demo - the "star" prompt is the RubyOS prompt and here we're in 65816 mode. I run the BCPL program which is a 16KB "ROM" image (BBC Micro users will be familiar with this - it's 16KB from 0x8000) that loads the initial BCPL bootstrap and that then loads up the library and finally the CLI.
https://www.youtube.com/watch?v=ZL1VI8ezgYc
I run a program to calculate Pi to 100 digits - it takes just over 6 seconds. I then execute a script that fires off 12 clock programs, all running concurrently (displaying a random time, but updating once a second), then I re-run the Pi program which then takes just over 8 seconds as you don't get something for nothing.
So I have a CLI, editor, compiler and some unix-like utilities like ls, mv, cp ... The world is my bi-valve, as they say...
There is work to do but I'm not sure I'm up for it right now. Some of the library isn't "thread safe" and while very usable it's not perfect.
So why am I typing all this up ... Don't know. I've sort of lost enthusiasm for it all and not done any work for a year or so, so thinking of re-kindling it - wondering if it could be ported to someone else's '816 system? Or just move onto something else...
And as far as porting goes ... Well, along with history, I have wondered "what next" and moved to another CPU - RISC-V. It's still "retro" from my point of view, being based on the designs of the early 80s... So I wrote a RISC-V emulator in BCPL, used that to run some C programs on my '816 board (C compiled to RV binary), then (re)wrote the BCPL bytecode VM in RISC-V assembler and ran that under my BCPL emulator and it worked and booted my BCPL OS. It wasn't fast, but it did run. Subsequently I've ported it to some real RISC-V hardware (ESP32-C3 with 400KB of RAM) where it absolutely flies. Cycle for cycle, the RISC-V version is 5+ times faster...
Cheers,
-Gordon