resman wrote:
Would it be possible to create a sub-forum, something like "6502 and Beyond", to explore such possibilities without polluting the true nature of this site?
Answering this first - That's up to the people who run this site - not me. I'd like to think another sub-forum would work, but who knows...
resman wrote:
Gordon-
I've also been contemplating my future direction with the 6502 environment. My own project, the AppleIIPi, was an attempt to integrate the Apple II and 6502 into a modern Linux environment. I still use this everyday and it satisfied that itch, but skipped the in-between stage of a modern equivalent of a retro 6502 platform.
Curiously, since moving to the Pi, more possibilities have opened up for me. One thing I was never able to do successfully with my 6502 SBCs was nice video, and I think that's something that still eludes others too. But... writing a 65C02 emulator in BCPL (or even ARM assembler) is relatively trivial; give it some video RAM, or a framebuffer of sorts, port my OS (or any other OS) to it, and I have a 65C02 with screen, keyboard and storage... Full circle, as it were. But it does go somewhat against my initial desire to have real, physical hardware in front of me.
I would also add that I'm not wedded to the Pi - it's convenient in this instance and I am very familiar with it but it's overly complex for this. At least I'm not running Linux on it. It's also a vehicle to making the bytecode part a "real" CPU implemented via the PiTubeDirect system on a BBC Micro.
Quote:
I think we both appreciate abstracting the 6502 through techniques such as bytecode VMs, so the idea of abstracting 6502 constructs to modern RISC CPUs would be a logical step. I'm intrigued by your experiments with porting the BCPL interpreter to ARM and RISC-V and the lessons learned.
Well, the experiments worked, and the main lesson here is: do it sooner.
The key thing that helped was having the OS and utilities written in a high level language. Binary compatibility was a bonus - easy for a bytecode interpreter - harder if I'd had to re-write the code generator in the compiler.
The ARM has some 6502 ideas - pipelined architecture, indexed addressing modes and similar condition codes come to mind, but somewhat expanded. The early ARMs had 10x the transistors of the 6502 though.
Also, running my own OS on a faster platform has actually enabled me to find bugs in the '816 version - mostly because I'm doing things on the ARM that I'd not think to do on the '816 as they'd take too long. Writing the Bubble Universe, for example, found some bugs in my terminal/graphics drivers, and writing some of the bytecode instructions in RISC-V then ARM gave me some ideas about improving the efficiency of the '816 code. There are only so many ways to calculate a Mandelbrot too, but now I can move from ASCII to colour graphics...
I'm now pondering if I ought to have made the jump to 64-bit. I started with 16-bit BCPL on the 8-bit 6502, then 32-bit BCPL on my Ruby 816 system - but am I now stuck at 32-bit on a 32-bit CPU? 64 bits may have to wait until I jump to a 64-bit CPU. (Taking BCPL to 64-bit is relatively trivial - there is just one bytecode instruction needed to handle it and the compiler does the rest.)
But that goes back to one desire I had, which was for a fully self-hosting platform. I know there are C compilers out there (for the 6502) and endless Basics, but it seems we've lost, or almost lost, the means to compile C on a 6502 system. I have the Aztec C disks for the Apple II, but would that help on my own 6502 SBC? Not really. Cross compiling? Well, anyone can do that, but we didn't do it in 1980, so why do it now? (And hands up - yes, I cross assemble 6502 and '816 code on my desktop.) When I started on the '816 system there was no workable C compiler for it that I could use.
The beauty of the bytecode is that I was able to run the OS and utilities as binaries on the other platforms without re-compiling them or the supporting libraries. I am able to compile the central 'exec' (the CLI and support) directly on the system, then install and test it without rebooting or resorting to cross compiling. (I can do this on the '816 too, but compiling it takes 2 minutes there rather than half a second on the ARM.)
And the high level language thing - I've developed a multi-tasking OS in it, with a nice (to me) CLI, the usual/generic utilities and so on, in far less time than if I'd done it all in assembler. Yes, there are trade-offs - speed being the main one - but historically, using a high level language has been the way forward for things like portability. Unix versions 1, 2 and 3 were in assembler, but v4 onwards in C; complexity was rising, and it ultimately enabled Unix to be ported to systems other than the PDP-11.
My first porting effort, to the RISC-V platform, did involve some changes to the BCPL and re-compiling bits of the bootstrap system, but mainly to do with passing in data like RAM regions and so on. When I developed the system initially, I had a concept of "Lo" RAM and "Hi" RAM. Lo RAM was Bank 0 in the '816 and was necessary because program stacks and the "global vector" had to live in Bank 0, as I was using 16-bit pointers to access that data. This was a huge speedup for those operations on the '816. So to launch a new program, I needed to allocate RAM for the program (Hi RAM) and RAM for the stack and global data (Lo RAM) - there are effectively 2 heaps and 2 versions of getvec (the BCPL malloc). I changed this slightly to "fast" and regular RAM for the RISC-V (same RAM, but on some systems one might actually be faster?) and kept this for the ARM.

But other than that there really weren't many changes needed - just the slog of translating the '816 code into RISC-V, then into ARM. It got easier, and as I learned more ARM it got faster: what can take 9 instructions in '816 land takes 6 in RISC-V land and just 2 in ARM land... (It's almost as if the ARM was designed to be an emulator for bytecodes!)
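The two-heap arrangement is simple to picture in code. This is only a sketch of the idea, assuming a bump-pointer allocator and made-up region sizes - the names (`getvec_from`, `launch`, `lo_heap`, `hi_heap`) are mine, and the real getvec in a BCPL system is doubtless a proper free-list allocator rather than this:

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the "two heaps" idea: one allocator instance per RAM
   region.  Region sizes and the bump-pointer strategy are purely
   illustrative. */
typedef struct { uint8_t *base; size_t size, used; } heap_t;

static uint8_t lo_ram[4096];     /* stands in for Bank 0 / "fast" RAM   */
static uint8_t hi_ram[65536];    /* stands in for program/regular RAM   */
static heap_t lo_heap = { lo_ram, sizeof lo_ram, 0 };
static heap_t hi_heap = { hi_ram, sizeof hi_ram, 0 };

/* getvec-style allocate: sized in 32-bit words, like BCPL vectors */
static void *getvec_from(heap_t *h, size_t words) {
    size_t bytes = words * sizeof(uint32_t);
    if (h->used + bytes > h->size) return NULL;   /* region exhausted */
    void *p = h->base + h->used;
    h->used += bytes;
    return p;
}

/* Launching a program needs RAM from BOTH regions:
   stack + global vector from Lo, code/data from Hi. */
static int launch(size_t stack_words, size_t prog_words) {
    void *stack = getvec_from(&lo_heap, stack_words);
    void *prog  = getvec_from(&hi_heap, prog_words);
    return stack != NULL && prog != NULL;
}
```

On the '816 the payoff is that everything handed out by the Lo heap is reachable with cheap 16-bit pointers; on the RISC-V and ARM ports the split survives as "fast" versus regular RAM even though both come from the same physical pool.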
And lacking real hardware initially, being able to use the existing system to emulate a RISC-V CPU was good too - good enough to let me boot the existing OS under the emulated RISC-V CPU. I had working ARM hardware and an existing framework that would let me boot the ARM via the network (PXE), which made testing the ARM version much easier. (Although I could just have booted it under Linux, I didn't.)
Quote:
I think what I'd like to investigate are extensions of the 6502 into the 32 bit realm. Not exactly the 65832 discussion (which didn't seem to drive all the 8 bit diehards away), but something along the lines of 6502 inspired macros that would expand to native 32 bit (perhaps even 64 bit) ARM/RISC-V code. I've also recently enjoyed playing with the lib6502 simulator that could easily be employed to provide 8 bit binary compatibility for 6502 code. I used this exact concept to implement a PLASMA testing platform under Linux, sans the ability to run native 32 bit code. lib6502 could easily be modified to allow jumping into the native 32 bit code environment that in turn could provide basic I/O and filesystem capabilities for whatever environment *it* was running under - from bare metal to Linux.
Well - given that the 6502 is Turing complete, then anything's possible.
Macros (see also the Acheron system), your PLASMA and things like Sweet-16 can be used to give you alternative environments - at the cost of execution time, but the convenience of using them can outweigh the disadvantages. When I was toying with another project (a high performance BASIC for the 6502), I wrote its memory allocator in Sweet-16....
The big thing I think is needed to move beyond 8 bits in hardware is a wider data bus. That alone will save cycles - but in the retro world it pushes up costs and potentially chip count. The '816 needs 2 cycles to read a 16-bit data value - or 4 if it were expanded to a 32-bit system... And it still infuriates me that I have to AND #$00FF when reading a byte from RAM in 16-bit mode.
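For anyone who hasn't hit this: with the accumulator in 16-bit mode, an LDA from an address fetches two bytes, so reading a lone byte drags its neighbour into the high half and you have to mask it off. A little C model of the annoyance (the memory contents here are just example values):

```c
#include <stdint.h>

/* Models the '816 annoyance: with a 16-bit accumulator, LDA fetches
   TWO bytes, so a lone byte arrives with its neighbour in the high
   half and must be masked with AND #$00FF. */
static uint8_t ram[] = { 0x34, 0x12, 0xFF };

static uint16_t lda16(unsigned addr) {        /* 16-bit mode LDA addr */
    return ram[addr] | (ram[addr + 1] << 8);
}

uint16_t read_raw(unsigned a)  { return lda16(a); }          /* neighbour leaks in */
uint16_t read_byte(unsigned a) { return lda16(a) & 0x00FF; } /* the AND #$00FF tax */
```

Two extra bytes and two extra cycles, every single time you want a byte.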
One issue with wider data and address buses is the chip pin count - that's where it starts to get beyond through-hole hobby reach. 32 data bits and even 24 address bits, plus supporting clock and I/O signals, really takes you into QFP or PGA style packages - neither of which is hobby-hostile, given a keen hobbyist - or we're back to some sort of SoC with embedded RAM. My ideal "wide" 6502 would have native-width buses with a generous linear address space (and I really only need 24 bits at most) and the ability to fetch bytes - but the issue then may be instruction width and alignment. Early >8-bit CPUs didn't handle unaligned data well - even the ARM didn't until v6, so you had to do loads, masks and shifts. What about load immediate? LDA #... You need 5 bytes for that to load a 32-bit value, and when memory is 32 bits wide, is that going to take up 2 x 32-bit words with 3 bytes wasted, or do we do it another way? Many questions to ask and think about.
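To make the "loads, masks and shifts" concrete, here's what fetching a misaligned 32-bit little-endian value byte-by-byte looks like when the hardware won't do it for you. (A real pre-v6 ARM routine would more likely use word loads plus the rotate behaviour of LDR; byte loads just keep the sketch simple, and the function name is mine.)

```c
#include <stdint.h>

/* Assemble a 32-bit little-endian value from any byte address -
   the software fallback on CPUs without unaligned access.  This is
   exactly the LDA # problem too: a 5-byte "opcode + 32-bit immediate"
   straddling two 32-bit memory words needs this kind of reassembly. */
static uint32_t load32_unaligned(const uint8_t *p) {
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Four loads, three shifts, three ORs - versus a single cycle on a bus that handles it natively. That cost is exactly why the instruction width/alignment question matters for any "wide" 6502.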
-Gordon