6502.org Forum  Projects  Code  Documents  Tools  Forum
Posted: Thu Apr 27, 2023 9:17 pm

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
Proxy wrote:
Hugh Aguilar wrote:
Nobody has ever indicated any practical use for the bloated monster

Now you've lost me... RISC-V is designed to be simple and easy to understand, with optional extensions to increase functionality if desired. It's pretty much the opposite of bloated as you only include what you need.
Most current RISC-V systems aim for modern functionality so they use the more complex Privileged Spec for things like memory protection, multiple system modes, etc.
Of course such features could seem like "bloat" to people who've mostly worked with basic 8 bit systems. :wink:

I'm really aiming for simplicity with my processor design, although it is 16-bit rather than 8-bit.
The target is the iCE40HX8K FPGA, which is the lowest-cost FPGA that I could find.

My previous experience was writing MFX (assembler/simulator and Forth cross-compiler) for the MiniForth processor at Testra. This did not have a 16-bit addition instruction. It did not have any way to change the PC other than the NXT instruction (it did have conditional loads of registers though). Up to five instructions could be packed into a single opcode and all five would execute in parallel in one clock cycle. All of the instructions were very simple. There were no operands or addressing-modes. Programming in assembly-language was not easy --- the upside was that the MiniForth fit on a Lattice isp1048 PLD and ran at 40 MHz, so it was less-expensive and faster than the competitor's MC68000 board.

This is the way that it goes.
The possible features are: joyful assembly-language programming, high performance and low-cost --- pick any two!

Compared to the MiniForth, my new design is high-level and feature-full --- everybody else will groan at how low-level and feature-bereft it is.

Proxy wrote:
Hugh Aguilar wrote:
they just say that is a "joy to use" (I have no idea what that means).

did you never write assembly in a specific ISA and think to yourself "damn this is nice to use"? that's exactly what they mean, including myself.
RISC-V has a very nice ISA to write assembly for. fully orthogonal (no specific registers like on the 6502/Z80), no status register to worry about, and of course with macros you can build yourself powerful pseudo instructions.

You would hate my design! :|
Every opcode is 16-bits. There are no operands except that some instructions have a 9-bit literal embedded in the opcode. Every instruction takes one clock cycle. The goal is to have a 100 MHz clock. Making this a "joy" for assembly-language programmers is not a goal --- it is not orthogonal at all; all of the instructions are for specific registers and most of them have side-effects --- it will be difficult to program in.

The primary feature that I'm aiming for is to support emulation of a byte-code VM efficiently. This provides diversity because there can be any number of byte-code VMs supported without changing the processor itself. It is only necessary to learn the assembly-language in order to write the primitives for the byte-codes and to write ISRs, but most programming will be done in a high-level language generating code for a byte-code VM.

I had expected this forum to be enthusiastic about making the W65c816 ISA a supported byte-code VM. Other retro processors, such as the MC6809 or MC6811 could also be supported, but these would be more difficult because they have multi-byte opcodes.

So far, everybody on this forum seems to be enthusiastic about the RISC-V, but nobody is interested in the W65C816.
I'm only interested in the W65c816 because it has single-byte opcodes and it already has a C compiler available. It is somewhat difficult to program in assembly-language, because it has very few registers. You can't have both a lot of registers, and a single-byte opcode, because every register that you add needs multiple instructions to work with it, so adding more registers results in running out of room in your 256-slot opcode-map. W65c816 programs overcome the problem of having few registers by relying heavily on direct-page variables --- this works for me because I can have several direct-pages mapped to internal memory for speed.

As an historical note, the 6502 was heavily criticized for having only a few registers, all of which were 8-bit. The Z80 manual from Zilog actually had an appendix devoted to criticizing the 6502 on this basis. The 6502 was actually a good design though, because 16-bit variables in zero-page could be used as pointers. My design also relies heavily on a zero-page, except that it has 9-bit addresses rather than 8-bit, and it is word-addressed rather than byte-addressed so you get 512 16-bit words rather than 128 16-bit words.
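To put numbers on the zero-page comparison, here is a minimal Python sketch. The function name and parameters are made up for illustration; the only inputs are the address widths and addressing granularity stated above.

```python
def zero_page_words(address_bits, bytes_per_address, word_bytes=2):
    """Count the 16-bit words reachable through a short direct address."""
    total_bytes = (1 << address_bits) * bytes_per_address
    return total_bytes // word_bytes

# 6502: 8-bit zero-page address, byte-addressed (1 byte per address).
mos6502_words = zero_page_words(8, 1)

# The design described above: 9-bit address, word-addressed (2 bytes per address).
new_design_words = zero_page_words(9, 2)

print(mos6502_words, new_design_words)  # prints: 128 512
```

The extra address bit doubles the reach, and word addressing doubles it again, which is where the factor of four comes from.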


Posted: Thu Apr 27, 2023 9:46 pm

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Hugh Aguilar wrote:
I had expected this forum to be enthusiastic about making the W65c816 ISA a supported byte-code VM. Other retro processors, such as the MC6809 or MC6811 could also be supported, but these would be more difficult because they have multi-byte opcodes.


Possibly due to the ease of still being able to buy real hardware?

Hugh Aguilar wrote:
So far, everybody on this forum seems to be enthusiastic about the RISC-V, but nobody is interested in the W65C816.


There is little enthusiasm for the '816, IMO, because it was never really widely adopted. It found a niche in the Apple IIgs and the Acorn Communicator, then some years later the SNES, but as a custom chip with extras. It's not as easy to code for as the RISC-V (or most other CPUs - at least the ones I've used). I think people are put off by the bank latching needed and the bank programming needed. More hardware and more thought required on the software side. There is also an argument that if you're going to use C then you might as well use C on any old CPU ...

Hugh Aguilar wrote:
I'm only interested in the W65c816 because it has single-byte opcodes and it already has a C compiler available.


That's also true of the 65(c)02. It's a single byte opcode with variable length operands - from 0 to 2 bytes on the 6502 or 0 to 3 bytes in the '816. The bytecode I'm using in my BCPL system also has variable length operands - from 0 to ... well, 100s or 1000s in the case of a large switch statement... but more typically 0 to 4 bytes (4 bytes to load a 32-bit constant, so 5 bytes in total, non-aligned).

Hugh Aguilar wrote:
It is somewhat difficult to program in assembly-language, because it has very few registers. You can't have both a lot of registers, and a single-byte opcode, because every register that you add needs multiple instructions to work with it, so adding more registers results in running out of room in your 256-slot opcode-map. W65c816 programs overcome the problem of having few registers by relying heavily on direct-page variables --- this works for me because I can have several direct-pages mapped to internal memory for speed.


FWIW: The bytecode in the BCPL system has just 3 x 32-bit registers. 2 work in a push-down stack manner; the 3rd can be loaded from the other 2 with a single byte instruction. (There are other registers - a PC, Stack and Globals pointer, with the PC and Stack working like the 'internal' ones in the 65xx; the globals pointer is a different concept.)

So your CPU sounds interesting - but would I then get it to emulate a 65816? No. I'd work on making it emulate the BCPL bytecode directly - cut out the middleman as it were... My RISC-V implementation doesn't use any RAM for the state of the VM/Bytecode - it's all held in registers. This makes it blindingly fast, clock for clock compared to my '816 implementation. The instruction dispatch is 6 cycles in RV land compared to 27 in '816 land. I've thought a lot about trying to implement it in an FPGA, but just don't have the ken or time right now. The Inmos Transputer does come close though - its architecture is very similar.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Fri Apr 28, 2023 9:14 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Hugh Aguilar wrote:
I want to emulate the W65c816 on an FPGA processor, but the purpose of this exercise would be to run C code, especially Free-RTOS.


Just to return to this point: you want to design and build a simple CPU on FPGA, in order for it to run an emulation of a processor which has a C compiler. You're thinking of the '816 as the intermediate, emulated, CPU, possibly in a subsetted form. RISC-V has been mentioned as an alternative intermediate but you're not especially keen on it. You'd like the intermediate emulated CPU to have a C compiler already - it doesn't sound like you plan to write your own C compiler or port an existing one.

I'm inclined to agree with Gordon that choosing the 816 as the emulated intermediate is not the way I'd go, but of course it's your project.

There are a few (somewhat) relevant threads on this forum and over on anycpu, which I just wanted to signpost - it might or might not be helpful to the way you'd like to proceed!
The 65816 as the basis for a virtual 16 bit CPU ("65V16")
Announce: Acheron VM "Virtual 16-bit CPU for 6502 computers"
rj16 - a homebrew 16-bit cpu "there's a LCC machine description file, so it has a C compiler"
Building compiler and porting an OS to a new CPU "four months of building a home-built CPU of a home-built RISC ISA, building a home-built C toolchain, and porting Xv6, a Unix-like OS, to that CPU"
xr16 - a tiny RISC, with CPU design tutorial and SoC design by Jan Gray, CPU specifically constructed for ease of writing a C compiler
Big list of CPUs suitable for bootstrapping includes a link to the stage0 project with the Knight architecture

Also notable is Dmitry Grinberg's efforts, where he implements an ARM emulator on a microcontroller in order to run Linux.
Linux on an 8-bit micro? ARM emulated on ATMega
Cortex-M0 emulator on ATTiny85
and subsequently a MIPS emulator for the same reason
My business card runs Linux (and Ultrix), yours can too MIPS emulator on ATSAMD21


Posted: Fri Apr 28, 2023 9:36 am

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
BigEd wrote:
Hugh Aguilar wrote:
I want to emulate the W65c816 on an FPGA processor, but the purpose of this exercise would be to run C code, especially Free-RTOS.


Just to return to this point: you want to design and build a simple CPU on FPGA, in order for it to run an emulation of a processor which has a C compiler. You're thinking of the '816 as the intermediate, emulated, CPU, possibly in a subsetted form. RISC-V has been mentioned as an alternative intermediate but you're not especially keen on it. You'd like the intermediate emulated CPU to have a C compiler already - it doesn't sound like you plan to write your own C compiler or port an existing one.

I'm inclined to agree with Gordon that choosing the 816 as the emulated intermediate is not the way I'd go, but of course it's your project.


Mulling this over - re. emulating the '816 - I feel it's more because you can still buy real new stock 65C816 CPUs. I'm struggling to think why I'd want to emulate one, but some thoughts do come to mind...

e.g. A single "SoC" with CPU, RAM (say 512KB), Flash, VIA (or VIA-like device) and a UART would be very attractive to me - only for the reasons of being able to run my own system in a smaller footprint, possibly faster (but I'm really of the view that if you want speed, then a 1985 CPU is not the way to do it).

I know from my own brief experiments with FPGAs that a RISC-V CPU, 2MB of RAM, VGA/HDMI video and lots of other "stuff" is achievable on a sub $20 FPGA.

(The Tang Nano is the one I've been experimenting with: https://wiki.sipeed.com/hardware/en/tan ... no-9K.html it was $15 from Aliexpress - Mouser UK has the iCE40HX8K FPGA in a BGA package for £15.00 which would be more than the Tang Nano 9K board when you take postage into account - although I appreciate that not everyone wants to order directly from China via Aliexpress)

Replace that RISC-V core with an 816 core and life ought to be good if it weren't for the fact that RISC-V (or ARM, or another RISC type CPU) would be more efficient and easier to write code for.

But I am interested if you do go down the route of developing your bytecode engine - because one day I'd like to better understand how to do it in an FPGA with that ultimate goal of creating hardware to directly run the BCPL bytecode.

It's reminding me of the Transmeta Crusoe and the Gigatron project...

https://en.wikipedia.org/wiki/Transmeta_Crusoe
https://en.wikipedia.org/wiki/Gigatron_TTL



-Gordon



Posted: Sat Apr 29, 2023 1:12 am

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
drogon wrote:
So your CPU sounds interesting - but would I then get it to emulate a 65816? No. I'd work on making it emulate the BCPL bytecode directly - cut out the middleman as it were... My RISC-V implementation doesn't use any RAM for the state of the VM/Bytecode - it's all held in registers. This makes it blindingly fast, clock for clock compared to my '816 implementation. The instruction dispatch is 6 cycles in RV land compared to 27 in '816 land.

As I said, the whole purpose of doing emulation in software, is diversity. There can be any number of byte-code VM systems implemented, so everybody gets an opportunity to be creative by designing their own byte-code VM and writing their own compiler. I don't know anything about BCPL, but if you like it, then good for you! Your 6 cycle instruction dispatch on the RISC-V isn't all that impressive. I'm at 5 clock cycles, and that includes getting one 8-bit operand into internal memory.

If you want to be "blindingly fast," the trick is to map some of your memory to internal memory, and have your data-stack in that memory, because most memory-access is to the data-stack. You can also have static variables in that memory (in zero-page). You get about 1KW of internal memory. Accessing memory in external SRAM is slow, so this should be avoided, but of course that is where your heap and your arrays are, so this can't be entirely avoided unless you limit yourself to very small programs.

You can write your byte-code VM on the RISC-V now, or you can emulate it on a desktop computer, then port it over to my processor later (I'm not ready yet for outside contributors).
Here are some tips for how to design your byte-code VM:
  • The processor is word-addressed. You can't easily access individual bytes of data. One reason I liked the W65c816 rather than the MC6809 etc. is that the W65c816 registers are all 16-bit, so I don't have to deal with accessing individual bytes.
  • You get 64KW (not 64KB) of external memory for code and 64KW for data (more banks than that for data with bank-switching, but don't worry about that for now). 64KW for code is a lot, so don't worry about code bloat.
  • The processor is 16-bit, so it grabs a 16-bit word. You can have an 8-bit operand in the high-byte and the byte-code in the low byte.
  • Don't try to pack two byte-codes together into one 16-bit word. This is more complicated than you might suppose! If you don't have an 8-bit operand, just leave the high byte zero and ignore it. If you have a 16-bit operand, you put it in the next word, but you still leave the high byte of your byte-code zero and ignore it. Because the high byte is usually zero and is ignored, your code will seem bloated as compared to an 8-bit processor such as the W65c816 in which everything was packed together. Don't worry about bloat.
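
As a rough illustration of the word layout in the tips above, here is a minimal Python sketch. The byte-code numbers (OP_NOP, OP_LIT8, OP_LIT16) and the helper names are invented for this example and are not part of any actual design.

```python
def encode(opcode, operand8=0):
    """Pack one byte-code into a 16-bit word: operand in the high byte, byte-code in the low byte."""
    if not 0 <= opcode <= 0xFF or not 0 <= operand8 <= 0xFF:
        raise ValueError("opcode and operand8 must each fit in 8 bits")
    return (operand8 << 8) | opcode

def decode(word):
    """Split a 16-bit code word into (opcode, operand8)."""
    return word & 0xFF, (word >> 8) & 0xFF

# Hypothetical byte-codes, made up for illustration only.
OP_NOP, OP_LIT8, OP_LIT16 = 0x00, 0x01, 0x02

program = [
    encode(OP_NOP),         # no operand: high byte left zero and ignored
    encode(OP_LIT8, 0x2A),  # 8-bit operand rides in the high byte
    encode(OP_LIT16),       # high byte still zero; 16-bit operand is the next word
    0x1234,
]

print([hex(w) for w in program])  # prints: ['0x0', '0x2a01', '0x2', '0x1234']
```

Every byte-code starts on a word boundary, so the fetch never has to pick apart two byte-codes sharing one word. The price is the unused high byte in most instructions.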

So, go ahead with your BCPL compiler. If you keep the above tips in mind, you should be able to port it over to my processor easily.
BTW: My processor is 16-bit. I don't really have much support for 32-bit data, so if your BCPL assumes 32-bit data, that might be a problem. I read about BCPL and it has one data type, which is the word. Make that 16-bit for efficiency.

I'll likely write a W65c816 byte-code VM eventually, but if anybody wants to take a stab at that, go ahead. Keep the above tips in mind. You can't use a legacy W65c816 assembler because it packs the code, but it should be easy enough to modify a legacy W65c816 assembler to insert zeros into the high byte of the byte-code. It would also be a good idea to make it word-addressed rather than byte-addressed so you get the full 64KW. It should be possible to write the assembler in such a way that it will accept legacy W65c816 programs and assemble them to do the same thing that they did on the W65c816 (but the machine-code will be more bloated).

What is the point of your BCPL compiler? This seems like a bad choice for a micro-controller --- isn't this a language developed for desktop computers? --- all that I know about it is what I read on Wikipedia, so I don't know much.
My processor is intended to be used as a micro-controller for machines out in the field (or, at least, the factory floor).


Posted: Sat Apr 29, 2023 2:09 am

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
Hugh Aguilar wrote:
I'll likely write a W65c816 byte-code VM eventually, but if anybody wants to take a stab at that, go ahead. Keep the above tips in mind. You can't use a legacy W65c816 assembler because it packs the code, but it should be easy enough to modify a legacy W65c816 assembler to insert zeros into the high byte of the byte-code. It would also be a good idea to make it word-addressed rather than byte-addressed so you get the full 64KW. It should be possible to write the assembler in such a way that it will accept legacy W65c816 programs and assemble them to do the same thing that they did on the W65c816 (but the machine-code will be more bloated).

To make the assembler accept legacy W65c816 programs, you are not going to have word-addressing. The legacy code is full of code that adds 2 to an address to get to the next word in memory, and this isn't going to work. You are going to have to have byte-addressing, so you only get 64KB rather than 64KW, but you will never actually access individual bytes because your registers are 16-bit. If you do make it word-addressed, you are going to need Asok the Intern to go through the legacy program and change it so that it adds 1 rather than 2 to an address to get to the next word in memory --- this is possible.
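
The stride difference can be sketched in Python. This is only a model of the two memory layouts, with made-up function names; it walks the same three 16-bit values once through byte-addressed memory (step 2) and once through word-addressed memory (step 1).

```python
def sum_words_byte_addressed(mem, base, count):
    """mem is a list of bytes holding little-endian 16-bit words; the stride is 2."""
    total, addr = 0, base
    for _ in range(count):
        total += mem[addr] | (mem[addr + 1] << 8)
        addr += 2  # what legacy 65816 code does to reach the next word
    return total & 0xFFFF

def sum_words_word_addressed(mem, base, count):
    """mem is a list of 16-bit words; the stride is 1."""
    total, addr = 0, base
    for _ in range(count):
        total += mem[addr]
        addr += 1  # the change the legacy source would need
    return total & 0xFFFF

words = [0x0102, 0x0304, 0x0506]

# Build the byte-addressed image of the same data, low byte first.
byte_mem = []
for w in words:
    byte_mem += [w & 0xFF, w >> 8]

print(hex(sum_words_byte_addressed(byte_mem, 0, 3)))  # prints: 0x90c
print(hex(sum_words_word_addressed(words, 0, 3)))     # prints: 0x90c
```

Running the step-by-2 loop against the word-addressed list would skip every other element, which is why unmodified legacy code cannot simply be reassembled for a word-addressed target.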

The W65c816 is a rather funky design! I think that Apple required them to run legacy 65c02 machine-code unchanged and this is the reason why the design is so funky. If you are going to have 16-bit registers, it would make more sense to have the processor word-addressed rather than byte-addressed so you get 64KW in a bank. Also, this would allow an easy upgrade to a future version of the processor with a 16-bit data-bus --- this would have made the W65c816 competitive with the i8086 and MC68000 that were the mainstream 16-bit processors of that era.

WDC was looking too much to the past, wanting to support 65c02 programs, especially from the Apple-IIc. This was foolishness because the Apple-II line was already dead. They should have been looking forward to the future in which they had to compete against MS-DOS. Of course, if Apple was paying for the W65c816 development, then Apple gets to call the tune.

Of course, the MC68000 was another funky design. It was byte-addressed, but all data had to be even-aligned by the compiler. Wouldn't it have been a lot simpler to make it word-addressed? The MC68008 was a bizarre processor, with 32-bit registers and an 8-bit data-bus. I never programmed the MC68000, but from what I've read about it I think that the i8086 was a better design.


Posted: Sun Apr 30, 2023 1:02 am

Joined: Tue Nov 10, 2015 5:46 am
Posts: 230
Location: Kent, UK
Hugh Aguilar wrote:
You would hate my design! :|
Every opcode is 16-bits. There are no operands except that some instructions have a 9-bit literal embedded in the opcode. Every instruction takes one clock cycle. The goal is to have a 100 MHz clock. Making this a "joy" for assembly-language programmers is not a goal --- it is not orthogonal at all; all of the instructions are for specific registers and most of them have side-effects --- it will be difficult to program in.
Designing a programmer-hostile instruction set is nothing to be proud of. It all but guarantees that, even if you complete all the goals of your project, nobody will care.

That you don't understand how instruction sets like MIPS, RISC-V, 68000, and ARM can be joyful to program for suggests a lack of practical experience in writing large amounts of assembly code for a wide range of CPUs.
Quote:
Of course, the MC68000 was another funky design. It was byte-addressed, but all data had to be even-aligned by the compiler. Wouldn't it have been a lot simpler to make it word-addressed?
Instructions with a .b suffix can perfectly well use 8-bit, byte-aligned data. 16- and 32-bit data does have to be aligned, and on many CPUs, so-called "natural" alignment is required. Some CPUs will permit unaligned accesses. Some have special instructions for this. Some will take an exception. For architectures that require it, it's not the burden on the compiler that you appear to think it is.
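
To show how little work alignment is for a compiler, here is a minimal Python sketch of natural-alignment layout. It models the general rule (each item placed at a multiple of its own size), not the exact rule of any one CPU, and the function names are made up for illustration.

```python
def align_up(address, size):
    """Round address up to the next multiple of size."""
    return (address + size - 1) // size * size

def layout(field_sizes):
    """Assign each field the lowest naturally aligned offset after the previous field."""
    offsets, offset = [], 0
    for size in field_sizes:
        offset = align_up(offset, size)
        offsets.append(offset)
        offset += size
    return offsets

# A byte, a 16-bit word, a 32-bit long, then another byte.
print(layout([1, 2, 4, 1]))  # prints: [0, 2, 4, 8]
```

The whole job is one round-up per field, at the cost of an occasional padding byte (offset 1 is skipped here).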
Quote:
The MC68008 was a bizarre processor, with 32-bit registers and an 8-bit data-bus.
Not a popular choice, but an example of providing a powerful instruction set without requiring a large overall system design.
Quote:
I never programmed the MC68000, [...]
Clearly.

If I may, you seem to have a very 1970s mindset where CPUs are concerned. Your "RISC-V is a toy" comment is quite ignorant. Technology companies around the world are building their own RISC-V cores for their own products, building and selling RISC-V IP for others, supporting open source software, compilers, languages. Companies like AMD and NVIDIA aren't doing this because they've bought into hype. RISC-V cores are uplifting from smaller cores, and/or replacing more expensively licensed cores, while providing 32-bit/64-bit ISAs with first-class compiler support, and first-class open source OS support.
Quote:
The #1 priority in micro-controllers is low interrupt latency.

Says who? I've used embedded CPU cores throughout my career, and interrupt latency has never come up as a serious design point.

It's all application-dependent. What are the events and the deadlines? How is latency measured? What is the speed of the CPU? What is the speed of the memory system? What is the totality of the work/calculation that needs to be done before the deadline?

What if I have a CPU core that has a very short interrupt-to-first instruction latency, but the instruction set is so weak it takes 10x the instructions to do the work that needs to be done?

Your comment in isolation is meaningless. It's like you're repeating a line you read in a book.

For someone with clearly strong opinions on both system design and language design, I think it would be worth your time to study, and practically experiment with, more modern CPUs. I feel you're somewhat hampered by a mindset stuck in the past.

That said, this is a somewhat "stuck in the past" site, and we all have a soft spot for the 6502... but there's a difference between having a fondness for a CPU you used in your youth, and thinking that it was the peak of computing greatness and all advances since then have been solving the wrong problems.

Don't let any of this dissuade you from proceeding with your plans: To build a core that runs at 100MHz, with a hostile instruction set that you can use to emulate a 65C816... A CPU that very few people have heard of, let alone care about. It seems absurdly pointless, but sometimes fun endeavors are pointless to everyone except the person enjoying them.

Have fun with your project. That's the most important thing. And post updates!


Posted: Sun Apr 30, 2023 4:30 pm

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Hugh Aguilar wrote:
drogon wrote:
So your CPU sounds interesting - but would I then get it to emulate a 65816? No. I'd work on making it emulate the BCPL bytecode directly - cut out the middleman as it were... My RISC-V implementation doesn't use any RAM for the state of the VM/Bytecode - it's all held in registers. This makes it blindingly fast, clock for clock compared to my '816 implementation. The instruction dispatch is 6 cycles in RV land compared to 27 in '816 land.

As I said, the whole purpose of doing emulation in software, is diversity. There can be any number of byte-code VM systems implemented, so everybody gets an opportunity to be creative by designing their own byte-code VM and writing their own compiler. I don't know anything about BCPL, but if you like it, then good for you! Your 6 cycle instruction dispatch on the RISC-V isn't all that impressive. I'm at 5 clock cycles, and that includes getting one 8-bit operand into internal memory.

If you want to be "blindingly fast," the trick is to map some of your memory to internal memory, and have your data-stack in that memory, because most memory-access is to the data-stack. You can also have static variables in that memory (in zero-page). You get about 1KW of internal memory. Accessing memory in external SRAM is slow, so this should be avoided, but of course that is where your heap and your arrays are, so this can't be entirely avoided unless you limit yourself to very small programs.

You can write your byte-code VM on the RISC-V now, or you can emulate it on a desktop computer, then port it over to my processor later (I'm not ready yet for outside contributors).
Here are some tips for how to design your byte-code VM:
  • The processor is word-addressed. You can't easily access individual bytes of data. One reason I liked the W65c816 rather than the MC6809 etc. is that the W65c816 registers are all 16-bit, so I don't have to deal with accessing individual bytes.
  • You get 64KW (not 64KB) of external memory for code and 64KW for data (more banks than that for data with bank-switching, but don't worry about that for now). 64KW for code is a lot, so don't worry about code bloat.
  • The processor is 16-bit, so it grabs a 16-bit word. You can have an 8-bit operand in the high-byte and the byte-code in the low byte.
  • Don't try to pack two byte-codes together into one 16-bit word. This is more complicated than you might suppose! If you don't have an 8-bit operand, just leave the high byte zero and ignore it. If you have a 16-bit operand, you put it in the next word, but you still leave the high byte of your byte-code zero and ignore it. Because the high byte is usually zero and is ignored, your code will seem bloated as compared to an 8-bit processor such as the W65c816 in which everything was packed together. Don't worry about bloat.

So, go ahead with your BCPL compiler. If you keep the above tips in mind, you should be able to port it over to my processor easily.
BTW: My processor is 16-bit. I don't really have much support for 32-bit data, so if your BCPL assumes 32-bit data, that might be a problem. I read about BCPL and it has one data type, which is the word. Make that 16-bit for efficiency.

I'll likely write a W65c816 byte-code VM eventually, but if anybody wants to take a stab at that, go ahead. Keep the above tips in mind. You can't use a legacy W65c816 assembler because it packs the code, but it should be easy enough to modify a legacy W65c816 assembler to insert zeros into the high byte of the byte-code. It would also be a good idea to make it word-addressed rather than byte-addressed so you get the full 64KW. It should be possible to write the assembler in such a way that it will accept legacy W65c816 programs and assemble them to do the same thing that they did on the W65c816 (but the machine-code will be more bloated).

What is the point of your BCPL compiler? This seems like a bad choice for a micro-controller --- isn't this a language developed for desktop computers? --- all that I know about it is what I read on Wikipedia, so I don't know much.
My processor is intended to be used as a micro-controller for machines out in the field (or, at least, the factory floor).


It's a lot to think about, but here are some answers to the questions you pose ... Possibly not in any good order, but I'll start with BCPL.

BCPL is a high level "algol-like" language that was designed round about 1966. It's very well established, but also almost completely moribund; however, the original compiler is still being developed by the original creator and he released a new version just last year. It can output various forms of code and the form I'm using is one called CINTCODE (Compact INTermediate Code). It's a bytecode and quite CISC in operation. It was highly tuned over a period of time by analysing the output of the compiler compiling itself, making the more common opcodes shorter, etc. It's sometimes said that the BCPL compiler was designed for just one thing - writing a BCPL compiler! However it was used to develop B which was then used to bootstrap early C and the rest, as they say, is history...

So why BCPL for me? It is the only high-level compiled language that today can work in a self-hosting 65xx environment. I can edit, compile and run BCPL programs directly on my 65816 system with nothing more than a serial terminal. The editor is written in BCPL, the compiler in BCPL and my operating system - it's a single-user multi-tasking OS written in ... BCPL.

The bytecode VM/interpreter - it's written in 65816 assembly language. It's some 16,000 bytes of hand-written '816 assembly language, supported by macros.

The bytecode VM requires a machine operating system to provide it with boring stuff like IO, serial, disk, etc. and this is written in about 10KB of hand-written (mostly) 65C02 code, supported by macros.

This exists today. I don't need to write it - I've written it. It works, and it runs. It runs OK, but as it's a 32-bit VM running on a 16 bit CPU with an 8-bit memory interface it's not going to win awards for speed.

So you think BCPL is a bad choice? It's the only choice today for a self-hosting system with a high level language compiler and that was my aim.

(And here I mean other than Basic or Forth systems)

Today there are NO C compilers that I can take and use directly on a "retro-new" 65C816 system.


There are C compilers that were developed in the past and ran on such systems - Aztec C on the Apple II, there is a C compiler for the Apple IIgs and a variant of TinyC for the BBC Micro, but to my knowledge there is no other C compiler that I can run directly on a 65xx system - or not one I could get the source code for to adapt for my own system. Someone please prove me wrong!

Also BCPL - I have used it a lot in the past. Back in the early 80's I developed a lot of code for a distributed manufacturing system in BCPL - it ran on BBC Micros and used networking and a central file store.

So those are my aims and goals: create a retro self-hosting system, and I feel I've succeeded. Now I want more and like we did in the past, when we wanted bigger, better, faster... I am itching for the same. In 1985 when the '816 came out, arguably even then it was too little, too late. Acorn and Apple did make systems with it but they were niche products and they moved on.

However for various reasons you can still buy the W65C816 new today, so emulating one is puzzling to me. I could see the advantage of emulating one in software, and sometimes I wish I'd done that before I embarked on my current project, but hey, ho, I built real hardware based on my existing 65C02 systems and got on with it.

I don't consider the '816 to be a microcontroller either. It's a CPU - a microprocessor. A microcontroller has more stuff on-board, typically flash, RAM and a veritable plethora of IO. Those are typically additional ICs required in a µP system, but all part of the same chip in a µC system.

And I am looking at moving to RISC-V in the same way DEC moved to VAX, Apple moved to the 68K and Acorn moved to ARM - because - bigger, better, faster. Actually, I want sustainability too.

RISC-V is not new. It can trace its origins back to the Berkeley RISC project of the early 1980s, and one of the first commercial applications of those ideas was the Sun SPARC processor. (Which I also wrote a lot of code for - another "joy" CPU to code for).

To get to grips with modern RISC-V, I wrote an emulator for it - in BCPL. It runs at approximately 2000 RV instructions/second - not bad for a 32-bit VM interpreted on a 16-bit CPU with an 8-bit memory interface at 16MHz. It runs well enough to bootstrap my entire BCPL operating system inside itself. I'll do a video of that one day. It's turtles all the way down, as they say.

So that's my system - one goal I have is one day, maybe, being able to have hardware directly execute the CINTCODE bytecode system and that's the reason I'm curious about your system. I would need some 512KB of RAM though - the compiler has become somewhat bloated over the years and now needs nearly 50KB of RAM to load and over 200KB of RAM for data. Such is the sign of the times.

Based on writing a bytecode VM in '816 assembler, I have some issues with some of your ideas though.

One is that you seem to be a little naive about the concept of the byte - suggesting that loading a 16-bit word is more efficient - maybe. In some cases yes, but let's look at your initial target - the W65C816. It may well be considered a bytecode in that each opcode is just one byte long, but the operands - they vary from zero to 3 bytes. 0 bytes: NOP, TXA and so on. 1 byte: LDA #$42 (in 8-bit memory size). 2 bytes: LDA #$0042 (in 16-bit memory size). 3 bytes: LDA $123456 (absolute long) ... So while doing a 16-bit read might seem good, it's not always going to be optimal and you can never guarantee that '816 instructions (or any other bytecode) will be aligned.
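To make the alignment point concrete, here is a minimal sketch in Python (not something from my system) that walks a short run of '816 machine code and reports where each opcode lands. The table covers only the five opcodes used in the example, and the names OPERAND_BYTES, instruction_length and opcode_offsets are made up for illustration.

```python
# Operand sizes in bytes for a handful of 65816 opcodes.
# "m" marks an immediate operand whose size follows the accumulator width.
OPERAND_BYTES = {
    0xEA: 0,    # NOP
    0x8A: 0,    # TXA
    0xA9: "m",  # LDA #imm (1 byte with 8-bit A, 2 bytes with 16-bit A)
    0xAD: 2,    # LDA abs
    0xAF: 3,    # LDA long (24-bit address)
}

def instruction_length(opcode, acc_is_8bit):
    """Total length of one instruction: the opcode byte plus its operand."""
    size = OPERAND_BYTES[opcode]
    if size == "m":
        size = 1 if acc_is_8bit else 2
    return 1 + size

def opcode_offsets(code, acc_is_8bit):
    """Return the byte offset of every opcode in a run of machine code."""
    offsets = []
    i = 0
    while i < len(code):
        offsets.append(i)
        i += instruction_length(code[i], acc_is_8bit)
    return offsets

# NOP ; LDA #$0042 ; LDA $123456 ; TXA  (assembled for a 16-bit accumulator)
code = bytes([0xEA, 0xA9, 0x42, 0x00, 0xAF, 0x56, 0x34, 0x12, 0x8A])
offsets = opcode_offsets(code, acc_is_8bit=False)
odd = [o for o in offsets if o % 2]
```

Here the second opcode sits at an odd offset, so a fetch unit that only reads aligned 16-bit words has to pick it out of the high half of a word.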

(Unless you re-write the assembler)

The CINTCODE bytecode is similar - one byte opcodes (255 of them) and variable byte operands from 0 to many. 0 byte examples are: load small constant (-1 <= c <= 10); add register A to register B, leave result in register A; fetch value from stack position X (X < 15). 1 byte operand: load byte constant, load value from stack offset, call procedure with byte offset, etc. 2 and 4 byte operands are for larger data - load halfword (16-bits), load word (32-bits) and so on. Switch instructions are special in that they have a balanced binary tree of values/jumps (fast, longer lists) or just a simple list of values and jumps (if/then/else style - smaller lists - the compiler works out which is best).
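The two switch styles can be sketched roughly like this in Python. This is only an illustration of the trade-off, not the real CINTCODE table layout: the function names and the keys/targets representation are assumptions, and the binary search over a sorted table stands in for the balanced tree.

```python
from bisect import bisect_left

def switch_linear(cases, value, default):
    """Simple list of (value, target) pairs, tested in order like if/then/else."""
    for case_value, target in cases:
        if case_value == value:
            return target
    return default

def switch_binary(keys, targets, value, default):
    """Sorted keys searched by bisection: fewer comparisons for long lists."""
    i = bisect_left(keys, value)
    if i < len(keys) and keys[i] == value:
        return targets[i]
    return default

keys = [1, 5, 9]              # case values, kept sorted
targets = [100, 200, 300]     # hypothetical jump targets
cases = list(zip(keys, targets))
```

For a handful of cases the linear form is smaller and just as quick; the searched form wins as the list grows.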

So being able to efficiently pick a byte (opcode) out from any byte address in RAM with data (operand) in any byte aligned address in RAM is crucial for a good bytecode engine.
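As a rough sketch of what that means for the fetch loop, here is a toy bytecode engine in Python. The opcode numbers and names (OP_LB, OP_LH, OP_LW and so on) are invented for this example and are not real CINTCODE values; the point is only that opcodes and operands are read from whatever byte address the program counter happens to hold.

```python
# Hypothetical opcodes for illustration only; NOT real CINTCODE values.
OP_HALT, OP_LB, OP_LH, OP_LW, OP_ADD = range(5)

# Operand size in bytes for each opcode.
OPERAND_SIZE = {OP_HALT: 0, OP_LB: 1, OP_LH: 2, OP_LW: 4, OP_ADD: 0}

def run(mem, pc):
    """Execute from byte address pc; return register A at HALT."""
    a = b = 0
    while True:
        op = mem[pc]                       # opcode: one byte, any address
        pc += 1
        n = OPERAND_SIZE[op]
        operand = int.from_bytes(mem[pc:pc + n], "little")  # may be unaligned
        pc += n
        if op == OP_HALT:
            return a
        if op == OP_ADD:
            a = (a + b) & 0xFFFFFFFF       # keep A within 32 bits
        else:
            b, a = a, operand              # loads push the old A into B

program = bytes([
    0xFF,                                  # padding so the code starts at address 1
    OP_LB, 5,
    OP_LH, 0x34, 0x12,                     # 0x1234, little-endian
    OP_ADD,
    OP_LW, 0x00, 0x00, 0x01, 0x00,         # 0x00010000, little-endian
    OP_ADD,
    OP_HALT,
])
result = run(program, 1)
```

Every multi-byte operand in that program starts at a different alignment, and nothing in the loop cares.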

On memory size: I need more than 128KB of RAM. Why does the '816 exist? Because it breaks the 64KB limitation of the 6502 - the promise was MB of RAM. A 24-bit address bus is up to 16MB of RAM. Of course in my system I use almost all the first 64K for the machine OS, the VM interpreter, stacks and the BCPL global vectors. Some of this might be saved should the actual VM engine be in microcode of some sort.

Quote:
I'll likely write a W65c816 byte-code VM eventually, but if anybody wants to take a stab at that, go ahead. Keep the above tips in mind. You can't use a legacy W65c816 assembler because it packs the code, but it should be easy enough to modify a legacy W65c816 assembler to insert zeros into the high byte of the byte-code. It would also be a good idea to make it word-addressed rather than byte-addressed so you get the full 64KW. It should be possible to write the assembler in such a way that it will accept legacy W65c816 programs and assemble them to do the same thing that they did on the W65c816 (but the machine-code will be more bloated).


I'm confused by this. You started off about 65816 C compilers. To achieve the goal above, YOU will need to modify the assembler to produce code that's not quite 65816. It would be 65816 code where every opcode is 16-bit word aligned. You are giving yourself a lot of work to do. No-one else will do this for you. I won't because I have real working 65816 CPUs and tools that already work.

And you mention the '816 registers being 16-bits and that your chosen emulation would not support 8-bit register sizes... Well this is both good and bad. For writing a bytecode interpreter in '816 code it is bad - there is no way to directly load an 8-bit value from RAM into a 16-bit register and zero the top 8-bits. You can dance round it by using an index register, dropping to 8-bit register size and so on - this adds cycles. In my cintcode VM, I keep the memory in 16-bit size, load the byte (2 memory accesses as it loads 2 x 8-bit values), mask the top byte (another 3 cycles), then I can use the byte to index and jump. It wastes 4 cycles. Fortunately in RV land it can load a byte into a 32-bit register and zero the top 24-bits directly.
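The load-then-mask dance looks roughly like this when modelled in Python. It is a sketch of the idea rather than my actual VM code; the function names are made up, and memory is just a bytes object.

```python
def load16(mem, addr):
    """16-bit little-endian load, as LDA does with a 16-bit accumulator."""
    return mem[addr] | (mem[addr + 1] << 8)

def fetch_opcode_816_style(mem, pc):
    """Read two bytes, then throw away the one that was not wanted."""
    word = load16(mem, pc)    # two 8-bit bus accesses
    return word & 0x00FF      # the extra masking step (AND #$00FF)

def fetch_opcode_lbu_style(mem, pc):
    """A zero-extending byte load: one access, no masking step."""
    return mem[pc]

mem = bytes([0x12, 0xAB, 0xCD, 0x00])
opcode = fetch_opcode_816_style(mem, 1)
```

Both functions return the same value; the first simply does more work to get there, which is the cost paid on every single bytecode fetch.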

Maybe also have a look at the history of the early Prime minicomputers - they were originally 16-bit and lacked byte addressing. It was a long time before they got a C compiler and eventually Unix, but like many, too little too late by then.

And don't worry about BCPL being 32-bit. It can be 16, 32 or 64, but the ability to do multi-byte arithmetic is what enables us to handle larger data values than the underlying hardware supports. BBC Basic has 32-bit integers on the 8-bit 6502, so dealing with 32-bit values on the 16-bit '816 is trivial.
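For anyone who hasn't done it, a 32-bit add on a 16-bit machine is just two 16-bit adds with the carry passed between them (CLC, ADC, ADC on the '816). A minimal Python model, assuming both inputs are unsigned values in the range 0 to 2^32 - 1:

```python
def add32_via_16(x, y):
    """Add two 32-bit values using only 16-bit pieces plus a carry."""
    lo = (x & 0xFFFF) + (y & 0xFFFF)        # low halves first
    carry = lo >> 16                        # 0 or 1, like the carry flag
    hi = (x >> 16) + (y >> 16) + carry      # high halves plus carry in
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)   # wraps at 32 bits

total = add32_via_16(0x0001FFFF, 1)
```

The same pattern with four 8-bit adds is how a 6502 handles 32-bit integers.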

But this is your project - your ideas - your goals. I do hope they work for you - I am concerned that it's not a good way to do stuff, but those are my concerns.

Hope it works and please do keep us posted.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Sun Apr 30, 2023 5:08 pm 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 230
Location: Kent, UK
drogon wrote:
To get to grips with modern RISC-V, I wrote an emulator for it - in BCPL. It runs at approximately 2000 RV instructions/second - not bad for a 32-bit VM interpreted on a 16-bit CPU with an 8-bit memory interface at 16MHz. It runs well enough to bootstrap my entire BCPL operating system inside itself. I'll do a video of that one day. It's turtles all the way down, as they say.
That's mental. I love it! The epitome of doing rather than talking.


PostPosted: Sun Apr 30, 2023 5:21 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I'd watch that video! I'd like and subscribe!


PostPosted: Mon May 01, 2023 12:18 am 

Joined: Thu Mar 03, 2011 5:56 pm
Posts: 284
drogon wrote:

There are C compilers that were developed in the past and ran on such systems - Aztec C on the Apple II, there is a C compiler for the Apple IIgs and a variant of TinyC for the BBC Micro, but to my knowledge there is no other C compiler that I can run directly on a 65xx system - or not one I could get the source code for to adapt for my own system. Someone please prove me wrong!


The source code for Orca/C is available: https://github.com/byteworksinc/ORCA-C, although this is written in Orca/Pascal and assembler (probably Orca/M). See also https://juiced.gs/store/opus-ii-software/.

Not sure if this can be ported to another platform with a reasonable(*) amount of effort, though...

(*) For a reasonable value of "reasonable".


PostPosted: Mon May 01, 2023 8:58 am 

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
rwiker wrote:
drogon wrote:

There are C compilers that were developed in the past and ran on such systems - Aztec C on the Apple II, there is a C compiler for the Apple IIgs and a variant of TinyC for the BBC Micro, but to my knowledge there is no other C compiler that I can run directly on a 65xx system - or not one I could get the source code for to adapt for my own system. Someone please prove me wrong!


The source code for Orca/C is available: https://github.com/byteworksinc/ORCA-C, although this is written in Orca/Pascal and assembler (probably Orca/M). See also https://juiced.gs/store/opus-ii-software/.

Not sure if this can be ported to another platform with a reasonable(*) amount of effort, though...

(*) For a reasonable value of "reasonable".


Yes - I have seen that, but first get an Apple IIgs, then get Pascal going then get C going... At that point "reasonable" starts to be somewhat far away. It also relies on a lot of Apple IIgs internals, etc. Probably not impossible, but effort vs. time, etc.

Cheers,

-Gordon



PostPosted: Mon May 01, 2023 12:29 pm 

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
hmm, a native 65816 compiler would be sick.
I do know that a 65816 target for vbcc is being worked on (though i don't know how far along that project is).
And lcc already has a 65816 target, and i was recently able to build it on windows (which was a pain, moving from MSVC to mingw64, and fixing missing commands in the makefile).
so in theory, you would need a few custom "std" libraries (stdio, stdlib, string, etc) for your 65816 System, and then you could try to compile one of those compilers with itself.

the resulting binary would likely take up a huge chunk of memory and be pretty slow... but it would work. you could then go back and start replacing some functions with smaller assembly equivalents to try and reduce the size.

but ideally you'd want a compiler specifically written for the 65816 so it would be as small as possible from the start.
I don't know how large that Apple IIgs C compiler is, but i assume it's not that much.
there also seem to be at least a few utilities online that claim to be able to convert Pascal code into C code. so using one of those, it might be feasible to port the whole thing to C, at which point you can then use a compiler like Calypsi C or lcc to make it native for the 65816 (also maybe compile it for Windows/Linux to add yet another open source entry to the very small list of 65816 C compilers).


PostPosted: Mon May 01, 2023 7:19 pm 

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Proxy wrote:
hmm, a native 65816 compiler would be sick.
I do know that a 65816 target for vbcc is being worked on (though i don't know how far along that project is).
And lcc already has a 65816 target, and i was recently able to build it on windows (which was a pain, moving from MSVC to mingw64, and fixing missing commands in the makefile).
so in theory, you would need a few custom "std" libraries (stdio, stdlib, string, etc) for your 65816 System, and then you could try to compile one of those compilers with itself.


Yes, there are existing compilers but they're cross compilers, requiring a Win/Mac/Linux system to run. However that may not be an issue for some (even the majority?). Today we're moving to a system where we simply drag & drop compiled code into what's essentially a file system (USB bulk storage) and it 'magically' sends that code to the target where it's run - no programmer, no OS, no libraries required as such, as each time you send a whole image for it to run. See e.g. the BBC Micro:Bit for an example, Pi Pico, ESP/Arduino devices and so on. There is even right now someone who's promised to make a 6502 system like this - hosted by a Pi Pico acting as its RAM/ROM/IO/Clock, etc. Just add a 6502 and off you go.

Oddly, the standard library is probably the least of your worries - you really don't need much to make a C system run - I'm able to cross-compile trivial C programs for my RISC-V emulator and run them on the '816 that way... and there is a thing called "newlib" which is a complete standard C library replacement - I'm using it with my stand alone/baremetal OS on the Raspberry Pi right now...

Quote:
the resulting binary would likely take up a huge chunk of memory and be pretty slow... but it would work. you could then go back and start replacing some functions with smaller assembly equivalents to try and reduce the size.

but ideally you'd want a compiler specifically written for the 65816 so it would be as small as possible from the start.


I only have experience of Aztec C on the Apple II - it was very usable and seemed to generate OK-ish code, but the edit/compile/test & run cycle was very slow (One reason we moved to BCPL on BBC Micros in this place!)

And of course, you need some sort of underlying OS to host it all too. ...

Quote:
I don't know how large that Apple IIgs C compiler is, but i assume it's not that much.
there also seem to be at least a few utilities online that claim to be able to convert Pascal code into C code. so using one of those, it might be feasible to port the whole thing to C, at which point you can then use a compiler like Calypsi C or lcc to make it native for the 65816 (also maybe compile it for Windows/Linux to add yet another open source entry to the very small list of 65816 C compilers).


I have looked at Pascal - and looked at writing a Pascal compiler to target the bytecode in my BCPL system - it would be very compact code, but slow... but then again UCSD Pascal wasn't a speed demon, but it got the job done... I have also looked at retargeting an existing C compiler, again to the BCPL bytecode, but for me it's a time and energy issue and at the end of the day... BCPL is just fine for me - for now. I have my OS, basic utilities and so on.

-Gordon



PostPosted: Mon May 01, 2023 8:28 pm 

Joined: Fri Aug 03, 2018 8:52 am
Posts: 746
Location: Germany
any cross-compiler can become a native compiler when you compile its own source code with itself.
that's what i meant. so for example you would compile lcc for windows/linux, and then use that compiler to compile its own source code with the target being the 65816. which leaves you with a compiler that runs on the 65816 and generates 65816 binaries.

and while you likely don't need a full blown OS, you would need something on the level of DOS... just something that allows you to browse files and run commands/programs (with arguments). probably through a serial terminal.

of course that still takes a lot of effort, but if pulled off you'd have a complete development environment directly on the 65816 (well once you add a text editor to actually write source files)

