
All times are UTC




Post new topic Reply to topic  [ 54 posts ]  Go to page Previous  1, 2, 3, 4
Author Message
PostPosted: Mon May 01, 2023 8:52 pm 

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1407
Location: Scotland
Proxy wrote:
any cross-compiler can become a native compiler when you compile its own source code with itself.
that's what i meant. so for example you would compile lcc for windows/linux, and then use that compiler to compile its own source code with the target being the 65816. which leaves you with a compiler that runs on the 65816 and generates 65816 binaries.


I tried this with cc65. OK, that's a 6502 compiler and not an '816, however... It couldn't compile itself - too large. And even if it could - the binary for x86_64 is half a MB, I imagine the code density is worse on the '816 so getting change out of 1MB may be a challenge... But RAM is cheap as is storage these days.

I fear that a native C compiler would be a little reduced - like TinyC, SmallC, etc. but it may be worthwhile looking... One day!


Quote:
and while you likely don't need a full blown OS, you would need something on the level of DOS... just something that allows you to browse files and run commands/programs (with arguments). probably through a serial terminal.

of course that still takes a lot of effort, but if pulled off you'd have a complete development environment directly on the 65816 (well once you add a text editor to actually write source files)


That's essentially what RubyOS is - although the CLI is written in BCPL - it could be C and I do have a C version of my nano-like editor that runs on the board in 6502 mode - 1500 lines of C compiles to just over 15KB of 6502 code.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


PostPosted: Wed May 03, 2023 5:03 am 

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 142
Location: Texas
Proxy wrote:
the resulting binary would likely take up a huge chunk of memory and be pretty slow... but it would work. you could then go back and start replacing some functions with smaller assembly equivalents to try and reduce the size.


Most modern compilers try to do all their passes in one large go, which wasn't possible in the days of less memory. You'd likely be better off designing a compiler with multiple stages, each of which picks up where the last one left off.

drogon wrote:
Proxy wrote:
any cross-compiler can become a native compiler when you compile its own source code with itself.
that's what i meant. so for example you would compile lcc for windows/linux, and then use that compiler to compile its own source code with the target being the 65816. which leaves you with a compiler that runs on the 65816 and generates 65816 binaries.


I tried this with cc65. OK, that's a 6502 compiler and not an '816, however... It couldn't compile itself - too large. And even if it could - the binary for x86_64 is half a MB, I imagine the code density is worse on the '816 so getting change out of 1MB may be a challenge... But RAM is cheap as is storage these days.


Comparing an x86_64 binary size to that of a 6502 or 65816 isn't a very fair comparison. A single pointer on x64 is 4 times as big, after all, to say nothing of some of the crazy instructions they've added to the x86 platform over the years. (I was looking through it; they have instructions specifically for doing AES ciphers now...... >_< )

drogon wrote:
I fear that a native C compiler would be a little reduced - like TinyC, SmallC, etc. but it may be worthwhile looking... One day!


Maybe, maybe not? It really depends on which revision of C you go with, and how you structure the code. The first publicly released MS compiler fit on 3x 360 KByte floppies, and you can see how MS split the executable up into smaller chunks to make it fit in the RAM available on an 80286 of the time.

This version was compatible with K&R C, but I don't think you'd be hard pressed to get C89 into about the same memory footprint.

Most of the revisions to C add a few basic quality-of-life features to the language (C++-style end-of-line comments and wide character types, for example); after that, the larger part of the changes are to the standard libraries and predefined macros. Nothing terribly complex, though I am just skimming over the changes for each revision as I type this.

Trying to get all that working on a 6502 might be a trick with the 64KB address limit. The 65816 should be able to do it provided it has about the same range of memory you'd typically find on a 286 of the time.

Maybe you could get by with even less by using disk space for scratch/swap space?

That being said, TinyC, SmallC and lcc are mostly monolithic designs, intended to be run on systems with copious amounts of RAM available to them. You'd want to split up the stage1 and stage2 passes in those compilers to get them to fit.

As for the standard headers, that starts getting into the realm of POSIX. If you're designing something like an OS you can pretty much forget that these even exist; you end up having to implement your own versions for your kernel, and even then most Un*x-like OSes only implement what they absolutely need in their kernel and little else. Usually this is split into a special libk, a subset of libc, so it's smaller and can be statically linked into the kernel.

For the toy OS that I've been working on, I've generally found that I need only a very small subset of what these libraries offer to get anything functional. The most elaborate thing I have in my code is my own hacked version of printf(), largely based on the one in FreeBSD. After that I mostly just need some of the string functions, such as memcpy() and friends.

Things like the stdio FILE are nice to have, but aren't strictly required for a functional C program. They're really only useful when you want to port code from one OS to another while avoiding platform-specific functions. If you're writing a compiler specifically for the 65xx that you intend to run on that CPU, then you can work around this with some wrapper/stub functions: one version that calls the FILE functions on all other OSes, and another that just jumps to ROM or whatever function table your 65xx OS/monitor provides.
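The wrapper/stub idea can be sketched in C. Everything below is hypothetical: HOSTED, sys_putc() and rom_putc() are invented names for illustration, not from any real 65xx monitor or library.

```c
#include <stdio.h>

/* Hypothetical I/O shim: one sys_putc()/sys_puts() interface, backed either
   by stdio (hosted builds) or by a ROM/monitor routine (65xx builds). */
#ifdef HOSTED
static int sys_putc(int c) { return fputc(c, stdout); }
#else
/* On the 65xx build this would jump into the monitor's character-out
   routine; here it is stubbed so the sketch stays self-contained. */
static int rom_putc(int c) { return c; }
static int sys_putc(int c) { return rom_putc(c); }
#endif

/* Portable code above this layer only ever sees sys_puts()/sys_putc(). */
static int sys_puts(const char *s) {
    int n = 0;
    while (*s) {
        sys_putc(*s++);
        n++;
    }
    return n;  /* characters written */
}
```

A compiler ported this way would call sys_putc() everywhere it would otherwise use fputc(), so only this one file changes between the hosted and the native build.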


PostPosted: Wed May 03, 2023 9:22 pm 

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
drogon wrote:
BCPL is a high level "algol-like" language that was designed round about 1966. It's very well established, but also almost completely moribund; however, the original compiler is still being developed by the original creator, and he released a new version just last year. It can output various forms of code, and the form I'm using is one called CINTCODE (Compact INTermediate Code). It's a bytecode and quite CISC in operation, highly tuned over a period of time by analysing the output of the compiler compiling itself and making the more common opcodes shorter. It's sometimes said that the BCPL compiler was designed for just one thing - writing a BCPL compiler! However it was used to develop B, which was then used to bootstrap early C, and the rest, as they say, is history...

So why BCPL for me? It is the only high-level compiled language that today can work in a self-hosting 65xx environment. I can edit, compile and run BCPL programs directly on my 65816 system with nothing more than a serial terminal. The editor is written in BCPL, the compiler in BCPL and my operating system - it's a single-user multi-tasking OS written in ... BCPL.
...
So you think BCPL is a bad choice? It's the only choice today for a self-hosting system with a high level language compiler and that was my aim.

I said BCPL seemed like a bad choice. I also said that I didn't know anything about BCPL (Wikipedia is not worth much). I'm not opposed to BCPL.

The whole point of having a processor with built-in support for emulating a byte-code VM is diversity. I want people to develop multiple byte-code VM systems. This is an opportunity to be creative! Design a byte-code VM and write a compiler, all of your own. Or go retro and emulate some existing system (which is what you seem to be doing with CINTCODE).

I was only interested in the W65c816 as a way to get C running on the processor. I could write my Forth for a W65c816 that has been updated with a few instructions to support Forth, but it is still not that good of a Forth target, so I'm better off sticking with my own byte-code VM that is designed specifically for Forth. I mostly just want to get C running because the majority of people demand C.

Somebody (it may have been you) asked what the point would be of emulating the W65c816 when it is still possible to buy W65c816 chips and boards. There are reasons:
  • Speed. A W65c816 chip runs at maybe 12 MHz. If my processor is running at 100 MHz, it should be able to emulate the W65c816 at the speed of a 20 MHz W65c816 chip. This is just conjecture because I'm not there yet, but it seems reasonable. Several of the W65c816 direct-pages will be mapped to the FPGA's internal memory for speed.
  • Versatility. An FPGA can be reconfigured for a variety of I/O for a variety of applications. WDC is not this flexible. I'm hoping to get audio and video comparable to the Commodore-64's SID and VIC-II chips. That would be awesome! I'm also aiming for the least-expensive FPGA chip available. Getting both awesomeness and low-cost might be a lot to ask for, but that is the idea.
drogon wrote:
However for various reasons you can still buy the W65C816 new today, so emulating one is puzzling to me. I could see the advantage of emulating one in software, and sometimes I wish I'd done that before I embarked on my current project, but hey, ho, I built real hardware based on my existing 65C02 systems and got on with it.

I don't consider the '816 to be a microcontroller either. It's a CPU - a microprocessor. A microcontroller has more stuff on-board, typically flash, RAM and a veritable plethora of I/O. Those are typically additional ICs required in a µP system, but all part of the same chip in a µC system.

Okay, here you are talking about the versatility issue that I mentioned above.

drogon wrote:
To get to grips with modern RISC-V, I wrote an emulator for it - in BCPL. It runs at approximately 2000 RV instructions/second - not bad for a 32-bit VM interpreted on a 16-bit CPU with an 8-bit memory interface at 16 MHz. It runs well enough to bootstrap my entire BCPL operating system inside itself. I'll do a video of that one day. It's turtles all the way down, as they say.

I'm working on my assembler/simulator for my processor. I've done this before on the 65c02 on the Apple-II (actually the Laser-128 clone). It did source-level debugging from MS-DOS via an RS-232 cable. I used this to write my symbolic math program (it could do derivatives, but I never got as far as doing integrals). My Forth was derived from ISYS Forth, but I had a cross-compiler running under MS-DOS. I doubt that there was any C development system that would have been capable of a program like this.

drogon wrote:
So that's my system - one goal I have is one day, maybe, being able to have hardware directly execute the CINTCODE bytecode system and that's the reason I'm curious about your system. I would need some 512KB of RAM though - the compiler has become somewhat bloated over the years and now needs nearly 50KB of RAM to load and over 200KB of RAM for data. Such is the sign of the times.

A big part of why I want to support a byte-code VM is that this will be in external memory, so your 1/2 MB requirement would be realistic. Internal memory is faster, but it is also very limited in size.

drogon wrote:
Based on writing a bytecode VM in '816 assembler, I have some issues with some of your ideas though.

One is that you seem to be a little naive about the concept of the byte - suggesting that loading a 16-bit word is more efficient - maybe. In some cases yes, but let's look at your initial target - the w65c816. It may well be considered a bytecode in that each opcode is just one byte long, but the operands vary from zero to 3 bytes. 0 bytes: NOP, TXA and so on. 1 byte: LDA #$42 (in 8-bit memory size). 2 bytes: LDA #$42 (in 16-bit memory size). 3 bytes: LDA [abs24] ... So while doing a 16-bit read might seem good, it's not always going to be optimal, and you can never guarantee that '816 instructions (or any other bytecode) will be aligned.

(Unless you re-write the assembler)

The CINTCODE bytecode is similar - one-byte opcodes (255 of them) and variable-length operands from 0 bytes to many. 0-byte examples: load small constant (-1 <= c <= 10); add register A to register B, leaving the result in register A; fetch value from stack position X (X < 15). 1-byte operand: load byte constant, load value from stack offset, call procedure with byte offset, etc. 2- and 3-byte operands are for larger data - load halfword (16 bits), load word (32 bits) and so on. Switch instructions are special in that they have either a balanced binary tree of values/jumps (fast, for longer lists) or just a simple list of values and jumps (if/then/else style, for smaller lists) - the compiler works out which is best.

So being able to efficiently pick a byte (opcode) out from any byte address in RAM with data (operand) in any byte aligned address in RAM is crucial for a good bytecode engine.

I don't think that I'm naive about the concept of a byte. :roll:
It is just that, designing this thing, I found that it is much easier to make it efficient with the limitations I described.

I could make it work with packed W65c816 code, or your CINTCODE, which is also apparently packed. To do this I would need something similar to the prefetch queue of the i8086: a queue of bytes that have been unpacked to one byte per word. This is possible, but it would require some more support from the FPGA to fill the prefetch-queue and unpack the bytes in the background. It might be worthwhile, but I don't need it for my own byte-code VM design.

As for a SWITCH statement, I already have support for a 256-vector jump-table. That is how the byte-code VM works. I currently only support one jump-table though, so I don't have support for a SWITCH statement. If there is any call for a SWITCH statement, I could provide this. Keep in mind though that a 256-vector jump-table consumes 1/4 KW of internal memory, and there isn't a lot available.

What I already have should support a jump-table in external memory for a SWITCH statement. This might be fast enough. What do you need a SWITCH statement for? All of these design decisions are trade-offs. My plan is to start with a simple system and, if certain applications need features that I don't have, I will provide those features if this can be reasonably done. I don't want to predict ahead of time what features are possibly needed and try to provide them all.
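For readers unfamiliar with the idea, a 256-vector jump-table dispatcher of the kind being discussed can be sketched in a few lines of C. The opcodes, their numbering, and their semantics below are invented for the example; they are not CINTCODE or any real instruction set.

```c
#include <stdint.h>

/* Toy bytecode machine dispatched through a 256-entry table of
   function pointers.  Everything here is illustrative only. */
typedef struct {
    int32_t a, b;        /* two "registers" */
    const uint8_t *pc;   /* byte-aligned instruction pointer */
    int halted;
} VM;

static void op_halt(VM *vm) { vm->halted = 1; }
static void op_ldc (VM *vm) { vm->a = (int8_t)*vm->pc++; } /* 1-byte operand */
static void op_add (VM *vm) { vm->a += vm->b; }            /* 0-byte operand */
static void op_mov (VM *vm) { vm->b = vm->a; }

static void (*dispatch[256])(VM *);  /* the 256-vector jump table */

int32_t run_vm(const uint8_t *code) {
    VM vm = { 0, 0, code, 0 };
    dispatch[0] = op_halt;
    dispatch[1] = op_ldc;
    dispatch[2] = op_add;
    dispatch[3] = op_mov;
    while (!vm.halted)
        dispatch[*vm.pc++](&vm);  /* fetch one opcode byte, jump */
    return vm.a;
}
```

The program {1,5, 3, 1,7, 2, 0} (load 5, move to B, load 7, add, halt) leaves 12 in A. The table lookup and indirect jump in the loop is the part a hardware jump-table would accelerate; the rest is ordinary byte-aligned memory traffic.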

sark02 wrote:
Designing a programmer-hostile instruction set is nothing to be proud of. It all but guarantees that, even if you complete all the goals of your project, nobody will care.
...
That you don't understand how instruction sets like MIPS, RISC-V, 68000, and ARM can be joyful to program for suggests a lack of practical experience in writing large amounts of assembly code for a wide range of CPUs.

Nobody writes large amounts of assembly code any more. For one thing, my FPGA will only provide 6KW of code-memory and 2KW of data-memory. How large of a program can you write in this limited space?

People write large programs in high-level languages. That is the purpose of supporting a byte-code VM! They can be joyful enough doing this, and they can have a MB of memory to be joyful in.

Somebody smart will have to write the byte-code VM primitives in assembly-language, and will have to write the ISRs in assembly-language. Apparently a super-duper assembly-language expert such as yourself would disdain such low-level programming because it is just not joyful enough. I'm doing this right now for my own byte-code VM design. Drogon may do it for his CINTCODE. I've written much more low-level assembly-language in the past. I figured it out. For me there is joy enough in accomplishment.


PostPosted: Thu May 04, 2023 3:06 am 

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
sark02 wrote:
Hugh Aguilar wrote:
I never programmed the MC68000, [...]

Clearly.

I read about the MC68000 but it didn't seem interesting.

In Forth it is common to keep the top element of the data-stack (the TOS in Forth terminology) in a register. On the i8086 this register was typically BX.

The MC68000 has a distinction between registers used for data and registers used for pointers. So, should a D-register or an A-register be used as the TOS? The TOS is a pointer for instructions such as @ and ! but is data for instructions such as + etc.. I thought that the i8086 was a better design for Forth.
I also thought that, if your data was 16-bit, there was no advantage in having 32-bit registers --- and a 32-bit data read or write obviously takes twice as much time as a 16-bit one given a 16-bit data-bus. The i8086 breaks out of the 64KB limit of the Z80, but it does so with 16-bit registers, which seemed clever to me.
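The TOS-in-register convention being described can be sketched as a tiny C inner interpreter. The opcode names and the machine are invented for illustration; they are not taken from any particular Forth.

```c
#include <stdint.h>

/* Tiny Forth-style inner interpreter keeping the top-of-stack (TOS) in
   a local variable, which a compiler can hold in a machine register
   (the role BX played on the i8086).  Opcodes are illustrative only. */
enum { LIT, ADD, FETCH, BYE };

intptr_t forth_run(const intptr_t *ip) {
    intptr_t stack[16], *sp = stack;  /* sp points at the next free slot */
    intptr_t tos = 0;                 /* cached top of the data stack */
    for (;;) {
        switch (*ip++) {
        case LIT:   *sp++ = tos; tos = *ip++; break; /* push a literal     */
        case ADD:   tos += *--sp;             break; /* TOS as data (+)    */
        case FETCH: tos = *(intptr_t *)tos;   break; /* TOS as pointer (@) */
        case BYE:   return tos;
        }
    }
}
```

The D-register/A-register question is visible here: ADD treats tos as data while FETCH treats the very same register as a pointer, so an architecture that splits the two register files forces an awkward choice either way.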
sark02 wrote:
If I may, you seem to have a very 1970s mindset where CPUs are concerned.

Actually, you may not tell me that I have a 1970s mindset. That is just a put-down.
sark02 wrote:
Hugh Aguilar wrote:
The #1 priority in micro-controllers is low interrupt latency.

Says who? I've used embedded CPU cores throughout my career, and interrupt latency has never come up as a serious design point.

It's all application-dependent. What are the events and the deadlines? How is latency measured? What is the speed of the CPU? What is the speed of the memory system? What is the totality of the work/calculation that needs to be done before the deadline?

What if I have a CPU core that has a very short interrupt-to-first instruction latency, but the instruction set is so weak it takes 10x the instructions to do the work that needs to be done?

Your comment in isolation is meaningless. It's like you're repeating a line you read in a book.

You are saying that I'm just a stupid little student who has never written an assembly-language program, but goes on internet forums quoting from textbooks and pretending to be a programmer. That is a gross put-down.

You are telling me that I hadn't considered that a very short interrupt-to-first instruction latency doesn't help if the instruction set is so weak it takes 10x the instructions to do the work that needs to be done. Who would have thunk???
sark02 wrote:
That said, this is a somewhat "stuck in the past" site, and we all have a soft spot for the 6502... but there's a difference between having a fondness for a CPU you used in your youth, and thinking that it was the peak of computing greatness and all advances since then have been solving the wrong problems.

I didn't say that the 6502 was the peak of computing greatness and all advances since then have been solving the wrong problems. You are painting a picture of me as an utter fool.
sark02 wrote:
Don't let any of this dissuade you from proceeding with your plans: To build a core that runs at 100MHz, with a hostile instruction set that you can use to emulate a 65C816... A CPU that very few people have heard of, let alone care about. It seems absurdly pointless, but sometimes fun endeavors are pointless to everyone except the person enjoying them.

You are saying that the only point of the project is to emulate the W65c816, although I clearly said that the point was to support multiple byte-code VMs --- my own that is mostly for Forth, possibly the W65c816 that would be for C, possibly CINTCODE that would be for BCPL --- people can exercise their creativity in writing a custom byte-code VM or providing support for an existing byte-code VM.

You are telling me that my project is "absurdly pointless." That is a gross put-down.
I think that I have had enough of your put-downs. So, goodbye to you! I won't respond to you again.

Also, I have had enough of RISC-V enthusiasts telling me that I'm stupid to try doing anything of my own design, because it will just be pointless to everybody except myself. The RISC-V enthusiasts tell me that I should be smart like them and jump on the RISC-V bandwagon --- doing so will give me an instant 20-point I.Q. boost, getting me into the three-digit range finally.
The RISC-V enthusiasts believe that they have solved the problem of how to design a processor. They believe that any alternative effort can be proven to be absurdly pointless without being examined. Such certainty must make people feel good about themselves! I refer to this as the "conquistador mentality."


PostPosted: Thu May 04, 2023 2:22 pm 

Joined: Mon Feb 15, 2021 2:11 am
Posts: 100
Yuri wrote:
Proxy wrote:
the resulting binary would likely take up a huge chunk of memory and be pretty slow... but it would work. you could then go back and start replacing some functions with smaller assembly equivalents to try and reduce the size.


Most modern compilers try to do all their passes in one large go, which wasn't possible in the days of less memory. You'd likely be better off designing a compiler with multiple stages, each of which picks up where the last one left off.


That's a good point. The C compiler from UNIX V7 in 1979 was still hosted on and targeting the PDP-11, and it had separate executables for each pass/phase. The overall cc driver ran cpp for the preprocessor, c0 for the first pass (parsing), c1 for code generation, and c2 for optimization. One or more temporary files stored the output of each pass for use by the next.

If somebody wanted to port a C compiler for self-hosting on the 6502, I suspect the basic concept from the 1979 compiler would be a good approach. The PDP-11's per-process memory wasn't too far off from that of the 6502.
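The shape of that pipeline, where each pass hands a temp file to the next so that only one pass needs to fit in memory at a time, can be sketched with toy stand-in passes. Nothing below is from the real V7 cc: the temp-file names and the pass bodies are invented for illustration.

```c
#include <stdio.h>

/* "Pass 0": strip '#'-to-end-of-line comments, a stand-in for cpp. */
static void pass0(const char *in, const char *out) {
    FILE *fi = fopen(in, "r"), *fo = fopen(out, "w");
    int c, comment = 0;
    while ((c = fgetc(fi)) != EOF) {
        if (c == '#')  comment = 1;
        if (c == '\n') comment = 0;
        if (!comment) fputc(c, fo);
    }
    fclose(fi);
    fclose(fo);
}

/* "Pass 1": count the tokens the previous pass left behind, a stand-in
   for a parser.  It only ever sees the temp file, never the source. */
static int pass1(const char *in) {
    FILE *fi = fopen(in, "r");
    char tok[64];
    int n = 0;
    while (fscanf(fi, "%63s", tok) == 1) n++;
    fclose(fi);
    return n;
}

/* Driver, playing the role of cc: run the passes in sequence, with a
   temp file carrying the intermediate result between them. */
int toy_cc(const char *src) {
    const char *t0 = "pass0.tmp", *t1 = "pass1.tmp";
    FILE *f = fopen(t0, "w");
    fputs(src, f);
    fclose(f);
    pass0(t0, t1);
    return pass1(t1);
}
```

Each pass's working set is just its own tables plus a little of the temp file, which is exactly the property that let this structure fit in a PDP-11's per-process memory, and the property that would matter inside a 6502's 64KB.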

Personally, I'm looking at porting chibicc for the '816 as a cross-compiler, initially, and then to self-host.


PostPosted: Sat May 06, 2023 4:23 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8177
Location: Midwestern USA
Hugh Aguilar wrote:
Nobody writes large amounts of assembly code any more.

Is that a fact? :D

Attachment:
File comment: POC V1.3 Firmware
firmware_2_7_0.txt [679.98 KiB]
Downloaded 64 times

Quote:
Actually, you may not tell me that I have a 1970s mindset. That is just a put-down.

Don't be pugnacious, Hugh. It doesn’t come across well on this forum.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sat May 06, 2023 7:32 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
In my reading of the thread, it was sark02 who first raised the temperature.

Ideally, of course, it doesn't matter how things started - we should all be trying to de-escalate.

But it's clear enough from my own difficulty in posting calmly how hard that can be.


PostPosted: Sat May 06, 2023 9:31 am 

Joined: Tue Nov 10, 2015 5:46 am
Posts: 215
Location: Kent, UK
Oh Ed, try to not be such a killjoy, eh?

Feel free to leave this thread if you object to the sport.


PostPosted: Sat May 06, 2023 10:24 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Or indeed if you don’t want a productive technical discussion.

