A new C compiler for the 6502 and derivatives

whartung · Post by **whartung** » Thu Dec 01, 2016 5:23 pm

I think you can have a C with, let's say, 6502 specializations. In the end, that C program should compile and "work" on any other machine. Similarly, I should be able to take a C program from any other machine and compile it on the 6502.

But, let's take, for example, a Sieve of Eratosthenes program.

If you take a generic off the shelf version, and run it on the 6502, it should work.

However, if you add in some #pragma (or whatever it is that C uses) to kick in special 6502 aware features, it should run "better".

Ideally, a "sufficiently smart" compiler would be able to intuit whatever the #pragma is telling it, but, hey, baby steps.

Consider the discussion about static function calls using fixed parameter areas. On the one hand, you can "force" that with a #pragma. On the other, a compiler might be able to intuit that.

Similarly, you could #pragma zero_page some global variables. Things like that.

That is all valid C, just the other compilers will ignore the #pragmas.

In that sense, we shouldn't need a "different" C. We should be able to just use C, and "hint" it to work better on a 6502.
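A minimal sketch of that idea, assuming a hypothetical `#pragma zero_page` that a 6502-aware compiler would apply to the next definition (no existing compiler is known to accept exactly this spelling). Any other compiler should skip the unknown pragma, possibly with a warning, and the sieve still works:

```c
#include <assert.h>
#include <string.h>

#define SIEVE_LIMIT 100

/* Hypothetical hint: a 6502-aware compiler could place the next
   definition in zero page; other compilers skip the unknown pragma. */
#pragma zero_page
static unsigned char candidate;

static unsigned char composite[SIEVE_LIMIT + 1];

/* Count the primes in 2..limit with a plain Sieve of Eratosthenes. */
static unsigned count_primes(unsigned limit)
{
    unsigned count = 0;
    unsigned multiple;

    assert(limit <= SIEVE_LIMIT);
    memset(composite, 0, sizeof composite);

    for (candidate = 2; candidate <= limit; candidate++) {
        if (composite[candidate])
            continue;
        count++;
        /* candidate * candidate is at most 10000, so it fits a 16-bit int */
        for (multiple = (unsigned)candidate * candidate;
             multiple <= limit;
             multiple += candidate)
            composite[multiple] = 1;
    }
    return count;
}
```

Calling `count_primes(100)` gives the same answer whether or not the pragma is honoured; only the placement of `candidate` would differ.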

Tor · Post by **Tor** » Thu Dec 01, 2016 6:46 pm

Or you could re-use the ancient 'register' keyword to zero-page variables. The keyword is accepted by every C compiler, but as far as I know it has been ignored by compilers for ages now. A modern compiler needs to take care of its register allocations itself, a user can only mess it up.
As the 6502 doesn't have many regular registers, the 'register' keyword would be useless for its original purpose. But it could be used to direct variable storage to zero page (considered the 'real' 6502 registers by some, anyway. And there are so many of them. The compiler will need a number of them for its own purposes, depending on the implementation of course, but there will still be room for the user there).
And 'register int my_local_variable;' will be 100% portable of course, and may even be better for porting to other old systems with a compiler that actually uses the 'register' hint (for it's only a hint, if no registers could be found the variable was allocated normally, i.e. on the stack, for traditional C compilers).

Edit: Global variables are a different issue. So maybe yes, #pragma could be used for that. I was never fond of #pragmas though. They gave me that 'non-portable' feeling.
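For illustration, this is the kind of portable code the 'register' idea relies on. The function name is made up for the example, and the zero-page mapping is only what a 6502 compiler might choose to do with the hint; standard C guarantees nothing beyond "you cannot take the address of a register variable":

```c
#include <assert.h>
#include <stddef.h>

/* 8-bit additive checksum. The 'register' locals are ordinary C;
   a hypothetical 6502 compiler could map them to zero page. */
static unsigned char checksum8(const unsigned char *data, size_t len)
{
    register unsigned char sum = 0;
    register size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + data[i]);   /* wraps modulo 256 */
    return sum;
}
```

On a compiler that ignores the keyword, the function behaves identically.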

Arlet · Post by **Arlet** » Thu Dec 01, 2016 6:49 pm

Quote:

Or you could re-use the ancient 'register' keyword to zero-page variables

Unfortunately, the register keyword is not allowed for static variables, which you might also want to allocate in zero page.

White Flame · Post by **White Flame** » Thu Dec 01, 2016 9:50 pm

One efficiency trick that the 6502 and other small CPUs benefit from is using the flags, particularly C and Z, as parameters or return values. This means that, using registers alone, you can easily pass three 8-bit values and one boolean value as parameters to a function. C doesn't expose the carry bit at all, nor do any other languages that I'm aware of. I'm not sure how you would add manual carry bit operations to C, and trying to intuit its use falls into "sufficiently advanced compiler" territory.

C really doesn't mesh well with 6502, so "optimum" output will always be a pipe dream. But I agree that it can at least be handled better than cc65 and the like, performance- and footprint-wise.
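To make the gap concrete, here is a sketch of what carry handling looks like when it has to be spelled out in portable C. The helper names `adc8` and `add16` are invented for this example. On a 6502 the whole of `adc8` is a single ADC instruction with the carry living in the flag register, which is exactly what the language cannot express directly:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Add two bytes plus an incoming carry; update *carry with the outgoing one. */
static uint8_t adc8(uint8_t a, uint8_t b, bool *carry)
{
    uint16_t sum;

    assert(carry != NULL);
    sum = (uint16_t)(a + b + (*carry ? 1 : 0));
    *carry = sum > 0xFF;
    return (uint8_t)sum;
}

/* 16-bit add built from two 8-bit adds, the way a 6502 would do it. */
static uint16_t add16(uint16_t x, uint16_t y, bool *carry_out)
{
    bool c = false;                                  /* CLC */
    uint8_t lo = adc8((uint8_t)(x & 0xFF), (uint8_t)(y & 0xFF), &c);
    uint8_t hi = adc8((uint8_t)(x >> 8), (uint8_t)(y >> 8), &c);

    assert(carry_out != NULL);
    *carry_out = c;
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

A compiler would have to recognise that the `bool` threaded through these calls is really the carry flag, which is the "sufficiently advanced" part.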

BigEd · Post by **BigEd** » Thu Dec 01, 2016 9:58 pm

I wonder if a compiler could deal with the carry bit as a one-bit register... not without its complications of course.

whartung · Post by **whartung** » Thu Dec 01, 2016 11:14 pm

White Flame wrote:

C really doesn't mesh well with 6502, so "optimum" output will always be a pipe dream. But I agree that it can at least be handled better than cc65 and the like, performance- and footprint-wise.

Well, that brings up another detail. Why not just try to fix cc65? Even if you had to completely yank out the code generation and optimization layer, there's still a boatload of code to reuse vs starting from scratch.

Don't have to "fix" it, you can fork it. Even has a liberal license.

jamestn529 · Post by **jamestn529** » Fri Dec 02, 2016 5:50 am

Arlet wrote:

Quote:

Or you could re-use the ancient 'register' keyword to zero-page variables

Unfortunately, the register keyword is not allowed for static variables, which you might also want to allocate in zero page.

I've been looking at how the Keil C51 compiler for 8051-family processors works with the 8-bit architecture, and it has some great ideas to carry over. It has keywords for placing variables in different address spaces: internal RAM, external RAM, or ROM. A __zeropage keyword would allow both locals and globals to be manually placed in zero page. I also like how Keil allows placing variables at specific locations, but I don't like its syntax. Here's an example for our theoretical compiler, for the C64's SID. First is the cc65 way of defining an IO port, then Keil's syntax, then how I would prefer the syntax to be:

Code: Select all

#define sid_freq1 (*(uint16_t *)(0xD400))

uint16_t sid_freq1 _at_ 0xD400;

uint16_t sid_freq1 __at(0xD400);

To be completely honest, I don't know if either has much of an advantage over the first one.
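For comparison, a sketch of the plain-pointer approach written so it can also be exercised off-target. `__CC65__` is the macro cc65 predefines; the `fake_sid_regs` host buffer and the `sid_set_freq1` helper are assumptions of this example, not anything from cc65 or Keil:

```c
#include <stdint.h>

#if defined(__CC65__)
/* On a real C64 build the SID register block starts at $D400. */
#define SID_REGS ((volatile uint8_t *)0xD400)
#else
/* Host fallback (hypothetical) so the code can be tested on a PC. */
static uint8_t fake_sid_regs[0x20];
#define SID_REGS ((volatile uint8_t *)fake_sid_regs)
#endif

/* Write the 16-bit frequency of voice 1 as two byte stores, low byte
   first, so the result does not depend on host endianness or alignment. */
static void sid_set_freq1(uint16_t freq)
{
    SID_REGS[0] = (uint8_t)(freq & 0xFF);
    SID_REGS[1] = (uint8_t)(freq >> 8);
}
```

The `volatile` qualifier matters for IO ports in any of the three syntaxes, since it stops the compiler from dropping or merging the stores.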

whartung wrote:

In that sense, we shouldn't need a "different" C. We should be able to just use C, and "hint" it to work better on a 6502.

I agree exactly.

Tor wrote:

...I was never fond of #pragmas though. They gave me that 'non-portable' feeling.

I don't like how pragmas don't work with the preprocessor (C99's _Pragma operator fixes this), and I don't like how they look like preprocessor statements, but they affect the program semantically. I would rather use __identifiers for giving functions/variables special attributes.
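A sketch of how both styles can be hidden behind macros so the source still builds elsewhere. The `__HYPOTHETICAL_CC6502__` test macro, the `__zeropage` keyword and the `zero_page` pragma name are all placeholders for whatever the theoretical compiler would define; only `_Pragma` itself is standard C99:

```c
/* Attribute-style keyword: expands to nothing on other compilers. */
#if defined(__HYPOTHETICAL_CC6502__)
#define ZEROPAGE __zeropage
#else
#define ZEROPAGE
#endif

/* Unlike a #pragma line, _Pragma can be produced by a macro.
   Compilers that do not know the pragma skip it (maybe with a warning). */
#define ZP_HINT _Pragma("zero_page")

ZP_HINT
static unsigned char ZEROPAGE tick_count;

/* Bump and return a counter that we would like to live in zero page. */
static unsigned char tick(void)
{
    return ++tick_count;
}
```

Either way the program's meaning is unchanged; the macro only carries a placement hint.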

White Flame wrote:

One efficiency trick that the 6502 and other small CPUs benefit from is using the flags, particularly C and Z, as parameters or return values. This means that, using registers alone, you can easily pass three 8-bit values and one boolean value as parameters to a function.

The carry flag can be used to return a bool so the caller can immediately BCS/BCC on the return value of a function, but it wouldn't be much faster than returning it in A/X/Y or a zero-page location and doing a CMP/CPX/CPY #imm or an LDA on the result location.

EDIT: Actually, you can even just do a TAX/TAY for only two cycles.

whartung wrote:

Well, that brings up another detail. Why not just try to fix cc65? Even if you had to completely yank out the code generation and optimization layer, there's still a boatload of code to reuse vs starting from scratch.

Don't have to "fix" it, you can fork it. Even has a liberal license.

Well first, if you completely remove the code generation and optimization layers, all you have left is a lexer and parser—cc65 doesn't build an AST or anything like that. Writing a C lexer is very easy; the only hang-up is distinguishing between typedefs and identifiers. Writing the parser is not much more difficult. If we have to re-write more than 2/3rds of the compiler, why not write the entire thing?

Second, C is only a good language for building a compiler if you want the compiler to be self-hosting. I would prefer an ML-family language (functional, ADTs, pattern matching) to write the compiler in. OCaml is my first choice, followed by SML, then Rust or F#. Haskell sort of fits into that category, but its strictness about purity makes it feel like more of an academic than a practical language. I don't know how popular those languages are with other members of the 6502 community, though. I don't want to be the only person able to maintain the compiler.

White Flame · Post by **White Flame** » Fri Dec 02, 2016 8:17 pm

jamestn529 wrote:

Here's an example for our theoretical compiler, for the C64's SID. First is the cc65 way of defining an IO port, then Keil's syntax, then how I would prefer the syntax to be:

IMO, the proper way to define the SID in C would be something like this:

Code: Select all

typedef struct {
  uint16 frequency;
  uint16 duty_cycle;
  uint8  control;
  ...
} SID_Voice;

typedef struct {
  SID_Voice voices[3];
  uint16    filter_cutoff;
  ...
} SID_Chip;

SID_Chip* const sid = (SID_Chip*)0xd400;

You should also be able to define the individual bit-fields in C as well, though I don't quite recall the syntax. This is all off the cuff, and I haven't done C in years, so details may be off.
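A fuller sketch of the struct overlay, under a few assumptions of this example: byte-sized fields are used throughout so that no compiler inserts padding, the control bits are written as masks because bit-field layout is implementation-defined in C, and the `Sid*` and `sid_voice_start` names are invented here. The offsets follow the usual SID register map (7 bytes per voice, filter cutoff at $D415):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two of the control register bits, as masks. */
#define SID_CTRL_GATE     0x01u
#define SID_CTRL_TRIANGLE 0x10u

typedef struct {
    uint8_t freq_lo, freq_hi;
    uint8_t pulse_lo, pulse_hi;
    uint8_t control;
    uint8_t attack_decay;
    uint8_t sustain_release;
} SidVoice;

typedef struct {
    SidVoice voices[3];
    uint8_t  cutoff_lo, cutoff_hi;
} SidChip;

/* C11 compile-time checks that the overlay matches the hardware layout. */
_Static_assert(sizeof(SidVoice) == 7, "each voice occupies 7 registers");
_Static_assert(offsetof(SidChip, cutoff_lo) == 0x15, "cutoff is at $D415");

/* Set a voice's frequency and gate it on with the given waveform bits. */
static void sid_voice_start(volatile SidVoice *v, uint16_t freq, uint8_t waveform)
{
    assert(v != NULL);
    v->freq_lo = (uint8_t)(freq & 0xFF);
    v->freq_hi = (uint8_t)(freq >> 8);
    v->control = (uint8_t)(waveform | SID_CTRL_GATE);
}
```

On the target you would point a `volatile SidChip *` at 0xD400; on a host, any `SidChip` object works for testing.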

Quote:

Second, C is only a good language for building a compiler if you want the compiler to be self-hosting.

This cannot be overstated. C is for systems programming. It sucks for performing analyses and transformations, and sucks for dealing with lots of transient non-lexically scoped data like a compiler needs to.

whartung · Post by **whartung** » Sat Dec 03, 2016 12:53 am

jamestn529 wrote:

Well first, if you completely remove the code generation and optimization layers, all you have left is a lexer and parser—cc65 doesn't build an AST or anything like that. Writing a C lexer is very easy; the only hang-up is distinguishing between typedefs and identifiers. Writing the parser is not much more difficult. If we have to re-write more than 2/3rds of the compiler, why not write the entire thing?

Well, then you can write it to plug into the rest of the cc65 toolchain. Lots of wheel there to not reinvent.

Quote:

Second, C is only a good language for building a compiler if you want the compiler to be self-hosting. I would prefer an ML-family language (functional, ADTs, pattern matching) to write the compiler in. OCaml is my first choice, followed by SML, then Rust or F#. Haskell sort of fits into that category, but its strictness about purity makes it feel like more of an academic than a practical language. I don't know how popular those languages are with other members of the 6502 community, though. I don't want to be the only person able to maintain the compiler.

Good luck with that. There are reasons why those languages don't have general popularity. Not that I'm disagreeing with your points, but the technical barriers to entry for these languages are pretty much zero today, and still they never became very popular.

jamestn529 · Post by **jamestn529** » Sat Dec 03, 2016 2:40 am

whartung wrote:

Well, then you can write it to plug into the rest of the cc65 toolchain. Lots of wheel there to not reinvent.

I don't see a problem with that. Existing CC65 users will greatly appreciate compatibility with the toolchain. I myself don't have experience working with the toolchain.

kc5tja · Post by **kc5tja** » Tue Dec 06, 2016 2:46 am

jamestn529 wrote:

Dr Jefyll wrote:

jamestn529 wrote:

I like your approach to the problem. Once the bytecode backend works, we can start working on a native-code backend. As for if the native code is generated from the bytecode, I'm not sure. It's probably best to generate native code from 3-address code in SSA form. Take this bytecode for example:

Code: Select all

    LOAD_ARG_B 1
    LOAD_ARG_B 2
    ADD_B
    LOAD_ARG_B 3

Assuming these are ~5 instructions each, internally, straight-up emitting the code from the interpreter will give you a sequence that's 20 instructions long, while my hand-optimized code is only 7:

Code: Select all

    ldy #1
    lda (SP),Y
    clc
    iny
    adc (SP),Y
    iny
    sta (SP),Y

That's because you're doing it wrong.

Stack code has arguments, you just don't see them because they're implied. To wit:

Code: Select all

    LOAD_ARG_B 1
    LOAD_ARG_B 2
    ADD_B
    LOAD_ARG_B 3

becomes:

Code: Select all

Stack0A = argb(1)
Stack0B = argb(2)
Stack1A = Stack0A + Stack0B
Stack1B = argb(3)

Notice that instead of using one-dimensional SSA assignments, your targets are now two-dimensional. One dimension represents relative space on the stack, and the other represents the generation over time. Thus, we see instantly that Stack1A is a newer value than either Stack0A or Stack0B, despite it occupying the same physical spot in the stack (logically) as Stack0A.
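The renaming above can be sketched as a small pass over the bytecode. Everything here (the `Insn` type, `to_ssa`, the two opcodes) is made up for the example and covers only the instructions used in this thread. Each stack slot gets a letter and each write to that slot bumps its generation number:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { MAX_DEPTH = 26 };                 /* slots are named A..Z */

typedef enum { OP_LOAD_ARG_B, OP_ADD_B } Opcode;
typedef struct { Opcode op; int operand; } Insn;

/* Emit one "Stack<gen><slot> = ..." line per instruction into out.
   Returns the final stack depth, or -1 on underflow/overflow. */
static int to_ssa(const Insn *code, int n, char *out, size_t out_size)
{
    int next_gen[MAX_DEPTH] = {0};   /* next unused generation per slot */
    int cur_gen[MAX_DEPTH] = {0};    /* generation currently held per slot */
    int depth = 0, i;
    size_t used = 0;

    assert(code != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        char line[96];
        size_t len;

        switch (code[i].op) {
        case OP_LOAD_ARG_B:              /* push: writes a new top slot */
            if (depth >= MAX_DEPTH)
                return -1;
            cur_gen[depth] = next_gen[depth]++;
            snprintf(line, sizeof line, "Stack%d%c = argb(%d)\n",
                     cur_gen[depth], 'A' + depth, code[i].operand);
            depth++;
            break;
        case OP_ADD_B: {                 /* pop two, result reuses lower slot */
            int l = depth - 2, r = depth - 1, lg, rg;
            if (depth < 2)
                return -1;
            lg = cur_gen[l];
            rg = cur_gen[r];
            cur_gen[l] = next_gen[l]++;
            snprintf(line, sizeof line, "Stack%d%c = Stack%d%c + Stack%d%c\n",
                     cur_gen[l], 'A' + l, lg, 'A' + l, rg, 'A' + r);
            depth--;
            break;
        }
        default:
            return -1;
        }

        len = strlen(line);
        if (used + len + 1 > out_size)
            return -1;
        memcpy(out + used, line, len + 1);
        used += len;
    }
    return depth;
}
```

Fed the four-instruction example, this produces the same four assignments shown above.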

This also permits the compiler to automatically recognize swaps, drops, nips, rots, and other stack permutations, and resolve stuff like SWAP DUP OVER NIP to a single stack permutation, instead of four. If that's even required at all. If the compiler tracks which CPU register and/or zero/direct-page location maps to a corresponding stack slot, you end up with stack permutation code being generated only at the boundaries of basic blocks, which is *exactly* where register-oriented compilers tend to produce their loads and stores anyway. Turns out, they're identical to each other in practice (something Phil Koopman stated back when he wrote the book "Stack Computers: The New Wave", but nobody believed him then, and SSA hadn't become popular or widely known enough to construct a proof).
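A sketch of the permutation folding: run the stack words symbolically over labels for the original inputs, and read the net effect off the final stack. The `SymStack` type and function names are hypothetical, and only five words are modelled:

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_SLOTS = 16 };

typedef enum { OP_SWAP, OP_DUP, OP_OVER, OP_NIP, OP_DROP } StackOp;

/* src[i] says which original input now sits in slot i (bottom = 0). */
typedef struct { int src[MAX_SLOTS]; int depth; } SymStack;

static int apply_op(SymStack *s, StackOp op)
{
    int d = s->depth, t;

    switch (op) {
    case OP_SWAP:                         /* ( a b -- b a ) */
        if (d < 2) return -1;
        t = s->src[d - 1];
        s->src[d - 1] = s->src[d - 2];
        s->src[d - 2] = t;
        return 0;
    case OP_DUP:                          /* ( a -- a a ) */
        if (d < 1 || d >= MAX_SLOTS) return -1;
        s->src[d] = s->src[d - 1];
        s->depth++;
        return 0;
    case OP_OVER:                         /* ( a b -- a b a ) */
        if (d < 2 || d >= MAX_SLOTS) return -1;
        s->src[d] = s->src[d - 2];
        s->depth++;
        return 0;
    case OP_NIP:                          /* ( a b -- b ) */
        if (d < 2) return -1;
        s->src[d - 2] = s->src[d - 1];
        s->depth--;
        return 0;
    case OP_DROP:                         /* ( a -- ) */
        if (d < 1) return -1;
        s->depth--;
        return 0;
    }
    return -1;
}

/* Fold a whole sequence of stack words into one net mapping. */
static int compose(const StackOp *ops, size_t n, int inputs, SymStack *out)
{
    size_t i;
    int k;

    assert(ops != NULL && out != NULL);
    if (inputs < 0 || inputs > MAX_SLOTS)
        return -1;
    for (k = 0; k < inputs; k++)
        out->src[k] = k;
    out->depth = inputs;
    for (i = 0; i < n; i++)
        if (apply_op(out, ops[i]) != 0)
            return -1;
    return 0;
}
```

For SWAP DUP OVER NIP on two inputs the result is the single mapping ( a b -- b a a ), which a code generator could emit as one shuffle or, with slots tracked in registers or zero page, possibly as no code at all.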

Finally, I've built compilers which incrementally applied aggressive peephole optimization techniques to produce near optimal code for a RISC processor given RPN input. Since 6502 zero-page / 65816 direct page is essentially the same as a giant register file, it follows the same techniques can be applied here too.

Do not let the dogma of today's computer science cloud your understanding of the opportunities for stack-based architectures or bytecodes. WebAssembly, a relatively recent attempt at making a portable program representation that is CPU-agnostic and high-performance, has just switched away from an AST representation to a stack-based notation, precisely because it was proven that the two are isomorphic to each other, and the latter is more compact and easier to write tooling for. (Edit: Also because it allows a procedure to return multiple values to its caller.)

jamestn529 · Post by **jamestn529** » Tue Dec 06, 2016 3:54 am

@kc5tja You're right, I didn't factor in optimization. With some pen-and-paper experimentation, I've noticed that most stack drops can be left out until the end of a basic block, but I didn't consider stack fiddling. Of course, a code generator for a high-level language shouldn't need to use Forth-like stack operators.

I would like feedback on how good of an idea this is: my idea of a good abstract machine would be a hybrid RISC/stack machine with n "registers" (zero-page locations) and a computation stack for anything that doesn't fit in the registers. Bytecode only uses the computation stack for calculations, but native code uses both (or even just the registers if possible). And then if too much of the computation stack is used, values are spilled to the parameter stack. Theoretically, if you set both the register and comp stack space to zero, you'd have a CC65 mode!
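A rough sketch of the placement rule described above, with invented names: values fill the zero-page "registers" first, then the computation stack, and anything beyond that spills to the parameter stack. It deliberately ignores liveness and reuse, which a real allocator would need:

```c
typedef enum { LOC_REGISTER, LOC_COMP_STACK, LOC_PARAM_STACK } LocKind;

typedef struct {
    LocKind  kind;
    unsigned index;   /* position within that storage class */
} Location;

/* Decide where the value_no-th live value goes, given n_regs zero-page
   registers and comp_slots computation-stack entries. */
static Location place_value(unsigned value_no, unsigned n_regs, unsigned comp_slots)
{
    Location loc;

    if (value_no < n_regs) {
        loc.kind = LOC_REGISTER;
        loc.index = value_no;
    } else if (value_no - n_regs < comp_slots) {
        loc.kind = LOC_COMP_STACK;
        loc.index = value_no - n_regs;
    } else {
        loc.kind = LOC_PARAM_STACK;          /* spill */
        loc.index = value_no - n_regs - comp_slots;
    }
    return loc;
}
```

With `n_regs` and `comp_slots` both zero, every value lands on the parameter stack, which is the "CC65 mode" case.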

Is a stack machine a better representation for a compiler internally, though? I think an IR like LLVM's (basically a RISC with infinite registers) would work better for generating machine code: an operation, a destination, and one or more sources. Sources and destinations would include registers, the comp stack, absolute locations, etc.
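As a sketch of what such an IR instruction might look like (an operation, a destination, and sources that can be registers, comp-stack slots, absolute addresses or immediates), with a tiny evaluator to pin down the semantics. All type and function names are assumptions of this example, and it is far simpler than LLVM's actual IR:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { OPND_REG, OPND_COMP_STACK, OPND_ABS, OPND_IMM } OperandKind;
typedef struct { OperandKind kind; uint16_t value; } Operand;

typedef enum { IR_MOV, IR_ADD, IR_SUB } IrOp;
typedef struct { IrOp op; Operand dst; Operand src[2]; } IrInsn;

/* Toy machine state: 8 "registers", 8 comp-stack slots, 64K of memory. */
typedef struct {
    uint8_t reg[8];
    uint8_t comp_stack[8];
    uint8_t mem[0x10000];
} Machine;

static uint8_t read_operand(const Machine *m, Operand o)
{
    switch (o.kind) {
    case OPND_REG:        return m->reg[o.value & 7];
    case OPND_COMP_STACK: return m->comp_stack[o.value & 7];
    case OPND_ABS:        return m->mem[o.value];
    case OPND_IMM:        return (uint8_t)o.value;
    }
    return 0;
}

static int write_operand(Machine *m, Operand o, uint8_t v)
{
    switch (o.kind) {
    case OPND_REG:        m->reg[o.value & 7] = v;        return 0;
    case OPND_COMP_STACK: m->comp_stack[o.value & 7] = v; return 0;
    case OPND_ABS:        m->mem[o.value] = v;            return 0;
    case OPND_IMM:        return -1;   /* an immediate cannot be a destination */
    }
    return -1;
}

/* Execute one three-address instruction: dst = src[0] op src[1]. */
static int exec_insn(Machine *m, const IrInsn *insn)
{
    uint8_t a, b;

    assert(m != NULL && insn != NULL);
    a = read_operand(m, insn->src[0]);
    b = read_operand(m, insn->src[1]);
    switch (insn->op) {
    case IR_MOV: return write_operand(m, insn->dst, a);
    case IR_ADD: return write_operand(m, insn->dst, (uint8_t)(a + b));
    case IR_SUB: return write_operand(m, insn->dst, (uint8_t)(a - b));
    }
    return -1;
}
```

A backend would pattern-match on the operand kinds to pick addressing modes instead of interpreting, but the instruction shape is the same.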

handyandy · Post by **handyandy** » Sun Dec 25, 2016 7:53 pm

Merry Christmas to all,

Just a general reply to the topic.

After a scan of the topic I don't think I recalled any mention of Hyper C for the Apple II; David Wheeler did mention it in his article/webpage. Of interest may be that it supported byte-code (interpreted) or inline assembler macros for the byte code. The source code was available for the operating system (byte code interpreter, input/output, file interface) and I rewrote the interpreter for a 65802 I had installed. The original source was 6502 but could be re-written for 65c02 and other machines. Just putting it out there so as not to re-invent a wheel...

Cheers,
Andy

jamestn529 · Post by **jamestn529** » Sun Dec 25, 2016 9:30 pm

@handyandy

I haven't looked into Hyper C before. Some cursory searching on the internet has only yielded a few mentions as well as files that can only be opened, presumably, on an Apple II. If you have some documentation saved on your computer, I would be very grateful if you shared it with me. I can use as much information on 6502 C compilers as I can get.

handyandy · Post by **handyandy** » Mon Dec 26, 2016 2:14 pm

I found an archive of files here: http://mirrors.apple2.org.za/ftp.apple. ... c/hyper_c/
and the files can be read on a windoze machine with a utility called ciderpress: http://a2ciderpress.com/
Don't know if the source files are there but if there's interest I can put them up somewhere. There's no source for the tools like the compiler, assembler, linker etc. but there is source for interpreter(s), i/o and file system interface for either ProDOS or a proprietary disk operating system called CDOS for 5 1/4" floppies.

I downloaded the files from the archive and could open and read the DOX files with ciderpress. I transcribed a lot of the files from the paper manual many moons ago.

Cheers,
Andy
