C in a uC... what were they thinking??

White Flame · Post by **White Flame** » Mon Oct 09, 2017 8:21 pm

The only reason the utility in that video works, though, is that there are no function parameters or local variables.

C++ and Rust are all about collapsing all decision making to compile-time, but they still assume a much more heavyweight processor ISA & ABI in order to deal with their scopes and language features.

whartung · Post by **whartung** » Tue Oct 10, 2017 11:07 pm

Not sure what "heavyweight" features are being used. I mean, clearly the 6502 is not compiler friendly, but it seems to me the primary reason for that is that the stack isn't as flexible as it is on other CPUs, and much of the overhead of conventional languages is done via parameter passing. But, frankly, I think that's the only serious limitation, a limitation addressed in the '816.

You could use any page in RAM along with X as an SP for a stack, but you're limited to 256 bytes (which is a lot if all you pass around is pointers, but not so much if you're passing around anything else -- C/C++ is pass by value).

Fig-Forth's stack is done via ZP, but if you're willing to eat the cycle costs you can place the stack on any page you want (LDA $200,X works just as well as LDA $0,X, just a touch more expensive). Most high level stack frames are addressed absolutely.

X is the stack pointer, and decremented for a "push".

So (from Fig-Forth) 16 bit AND:

Code: Select all

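          LDA 0,X
          AND 2,X
          STA 2,X   ; store low byte of the result in the second cell
          LDA 1,X
          AND 3,X
          STA 3,X   ; store high byte of the result
          INX       ; drop the top cell; the result is now on top
          INX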

That works just fine on another page:

Code: Select all

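          LDA $200,X
          AND $202,X
          STA $202,X ; store low byte of the result in the second cell
          LDA $201,X
          AND $203,X
          STA $203,X ; store high byte of the result
          INX    ; drop second argument off stack
          INX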

If you had this code:

Code: Select all

#include <stdint.h>

int16_t b;
void func(int16_t a) {
    a = a + 10;
    b = a;
}

Code: Select all

FUNC:   CLC   ;; Add Top of stack to constant
        LDA $200,X
        ADC #<10
        STA $200,X
        LDA $201,X
        ADC #>10
        STA $201,X
        LDA $200,X
        STA B
        LDA $201,X
        STA B+1
        INX    ;; leaving routine, drop parameter from stack
        INX
        RTS

Obviously compilers (especially modern compilers) like 16+ bit numbers, but that's a different problem. The compilers really don't care, and it's not a criterion of the language itself per se. With careful coding, you can keep C++ in the 8-bit realm; you'll just be fighting the compiler a bit as it wants to promote to a larger word size. There's no reason you can't have a compliant C++ compiler that assumes smaller word sizes.

Dumping large stack frames on the 6502 is memory and time expensive. At some point several INX to dump the frame is more expensive than moving X to the accumulator and adding, and moving it back. It's up to the compiler to decide when that is (are you cycle conscious or code size conscious).

But that seems to be the only "heavyweight" operation the '02 would need in order to be much more friendly to compilers and, like I said, the '816 fixed that problem. If you do a lot of 16-bit math, your 6502 code is going to bloat up, or call a lot of subroutines (cc65 calls a lot of subroutines). But the '816 solves that issue too, with 16-bit operations. The '816's biggest problem is the 16-bit code size, but that's a linker issue.

White Flame · Post by **White Flame** » Wed Oct 11, 2017 10:05 am

C implementations also often have frame pointers which need both positive and negative addressing from them, in addition to the stack pointer.

The 6502 can't handle nested indirect addresses very well, either. It takes a lot of instructions to dereference another 16-bit pointer. Using (zp,x) means you can use a relative zp location, but have to increment the 16-bit value manually to get the high byte, while (zp),y needs a fixed zp location for the vector, or self-modifying code. Plus, the low byte would need to be held in a temp while the high byte is fetched, before the pointer can be overwritten by its dereferenced value. Doing C++ style vtables would be really big and messy. Even loading a slot of a structure from its pointer takes a ton of instructions.

Arrays of structs/objects also employ multiplication for offsets. While a structure size that's known at compile time can reduce to a fixed set of shifts & adds, it's often impossible to determine at compile time whether or not the offset will remain within an 8-bit range, so that's another giant mess of code to calculate it in 16-bit. I believe it's legal for the implementation to round up & pad the sizes of structs for alignment, which would avoid the multiplication but then begins to waste memory.

whartung wrote:

With careful coding, you can keep C++ in the 8-bit realm; you'll just be fighting the compiler a bit as it wants to promote to a larger word size. There's no reason you can't have a compliant C++ compiler that assumes smaller word sizes.

The minimum size for shorts & ints in C (and presumably C++) is 16 bits, so you need at least that to be compliant.

Of course, yes the 65816 solves many of these issues.

DerTrueForce · Post by **DerTrueForce** » Wed Oct 11, 2017 8:23 pm

I'm speaking from ignorance here, but aren't there some 8-bit types? I'm thinking of char primarily, but there may also be short short and byte (the first might be C# and the second might only be Arduino C).

Alarm Siren · Post by **Alarm Siren** » Wed Oct 11, 2017 9:07 pm

DerTrueForce wrote:

I'm speaking from ignorance here, but aren't there some 8-bit types? I'm thinking of char primarily, but there may also be short short and byte(the first might be C# and the second might only be Arduino C)

I haven't come across a short short in my travels, and I do a lot of Microsoft .NET programming, but it may well exist. Byte is indeed from the Arduino flavour of C/C++; it is in fact a typedef that maps byte to an unsigned 8-bit char type (uint8_t).

In C itself, the following must hold for the fundamental integer types:

long long must not have fewer bits than long, and must be a minimum of 64 bits.
long must not have fewer bits than int, and must be a minimum of 32 bits.
int must not have fewer bits than short, and must be a minimum of 16 bits.
short must not have fewer bits than char, and must be a minimum of 16 bits.
char must be a minimum of 8 bits.

Further, char should be the smallest individually addressable chunk of memory - what is normally called a byte in processor architecture jargon. Int should be the native word size of the architecture, which in most cases means the size of the general purpose registers.

The exact definition of how many bits each type has, within these constraints, is left to the implementation.

---

Obviously, it leaves our little 6502 a bit in the cold: int should be 8 bits - that being the native word size of the chip (and also the byte size) - but it cannot be as that is forbidden.

whartung · Post by **whartung** » Wed Oct 11, 2017 9:24 pm

Yeah, the problem isn't the data types; C has both signed and unsigned 8-bit types.

The game is struggling to keep things from expanding.

Code: Select all

    char c;
    c = c + 1;

The "c + 1" will expand out to a 16-bit calculation before it drops back down to an 8-bit result, because "1" is an int, an (at least) 16-bit number, and c is promoted to int to match.

On a larger processor, of course, it makes sense to get things to 16 or 32 bits and work there (as those are the native register sizes and such), but for an 8 bit processor, that's not the case.

Quote:

Obviously, it leaves our little 6502 a bit in the cold: int should be 8 bits - that being the native word size of the chip (and also the byte size) - but it cannot be as that is forbidden.

I don't agree with this, personally. int being 16 bits is not the 6502's fault. Heck, even Forth is a 16-bit runtime. EVERYTHING is 16 bits in Forth (at least at some point in its life).

Since C has support for 8 bit types, the fact that int is not one of them doesn't bother me.

Arguably any portable code in C should be using explicitly sized types religiously and would never use int itself, as its definition is implementation dependent. Given your list above, legally long, int, short, and char can all be 32 bits in a compliant C implementation.

Alarm Siren · Post by **Alarm Siren** » Wed Oct 11, 2017 10:00 pm

whartung wrote:

int being 16 bits is not the 6502's fault

I didn't say it was. I said it was the C language that leaves the 6502 out in the cold, not the other way around.

whartung wrote:

Arguably any portable code in C should be using explicitly sized types religiously and would never use int itself

You're probably right, but doing so will massively bloat the code - your code from above would become

Code: Select all

uint8_t c;
c = (uint8_t)((uint8_t)c + (uint8_t)1);

Or, depending on the implementation then the following might have the desired effect:

Code: Select all

uint8_t c;
c = c + (uint8_t)1;

However, if int could be 8-bit, then specifying the type in this scenario becomes unnecessary. Also, explicitly sized types are almost always implemented as #defines or typedefs, so at some level you're always using the fundamental types.

whartung wrote:

Given your list above, legally long, int, short, and char can all be 32 bits in a compliant C implementation.

Indeed, that would be a 100% legal implementation, assuming that 32-bits was the smallest individually addressable memory size and also the native word size. You could go even crazier and have everything, including long long, be 64-bits.

DerTrueForce · Post by **DerTrueForce** » Wed Oct 11, 2017 11:40 pm

Maybe we need a C-flat language. Like C, but actually geared towards 8-bit processors.
Since, y'know we can't be compliant with the C standards and be efficient at the same time on the 6502...

Alarm Siren · Post by **Alarm Siren** » Thu Oct 12, 2017 12:09 am

A minimalistic, C-like language specifically geared towards the 6502 and similar architectures....
Seems like it'd be a large project, but fun for those so inclined and useful to the rest.

bdk6 · Post by **bdk6** » Thu Oct 12, 2017 12:28 am

Two words:
1. #pragma
2. Action! https://en.wikipedia.org/wiki/Action!_( ... _language)

bdk6 · Post by **bdk6** » Thu Oct 12, 2017 1:02 am

commodorejohn wrote:

Druzyek wrote:

Right, that is a skill we should all try to develop. The problem is that that will never be enough for you to outperform an optimizing compiler on larger programs. On modern processors, it does all of those things better than you can.

People keep making this assertion like it's a religious mantra, but I'm not convinced. There's nothing magic about large programs or optimizing compilers, and the only thing that makes "modern processors" particularly complicated is out-of-order execution (hell, even back in the Pentium days clever programmers were optimizing for multiple-issue.) And sure, a program can probably work out the particulars of instruction ordering with less tedium than a human, but then the next generation of CPUs comes out and it's all different anyways...

Nothing magic indeed. At least nothing more magic than being able to calculate a 64000 point FFT in real time. And that's the key. Sure, the compiler only knows the "tricks" that the compiler writer puts into it. Quite likely some really good programmer for a particular processor knows more tricks than the compiler writer, but not many. But the compiler can essentially try every combination of every trick it knows to find a minimal (time or space) solution. In under a minute or two. Perhaps a good assembly language programmer could do that, too. But only for one non-trivial program in a lifetime.

When I said "modern processor" I was NOT talking about an x86. That's not a modern processor. It's a modern implementation of a 45 year old processor.
8008 -> 8080 -> 8086 -> 80286 -> 80386 -> 80486 -> Pentium ...
Cache and out of order or speculative execution and all the rest are a different ballgame entirely. They fall into the compiler writer's responsibility, but they aren't the same issue. What I meant by modern processor was (in simple terms) anything architected since the RISC revolution of the early 80's. That doesn't necessarily mean a RISC processor. But it does mean it was influenced by RISC. Two of my favorite examples, at rather opposite ends of the scale, are AVR (Atmel/Microchip) and ARM. PIC, although they latched onto the RISC name, are far from modern or RISC in the traditional sense.

So, for anyone interested but unconvinced, I have two recommendations. First, pick a modern processor that was architected since about 1985 and write a non-trivial, real application for it: twice. Once in the language of your choice that has a really good compiler (GCC usually suffices) and once in assembly language. See if you can beat the compiler when full optimization is on. Take a look at the output assembly code. You will likely wonder where your program went. The compiler will probably do all sorts of things you never dreamed of and don't even recognize as your program. Embedded developers often turn the optimization down so they can debug because otherwise variables, blocks of code, and even entire functions disappear. Make sure you use the same types of data structures and algorithms in both programs or you are comparing apples to potatoes. Give it a try!

Second, pick up a good, modern compiler writing textbook and look through what types of things the compiler does to optimize. And keep in mind that will be an introductory course; professional compiler writers / companies will have lots of trade secrets they don't share. This one (https://www.amazon.com/Modern-Compiler- ... 052182060X) is good but may be a little dated. The state of the art continues to progress at a fantastic rate.

To sum up, I firmly believe that a good compiler will beat almost every human for anything more than a small number of lines of code. There will always be some example the compiler can't do. But those things will be small, special purpose pieces, not entire programs.

"That's all I got to say 'bout that." -- Forrest Gump

Druzyek · Post by **Druzyek** » Thu Oct 12, 2017 2:09 am

Alarm Siren wrote:

A minimalistic, C-like language specifically geared towards the 6502 and similar architectures....
Seems like it'd be a large project, but fun for those so inclined and useful to the rest.

This sounds awesome! I would not be bothered at all if it did not meet the C standard. I was looking at the standard keywords and wondering what parts you could implement with minimal loss of speed or size.

Another idea would be to add some keywords to assembly language. I wonder how many of the compiler optimizations you could implement if you had a way to show the assembler which variables are volatile, which labels are functions vs. plain labels, what the scope of certain variables is, etc.

barrym95838 · Post by **barrym95838** » Thu Oct 12, 2017 2:34 am

https://github.com/dschmenk/PLASMA/

I haven't tried it yet, but it looks like the talented Mr. Schmenk is on to something nice here.

Mike B.

White Flame · Post by **White Flame** » Thu Oct 12, 2017 5:30 am

PLASMA is a VM, so if the point is to get high speed, which is usually why C is chosen over other HLLs, I don't think that's the direction to take. VMs are great for code density, though.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Thu Oct 12, 2017 5:58 am

Klaus2m5 wrote:

Or to put it more simply: garbage in = garbage out holds true even for the most sophisticated compiler.

That has been true since the dawn of computing. The only thing that has changed is the garbage comes out so much faster with today's hardware.
