int size for C compilers for the 65816

litwr · Post by **litwr** » Sun Jan 29, 2023 10:21 am

I am curious: is there a C compiler for the 65816 that can use 24-bit data, i.e. some type for which sizeof(type) == 3? Thank you.

Alarm Siren · Post by **Alarm Siren** » Sun Jan 29, 2023 12:05 pm

Not that I am personally aware of, but then I only know about two!

Can't really see any point myself. It would be permissible under the standard for plain 'int' to be 24-bit (with 'short' at 16-bit and 'long' at 32-bit), but it would run counter to most C programmers' expectations of the type. Also, the C standard says that a plain 'int' object "has the natural size suggested by the architecture of the execution environment", which 24-bit is not on the 65816: by that criterion it should be 16-bit on this target.

size_t, however, would logically be 24-bit on the 65816 platform. That being said, I would imagine it's actually implemented as 32-bit in all practical C compilers (so that it recycles the long int datatype rather than inventing its own), with the high byte being ignored or zeroed after computations are completed.
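
To make the "32-bit value with the high byte ignored or zeroed" idea concrete, here is a minimal sketch in portable C. The type name `far_addr` and the helper names are hypothetical and are not taken from any actual 65816 compiler; the sketch only models how a 32-bit container maps onto a 24-bit address.

```c
#include <stdint.h>

/* Hypothetical: a "far" address kept in 32 bits, of which the CPU uses only 24. */
typedef uint32_t far_addr;

#define FAR_ADDR_MASK 0x00FFFFFFul

/* Zero the unused top byte, e.g. after arithmetic has carried into it. */
static far_addr far_addr_normalize(far_addr a)
{
    return a & FAR_ADDR_MASK;
}

/* Bank byte: bits 16-23 of the address. */
static uint8_t far_addr_bank(far_addr a)
{
    return (uint8_t)((a >> 16) & 0xFFu);
}

/* 16-bit offset within the bank: bits 0-15 of the address. */
static uint16_t far_addr_offset(far_addr a)
{
    return (uint16_t)(a & 0xFFFFu);
}
```

Whether a real compiler would zero the top byte eagerly or simply never look at it is a design choice this sketch does not settle.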

litwr · Post by **litwr** » Sun Jan 29, 2023 12:39 pm

IMHO C for the 65816 should have 2 pointer types: short (2 bytes) and full (3 bytes). How can I point to a value in memory above 64 KB using a 2-byte pointer? I just need a 3-byte pointer for this. And if we have such a pointer type we may convert it to int or use it directly as a 3-byte integer. IMHO it would be quite natural for the 65816 to have sizeof(long) = 3.
BTW it is a matter of competition too. A modern C compiler for the ez80 supports a 3-byte integer. So if the 65816 still lacks this support, it is rather sad.
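
For reference, standard C can already describe a 3-byte object without special compiler support, just not as an arithmetic type. A minimal sketch, assuming a hypothetical `u24` struct and helper names (none of them come from an existing 65816 toolchain). On typical compilers `sizeof(u24)` is 3, although the standard technically permits trailing padding.

```c
#include <stdint.h>

/* Hypothetical 24-bit container: three bytes, little-endian like the 65816. */
typedef struct { uint8_t b[3]; } u24;

/* Pack the low 24 bits of a 32-bit value; the top byte is dropped. */
static u24 u24_from_u32(uint32_t v)
{
    u24 r;
    r.b[0] = (uint8_t)(v & 0xFFu);
    r.b[1] = (uint8_t)((v >> 8) & 0xFFu);
    r.b[2] = (uint8_t)((v >> 16) & 0xFFu);
    return r;
}

/* Widen back to 32 bits for arithmetic. */
static uint32_t u24_to_u32(u24 v)
{
    return (uint32_t)v.b[0]
         | ((uint32_t)v.b[1] << 8)
         | ((uint32_t)v.b[2] << 16);
}
```

All arithmetic still happens in a wider type here, which is exactly the inconvenience a native 3-byte integer type would remove.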

BigDumbDinosaur · Post by **BigDumbDinosaur** » Sun Jan 29, 2023 8:30 pm

litwr wrote:

I am curious: is there a C compiler for the 65816 that can use 24-bit data, i.e. some type for which sizeof(type) == 3?

None of which I am aware. In any case, and as Alarm Siren says, what’s the point? 16 bits is the 816’s native data type, not 24.

Although the 65C816 generates 24-bit addresses (even in emulation mode), working with 24 bits is awkward. Manipulating a 24-bit pointer ultimately requires frequent use of REP and SEP to set the accumulator size according to which part of the address is being manipulated. At three cycles per REP or SEP, things can get slowed down in iterative processes, especially in multiplication and division. With the 816, addresses are most efficiently handled as 16-bit (“near”) or 32-bit (“far”) pointers, hence avoiding a lot of REPing and SEPing.

Quote:

A modern C compiler for the ez80 supports a 3-byte integer.

Not relevant, in my opinion. Any C compiler is ultimately tailored to the architecture of the targeted MPU. The eZ80 has 24-bit registers (a strange mutation, if you ask me), which gives rise to the odd sizeof(int). That is totally non-standard in the C environment. Any C program written for the eZ80 that exploits sizeof(int) == 3 would be inherently non-portable.

BTW, it’s amusing to see the warning “DO NOT USE IN LIFE SUPPORT” on the facing page of the eZ80 manual. One of the specific features of both the 65C02 and 65C816 is suitability for implanted medical devices, such as cardiac pacers.

Alarm Siren · Post by **Alarm Siren** » Sun Jan 29, 2023 11:09 pm

litwr wrote:

IMHO C for the 65816 should have 2 pointer types: short (2 bytes) and full (3 bytes).

I considered this whilst drafting my previous reply, and I concluded that it's not in the spirit of what C is supposed to accomplish.

The whole point of C as a high-level language is to be a "portable assembler", that is, it should provide the programmer almost the same power and speed as writing in assembly but, provided that the programmer strictly conforms to the C standard and doesn't rely on any non-standard platform-specific behaviours, any program they write should be inherently portable between architectures. As soon as you start introducing two different sizes of pointer to be used in different contexts, you've broken that paradigm. Instead, the programmer just uses size_t which must, by definition, be large enough to contain the full address space, so in our case at least 24-bit. If the compiler is able to optimise down to 16-bit in certain circumstances that is fine and permissible, but that is an architecture-specific detail that should not be visible to the programmer when they are programming in C.

Obviously I am aware that many programs are not actually portable between platforms/architectures, but that is down to either the libraries they must hook into for that platform (which do not exist or are different on other platforms), or the programmer intentionally choosing to rely on implementation-specific behaviour that is not sanctioned by the C standard.

litwr wrote:

How can I point to a value in memory above 64 KB using a 2-byte pointer?

As per above, if I were implementing a C compiler for the 65816 I would make size_t a 32-bit value so that it can fit the full address, but also be more compatible with other platforms and existing C programmers' expectations, and then have the uppermost 8 bits discarded when converting to a data access. However, understanding that 32-bit maths is slower, I would optimise down to 16-bit "internal" pointers whenever possible. In fact, this is something that actually occurred in the world of the early IBM/DOS PC: you had different "memory models" depending on how much memory your program needed to access, the tradeoff being that access to more space required larger pointers that took longer to process. For details, Microsoft's Raymond Chen wrote an interesting blog post on the subject, covering the headaches and legacy of these memory models. One of the key things that these memory models did was redefine size_t in a non-portable manner (it wasn't always even necessarily the same throughout the program) in order to achieve better performance.

litwr wrote:

So if the 65816 still lacks this support, it is rather sad.

I agree it is sad that no such thing exists for the 65816, but only because I would enjoy the novelty. I believe the reason it doesn't exist is because it is not practical.

Martin_H · Post by **Martin_H** » Mon Jan 30, 2023 1:45 am

Seeing a discussion of near and far pointers in C brought back unhappy memories of MS-DOS and 16 bit Windows programming in the early 90's. Large programs were constructed using overlays, and as your program grew, you would have to rework the overlay arrangement.

The whole thing was a brittle house of cards, but it paid the bills for five years.

litwr · Post by **litwr** » Mon Jan 30, 2023 10:18 am

Martin_H wrote:

Seeing a discussion of near and far pointers in C brought back unhappy memories of MS-DOS and 16 bit Windows programming in the early 90's. Large programs were constructed using overlays, and as your program grew, you would have to rework the overlay arrangement.

The whole thing was a brittle house of cards, but it paid the bills for five years.

This is a necessity for the 16-bit mode of the x86. So, IMHO, "near" (2-byte) and "far" (3/4-byte) pointers are a necessity for the 65816. IMHO the main virtue of C is portability that, for me, means that I can optimally use all hardware specific features of target platforms. The portability of C-code is, IMHO, less important. Code for controllers is usually too specific to the hardware and cannot be easily ported due to the inherent nature of such code.
Operations with 3-byte values are clumsy on the 65816 but IMHO it is better than nothing anyway. Maybe ideally we need an option for 32-bit faster pointers and 24-bit smaller pointers.
EDIT. I also doubt that 3-byte pointers are slower because the actual use of 4-byte pointers needs to trim the highest byte and it is a real overhead.

Alarm Siren · Post by **Alarm Siren** » Mon Jan 30, 2023 10:51 am

litwr wrote:

So, IMHO, "near" (2-byte) and "far" (3/4-byte) pointers are a necessity for the 65816.

I absolutely agree, but I also think that is a detail that should be left to the compiler's optimiser rather than something the programmer should be fiddling with. If the programmer is fiddling with such things then you've broken the main design goal of the C programming language: you might as well be programming in Assembly. Certainly you can argue that you don't care about the portability of the code, and that's fair, but to my mind if you are ignoring that aspect then it's not really C anymore, it's just souped-up assembly with lots of curly brackets.

litwr wrote:

Maybe ideally we need an option for 32-bit faster pointers and 24-bit smaller pointers.

That seems reasonable, could be done easily enough with a compiler flag. So long as your C code isn't relying on the specific width of size_t (which it really shouldn't be), this option wouldn't break portability of the code and gives the programmer a choice on the speed/memory footprint conundrum.
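
As an illustration of code that does not rely on the specific width of size_t, here is a small sketch. The function names are hypothetical; the point is only that limits come from `SIZE_MAX` rather than from a hard-coded 0xFFFF, 0xFFFFFF or 0xFFFFFFFF, so the same source would behave correctly whichever pointer width a compiler flag selects.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the bytes of a buffer without assuming how wide size_t is. */
static unsigned long sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned long total = 0;
    for (size_t i = 0; i < len; ++i)
        total += buf[i];
    return total;
}

/* Check that count * size is representable in size_t, using SIZE_MAX
   instead of a width-specific constant. Returns nonzero if it fits. */
static int product_fits(size_t count, size_t size)
{
    return size == 0 || count <= SIZE_MAX / size;
}
```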

litwr wrote:

I also doubt that 3-byte pointers are slower because the actual use of 4-byte pointers needs to trim the highest byte and it is a real overhead.

Perhaps I am mis-remembering, but I don't think you need to specifically trim the high byte on the 65816. I think you can just transfer A into the Bank register; because the Bank register is only 8-bits the upper half gets chopped off in the process, regardless whether you're in 16-bit or 8-bit mode.

kernelthread · Post by **kernelthread** » Mon Jan 30, 2023 11:52 am

If you use a 32 bit pointer, the processor doesn't access the top byte when dereferencing, so it doesn't matter what the value is. When doing arithmetic you can do 32 bit arithmetic (2 ADC or SBC instructions with M=0). Where it might be an issue is if you need to compare pointers - two pointers which differ only in the top byte are equal in the sense that they point to the same thing.
The problem with 3 byte pointers is you have to keep switching the accumulator between 8 and 16 bit mode, which takes 2 bytes of code and 3 clock cycles for each change. I have to say that is one of the irritating things about the 816. Other processors allow you to choose on a per-instruction basis whether they operate on 8 or 16 bits. Many (quite possibly most) of the bugs in my 816 code have been caused by the M or X flags being in the wrong state.
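
The comparison issue can be modelled in a few lines of C. This is only a sketch on plain integers: `far_ptr32` and both function names are hypothetical stand-ins for whatever representation a compiler would actually use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit far pointer; only bits 0-23 ever reach the address bus. */
typedef uint32_t far_ptr32;

/* Naive comparison: treats the ignored top byte as significant. */
static bool far_ptr_equal_naive(far_ptr32 a, far_ptr32 b)
{
    return a == b;
}

/* Address comparison: masks off the top byte before comparing. */
static bool far_ptr_equal(far_ptr32 a, far_ptr32 b)
{
    return ((a ^ b) & 0x00FFFFFFul) == 0;
}
```

With `0x01012345` and `0x00012345` the naive version reports "different" while the masked version reports "same", which is the case described above. Keeping the top byte zeroed at all times would make the naive comparison sufficient, at the cost of masking elsewhere.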

Alarm Siren · Post by **Alarm Siren** » Mon Jan 30, 2023 12:16 pm

kernelthread wrote:

If you use a 32 bit pointer, the processor doesn't access the top byte when dereferencing, so it doesn't matter what the value is. When doing arithmetic you can do 32 bit arithmetic (2 ADC or SBC instructions with M=0). Where it might be an issue is if you need to compare pointers - two pointers which differ only in the top byte are equal in the sense that they point to the same thing.
The problem with 3 byte pointers is you have to keep switching the accumulator between 8 and 16 bit mode, which takes 2 bytes of code and 3 clock cycles for each change. I have to say that is one of the irritating things about the 816. Other processors allow you to choose on a per-instruction basis whether they operate on 8 or 16 bits. Many (quite possibly most) of the bugs in my 816 code have been caused by the M or X flags being in the wrong state.

Ah yes, I had not considered comparisons. That is a fair point, though in C comparisons between pointers are only valid when the pointers are part of the same data structure, which I think mitigates that a bit (yes you can normally do such things, but officially that's undefined behaviour). In any case, how often does one really need to do direct pointer equality checks? I think masking off a byte or changing modes for that corner case is a small inefficiency versus making the much more common actions more straightforward.

litwr · Post by **litwr** » Mon Jan 30, 2023 2:17 pm

Alarm Siren wrote:

Perhaps I am mis-remembering, but I don't think you need to specifically trim the high byte on the 65816. I think you can just transfer A into the Bank register; because the Bank register is only 8-bits the upper half gets chopped off in the process, regardless whether you're in 16-bit or 8-bit mode.

For some modes it should work but for absolute long and direct page indirect long and their indexing variants it doesn't and we need the trimming for them.

Alarm Siren wrote:

Ah yes, I had not considered comparisons. That is a fair point, though in C comparisons between pointers are only valid when the pointers are part of the same data structure, which I think mitigates that a bit (yes you can normally do such things, but officially that's undefined behaviour). In any case, how often does one really need to do direct pointer equality checks? I think masking off a byte or changing modes for that corner case is a small inefficiency versus making the much more common actions more straightforward.

IMHO, you may always compare pointers converting them into integers and != and == ops should work fine.
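
A sketch of the integer-conversion approach, assuming the implementation provides the optional type `uintptr_t`. The result of converting a pointer to an integer is implementation-defined, so this is a statement about a given compiler rather than about the language. For context, ISO C does define `==` and `!=` directly between pointers to unrelated objects; it is the relational operators (`<`, `>`, `<=`, `>=`) whose behaviour is undefined in that situation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare two object pointers by their integer representations.
   Hypothetical helper; the integer values are implementation-defined. */
static bool same_address(const void *p, const void *q)
{
    return (uintptr_t)p == (uintptr_t)q;
}
```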

Alarm Siren · Post by **Alarm Siren** » Mon Jan 30, 2023 3:08 pm

litwr wrote:

For some modes it should work but for absolute long and direct page indirect long and their indexing variants it doesn't and we need the trimming for them.

Fair enough. I didn't look into it in great detail, so I'm sure you're right. But really it seems to me no solution is perfect, they've all got drawbacks.

litwr wrote:

IMHO, you may always compare pointers converting them into integers and != and == ops should work fine.

I concede that on the vast majority of platforms it will indeed work just fine, and you're welcome to hold that opinion, but it is in direct contradiction of the stated intent and actual standards for the C programming language.

They intentionally state that comparisons between pointers that aren't part of the same data structure are undefined behaviour because on some platforms there are multiple, independent memory spaces.

Some examples:

On x86 you've got the main memory space, but also the IO Port space which could be thought of as an entirely separate memory space. Trying to compare a pointer from one into the other is meaningless. There was even a feature on the original 8086 that could, in theory, have allowed a separate memory space for the Stack, too.
A lot of AVR chips effectively have three memory spaces, in that they have EEPROM, Flash and RAM. All of these memory spaces are totally separate, and require different instructions to access.

If someone wanted to make a C compiler that guarantees that pointer comparisons between unrelated data structures will return meaningful results on the 65816 architecture, they can; indeed this would be the logical behaviour since there aren't multiple memory spaces to worry about, but that is over and above what the C standard itself requires. In practice many compilers do, in fact, make this guarantee: largely because in modern times most processor architectures don't have these multiple memory spaces or, if they do, the compiler can abstract it away or hide it behind function calls.

My point is simply that this is usually an assumption on the part of the programmer that it should work this way, and it often does, but it's not actually a guarantee of the standard and by relying on it your programs become inherently fragile. Even if you're not interested in portability to different platforms, a strictly conforming C program guarantees it will behave as you expect across different compilers and versions of the same compiler; whereas if you rely on undefined or implementation-defined behaviour then you are at the mercy of your compiler vendor. Remember that "undefined behaviour" in the standard doesn't mean "varies by platform but is always predictable", it means "could literally do anything or nothing, including erasing your entire hard drive."

An interesting and practical quirk of using Undefined Behaviour is in optimisers - I haven't got the link handy, but I read an article a while ago about some code that worked perfectly when compiled without optimisation, but failed when optimised. I'm afraid I cannot remember the exact specifics, but essentially it turned out that the optimiser was making the assumption that the programmer would not intentionally rely on undefined behaviour, therefore there was no reason to check for it, so it optimised a comparison out entirely and always returned the same result.
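
One widely discussed example of this class of problem (not necessarily the article mentioned above) is an overflow check that itself depends on signed overflow. A minimal sketch with hypothetical function names:

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on undefined behaviour: signed overflow of x + 1.
   An optimiser may assume the overflow never happens and reduce
   this function to "return false". */
static bool will_overflow_ub(int x)
{
    return x + 1 < x;
}

/* Well-defined alternative: test against the limit before doing
   any arithmetic. */
static bool will_overflow(int x)
{
    return x == INT_MAX;
}
```

The first version may appear to work in an unoptimised build and then silently stop detecting anything once optimisation is enabled.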

Anyway, I'm going to stop arguing the toss on this, clearly we have different opinions and it's not a hill worth dying on. Have a good day.

BigDumbDinosaur · Post by **BigDumbDinosaur** » Mon Jan 30, 2023 8:34 pm

Alarm Siren wrote:

litwr wrote:

I also doubt that 3-byte pointers are slower because the actual use of 4-byte pointers needs to trim the highest byte and it is a real overhead.

Perhaps I am mis-remembering, but I don't think you need to specifically trim the high byte on the 65816. I think you can just transfer A into the Bank register; because the Bank register is only 8-bits the upper half gets chopped off in the process, regardless whether you're in 16-bit or 8-bit mode.

There is no direct means by which the accumulator can be copied to DB (data bank register); you have to go through the stack. If the accumulator is set to 16 bits, a push will write the .B register first, then .A. The fly in the ointment is that doing one pull with DB (PLB) would load it with what was in .A but would then leave the stack unbalanced. A dummy pull to get rid of the copy of .B would be needed to fix up the stack, but you can’t do that with another PLB, as DB would be corrupted.

Due to the clumsy means by which DB gets accessed, the register seldom gets touched, other than possibly setting it to the program’s execution bank via a PHK - PLB sequence. DB only affects absolute memory accesses in which the target address is treated as a 16-bit value, e.g., LDA $8000,X, BIT $4000, LDA ($03,S),Y or LDA (<dp>) (DB has no effect on direct page or stack accesses, and is ignored with absolute long addressing). This behavior is useful when fetches/stores are confined to a single bank; such instructions will be faster than ones specifying a full 24-bit address. However, constantly manipulating DB to access more than 64KB of memory is slow, always clobbers at least one other register and requires that register be set to 8 bits. Use of DB to access wide swaths of memory is best avoided.

A much more efficient way to access memory with the 65C816 is via indirect pointers on direct page. Such addressing can take one of two forms: “bank-aware” or “bank-agnostic.” The bank-aware form uses the familiar (<dp>), (<dp>,X) and (<dp>),Y addressing modes to generate bits 0-15 of the effective address. DB generates bits 16-23. <dp> is a 16-bit, little-endian pointer on direct page and is referred to as a “near” pointer, since access is confined to the bank in DB.

The bank-agnostic form uses the [<dp>] and [<dp>],Y addressing modes, in which <dp> is a 24-bit, little-endian, direct-page “far” pointer. These addressing modes treat memory as linear space.

As 24-bit arithmetic is awkward, I’ve found it more efficient to use 32-bit far pointers—the 816 will simply ignore <dp>+$03. Doing so allows the use of basic addition and subtraction to advance or retreat a pointer, since the pointer is comprised of two words, not a word and a byte.
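
The two-word arithmetic can be modelled in C as follows. This is a sketch, not 65816 code: the `far32` struct and function names are hypothetical, and each 16-bit step stands in for one 16-bit ADC (low word, then high word plus carry) executed without changing the accumulator width.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit far pointer stored as two 16-bit words. */
typedef struct { uint16_t lo; uint16_t hi; } far32;

/* Advance by a 16-bit step: add to the low word, then add the carry
   into the high word. Both steps are plain 16-bit operations. */
static far32 far32_add(far32 p, uint16_t step)
{
    uint32_t sum = (uint32_t)p.lo + step;
    p.lo = (uint16_t)sum;
    p.hi = (uint16_t)(p.hi + (sum >> 16));
    return p;
}

/* The address the CPU would actually use: the low 24 bits only. */
static uint32_t far32_effective(far32 p)
{
    return (((uint32_t)p.hi << 16) | p.lo) & 0x00FFFFFFul;
}
```

Note that a carry out of bank $FF lands in the ignored fourth byte, so the effective address wraps around to bank $00.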

BigDumbDinosaur · Post by **BigDumbDinosaur** » Mon Jan 30, 2023 9:17 pm

litwr wrote:

EDIT. I also doubt that 3-byte pointers are slower because the actual use of 4-byte pointers needs to trim the highest byte and it is a real overhead.

Have you done any native-mode 65C816 assembly language programming? Your comment suggests to me you have not. Otherwise, you’d know an instruction such as LDA [SPTR],Y, in which SPTR is a 32-bit pointer, would result in the 65C816 “seeing” SPTR as 24 bits, not 32. In other words, the 816 would ignore (“trim”) SPTR+$03. This characteristic, coupled with the ease with which 32-bit pointer arithmetic can be done with the 816, is why I recommend “far” pointers be 32 bits instead of 24.

Quote:

For some modes it should work but for absolute long and direct page indirect long and their indexing variants it doesn't and we need the trimming for them.

Wrong. As I said above, the 816 doesn’t care whether a long direct page pointer is 24 or 32 bits.

As for the absolute-long addressing mode, it’s actually almost of no value in writing a C compiler. Use of absolute addressing of any kind implies the compiler knows in advance where data structures will be located. That would be the case with literal string constants and variables declared as static. It won’t work with anything that uses memory in a dynamic way, such as by calling malloc() to produce space for a piece of data.

My examination of the code emitted by WDC’s C compiler revealed that string constants and static data types are stored at the end of the program text, i.e., in the .data segment, which is a common tactic with C compilers. Similarly, .bss is in the memory space that follows .data.

Such an arrangement means run-time access to static constants and internally-generated data can be accomplished with “near” addressing. I’ve also noted that in the program initialization code, the instruction sequence PHK - PLB is present. This would set DB to the execution bank, thus allowing the use of absolute, direct page indirect and stack relative indirect addressing to access .data and .bss without having to manipulate DB or use an absolute-long address.

As an aside, I’ve yet to find a use for absolute-long addressing, and I’ve written many tens of thousands of lines of 816 code.

litwr · Post by **litwr** » Tue Jan 31, 2023 7:52 am

Thanks for the interesting information. However I can't help but express my sadness that something similar to the ez80 for the 6502 world has not yet been created.

BigDumbDinosaur wrote:

Have you done any native-mode 65C816 assembly language programming? Your comment suggests to me you have not. Otherwise, you’d know an instruction such as LDA [SPTR],Y, in which SPTR is a 32-bit pointer, would result in the 65C816 “seeing” SPTR as 24 bits, not 32. In other words, the 816 would ignore (“trim”) SPTR+$03. This characteristic, coupled with the ease with which 32-bit pointer arithmetic can be done with the 816, is why I recommend “far” pointers be 32 bits instead of 24.

I have written some code for the C64 SuperCPU and Apple IIgs but I can't call myself a 65816 expert. However it seems I can catch a tiny flaw in your logic. Let's work with an example:

zpptr1  byte 1,2,3
zpptr2  byte 4,5,6

We can use these pointers, for instance in the instruction LDA [zpptr1]. I need to write exactly 3 bytes (not 4!) if I want to change the value of pointer zpptr1. I would also hardly call the normal 65816 access to a 3-byte value trimming.
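
The layout above can be modelled in C to show what is at stake. This is a sketch with hypothetical names: `dp` stands for the six consecutive direct-page bytes, with zpptr1 at offsets 0-2 and zpptr2 at offsets 3-5.

```c
#include <stdint.h>

/* Model of six consecutive direct-page bytes: zpptr1 at 0..2, zpptr2 at 3..5. */
static uint8_t dp[6] = { 1, 2, 3, 4, 5, 6 };

/* Store a pointer as exactly 3 bytes (one 16-bit plus one 8-bit store). */
static void store3(uint8_t *dst, uint32_t v)
{
    for (int i = 0; i < 3; ++i)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Store a pointer as 4 bytes (two 16-bit stores). */
static void store4(uint8_t *dst, uint32_t v)
{
    for (int i = 0; i < 4; ++i)
        dst[i] = (uint8_t)(v >> (8 * i));
}
```

Calling `store3(dp, 0x00AABBCC)` leaves `dp[3]` untouched, whereas `store4(dp, 0x00AABBCC)` overwrites the first byte of zpptr2. With pointers packed at a 3-byte stride, 4-byte stores are therefore not an option; the 32-bit scheme only works if each pointer is allotted 4 bytes.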
