A Hypothetical C Friendly 6502
Re: A Hypothetical C Friendly 6502
Snial wrote:
With the y reg optimisation...about 27% faster.
It's in figuring out what the best optimization is when various functions are glued together where compilers really shine, of course. Sure, a good assembly language programmer will be able to do this fairly quickly and sometimes better than a compiler for any individual circumstance, but doing it everywhere, and redoing it when the code changes, is a lot of work. Tweaking the architecture to make it easier for a compiler to do its thing this way thus makes a lot of sense. (Though I'm not saying here it's not possible that keeping the zero page to treat as a big bank of pseudo-registers might not work as well or better.)
Last edited by cjs on Fri Dec 04, 2020 8:12 am, edited 1 time in total.
Curt J. Sampson - github.com/0cjs
Re: A Hypothetical C Friendly 6502
cjs wrote:
Snial wrote:
With the y reg optimisation...about 27% faster.
with every call
http://oneweekwonder.blogspot.com/2014/ ... ities.html
Quote:
It's in figuring out what the best optimization is when various functions are glued together where compilers really shine, of course.
Quote:
(Though I'm not saying here it's not possible that keeping the zero page to treat as a big bank of pseudo-registers might not work as well or better.)
e.g. <OpCode:4><DestReg:4><Src1Reg:4><Src2Reg:4>
Results in a maximum processing bandwidth of (3*RegisterWidth)/16 bits per instruction, whereas an instruction format of the form:
<OpCode:8><Operand:8>
Results in a maximum processing bandwidth of RegisterWidth/16 bits per instruction, up to 3 x less efficient, assuming that the 16-bits for both instructions are read in the same period.
The trade-off is that the xxxReg:4 fields capture a smaller proportion of operations than an Operand:8 field, so xxxReg:4 fields require more spilling and filling, but because there are diminishing returns on the proportion of operations captured as the operand field increases, there's a point at which the multiple smaller fields are more beneficial than a larger operand field. That trade-off comes at about 2 to 3 bits. So, 2x or 3x 4 bit operand fields are more efficient than a 1x8-bit operand field. Incidentally, this means that a variant of the 6502 which replaces the direct page with 2x 4-bit register fields ought to be marginally more efficient: <OpCode:8><DstReg:4><SrcReg:4> (2*8/16 bits per instruction of processing).
At the same time, the <OpCode:8><Operand:8> is still 33% more efficient than <OpCode:8><Operand:16> and thus zero-page (or rather single-byte) operands provide a real performance advantage over architectures that don't support them.
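The arithmetic behind these comparisons can be checked with a short Python sketch. It assumes 8-bit registers (which is what the "2*8/16" figure above implies) and uses a hypothetical `bandwidth` helper that is not from the thread; it only restates the ratios given in the post.

```python
from fractions import Fraction

REGISTER_WIDTH = 8      # assumption: 8-bit registers, as on the 6502
INSTRUCTION_BITS = 16   # both formats under comparison are 16 bits wide


def bandwidth(register_fields, register_width=REGISTER_WIDTH,
              instruction_bits=INSTRUCTION_BITS):
    """Register bits processed per instruction bit fetched (hypothetical helper)."""
    return Fraction(register_fields * register_width, instruction_bits)


three_register = bandwidth(3)  # <OpCode:4><DestReg:4><Src1Reg:4><Src2Reg:4>
one_operand = bandwidth(1)     # <OpCode:8><Operand:8>
two_register = bandwidth(2)    # <OpCode:8><DstReg:4><SrcReg:4>

# Bits fetched for a one-byte operand relative to a two-byte operand.
size_saving = 1 - Fraction(8 + 8, 8 + 16)

print(three_register / one_operand)  # 3
print(two_register)                  # 1
print(size_saving)                   # 1/3
```

Read this way, the "33%" figure is the fraction of instruction bits saved by the one-byte operand; it says nothing about how often a one-byte operand is actually sufficient.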
Nevertheless, a compiler targeted at an architecture that relies on absolute zero page memory locations will still find itself spilling and filling zero page locations, because it is simulating a single-operand, register rich architecture: the zero-page is a software managed cache for the entire set of stack frames.
This is the advantage that an architecture with support for stack frame addressing has over an absolute zero-page architecture: that kind of spilling and filling isn't needed. Instead, the stack frame pointer is merely adjusted. Operands don't need to be copied out to make space for new operands (thus saving 50% of the effort when building up a new stack frame), and operands don't need to be copied back in at the end of a function (thus saving 100% of the effort when returning to a previous stack frame).
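A rough way to see where the 50% and 100% figures come from is to count byte copies per call. The sketch below is only a cost model under simplifying assumptions (every frame is the same size, and every byte of the caller's frame must be spilled); the function names are hypothetical.

```python
def zero_page_cost(frame_bytes):
    """Byte copies when all frames share fixed zero-page locations."""
    spill = frame_bytes    # copy the caller's operands out of zero page
    build = frame_bytes    # write the callee's operands into zero page
    refill = frame_bytes   # copy the caller's operands back on return
    return {"call": spill + build, "return": refill}


def frame_pointer_cost(frame_bytes):
    """Byte copies when a frame pointer is adjusted instead."""
    # Only the callee's operands are written; nothing is copied on return.
    return {"call": frame_bytes, "return": 0}


def saving(old, new):
    """Fraction of the old cost that the new scheme avoids."""
    return (old - new) / old


zp = zero_page_cost(8)
fp = frame_pointer_cost(8)
call_saving = saving(zp["call"], fp["call"])        # half the call-time copies
return_saving = saving(zp["return"], fp["return"])  # all the return-time copies
```

A real compiler would spill only the live locations, so the actual saving depends on how much of the zero page each function uses.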
Both the 65T2 and 65C816 support that kind of stack frame (using ADS #n or by changing the direct page register), which is a significant performance improvement for a compiler - and it also makes compiler writing easier. A human assembly code writer can mitigate this because they can perform a global analysis on how to map stack frames to zero-page, in much the same way a 'C' compiler does for 8-bit PIC architectures.
Re: A Hypothetical C Friendly 6502
cjs wrote:
Perhaps it's just me, but this is feeling really not at all like a 6502 any more. I know there's no bright line between what does and what doesn't have "6502-nature," but (to my mind, anyway) moving 32- and 64-bit arithmetic into the CPU and switching to memory-to-memory operations seems to be well on the far side of it, and also my guess is that such a CPU would be many times the cost to build of the original 6502.
Quote:
What you've described above is also clearly a separate topic from the 65T2, since it (as far as I can tell) has no hope of coming anywhere near complying with the "approximately same transistor budget as the original MOS 6502" constraint, but I would be interested in hearing any ideas you have that would comply with that constraint while improving things for you as a compiler writer.
Code: Select all
ldy #argumentlength ;; set y to the length of each operand
loop:
lda zp1, x ;; load a byte from the first operand
mathop zp2, x ;; do the math operation between the first and second operand; this is add, subtract, compare, or, and, etc.
sta zp3, x ;; store the result in the third operand
inx ;; go for next byte
dey ;; y decreases for each byte of length of each operand
bne loop ;; repeat until y equals zero, at which point we have processed each byte in the operands
As usual, Woz was completely right, way back in 1977, when he wrote the following: "While writing Apple BASIC for a 6502 microprocessor I repeatedly encountered a variant of Murphy's Law. Briefly stated, any routine operating on 16 bit data will require at least twice the code that it should. Programs making extensive use of 16 bit pointers (such as compilers, editors and assemblers) are included in this category. In my case, even the addition of a few double byte instructions to the 6502 would have only slightly alleviated the problem. What I really needed was a hybrid of the MOS Technology 6502 and RCA 1800 architectures, a powerful 8 bit data handler complemented by an easy to use processor with an abundance of 16 bit registers and excellent pointer capability." Personally I feel the 65T2 solution acknowledges the first of Woz's problems, but it ignores the second. This is not a deal breaker by any means; I just feel that perhaps the concept is focused on optimizing the wrong thing.
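For readers who want to trace the byte-at-a-time loop in the listing above, here is a small Python model of it for the addition case. It assumes little-endian operands, X starting at zero, and a cleared carry on entry; the listing does not show that setup, so those are assumptions, and `multibyte_add` is a hypothetical name.

```python
def multibyte_add(a, b):
    """Add two equal-length little-endian byte lists one byte at a time."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    result = []
    carry = 0                          # assumed CLC before the loop
    for x in range(len(a)):            # X walks the operands, Y counts bytes left
        total = a[x] + b[x] + carry    # lda zp1,x then ADC zp2,x as the mathop
        result.append(total & 0xFF)    # sta zp3,x
        carry = total >> 8             # carry flag feeds the next byte
    return result, carry


# 0x00FF + 0x0001 as 16-bit operands
print(multibyte_add([0xFF, 0x00], [0x01, 0x00]))  # ([0, 1], 0)
```

The same loop handles 16-, 32- or 64-bit operands by changing only the length, which is the property the listing is meant to illustrate.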
Historically, the processor designer who took Woz's words most to heart was Sophie Wilson, a key designer of the ARM architecture. (Sophie was at the time known as Roger Wilson.) https://en.wikipedia.org/wiki/Sophie_Wilson . Sophie provided one of the first CPU cores with a big pile of registers. Wilson was exceptionally aware of the 6502's design; she spent a great deal of time at WDC before helping to design the ARM. The ARM1 ran to about 25,000 transistors. As a side note, it might be an interesting exercise to see how many ideas from the original 1985 design might be backported trivially to a 65xx. http://www.righto.com/2015/12/reverse-e ... or-of.html
Lastly, realize that this whole discussion sidesteps a critical question, which is exactly what constitutes efficiency or "friendliness." This conversation seems to assume that the principal kind of efficiency to be sought is in the number of cycles saved. In practice, for any medium-sized program, the 6502's 16-bit memory limit becomes incredibly important incredibly quickly. Especially given the 6502's limited memory, a compiler author needs to consider the speed optimization case as well as the size optimization case, when considering whether a feature is language-friendly or not.
Last edited by johnwbyrd on Sun Dec 06, 2020 10:45 pm, edited 1 time in total.
- BigDumbDinosaur
- Posts: 9425
- Joined: 28 May 2009
- Location: Midwestern USA (JB Pritzker’s dystopia)
- Contact:
Re: A Hypothetical C Friendly 6502
johnwbyrd wrote:
In practice, for any medium-sized program, the 6502's 16-bit memory limit becomes incredibly important incredibly quickly.
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: A Hypothetical C Friendly 6502
I think once again BDD you've missed the purpose of the thread: this is your third time chipping in with the 816. Surely, after the first time, everyone following the thread will have got the point. The thread continues because it isn't about choosing an existing design, it's about making a different one. It's completely fine if you are not interested in such a question. It's not quite so fine for you to keep on as if no-one else has seen the light.
- GARTHWILSON
- Forum Moderator
- Posts: 8773
- Joined: 30 Aug 2002
- Location: Southern California
- Contact:
Re: A Hypothetical C Friendly 6502
Although anyone is free to explore and implement their own ideas, I think BDD's point is that it's always good to first consider what's already available and well thought out and works well; and the '816 fulfills pretty much all the desires mentioned above, and a lot more (except it definitely does not fit the low-transistor-count budget, but that shouldn't matter since it's already available). So for a hypothetical C-friendly '02, rather than starting with the '02 and modifying it, why not start with the '816 and see if there would still be any benefit to modifying its design? I can think of a few things to put on my wish list; but again, it already pretty much fulfills all the desires mentioned above, as-is. There are so many things about it which are commonly misunderstood though, causing people to discount it.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: A Hypothetical C Friendly 6502
GARTHWILSON wrote:
Although anyone is free to explore and implement their own ideas, I think BDD's point is that it's always good to first consider what's already available and well thought out and works well..
Yes, but we heard the '816 recommendation the first three times BDD posted it. I've already added addressing support for it into llvm-mos. https://github.com/johnwbyrd/llvm-mos/w ... r-assembly
The 65816 is not a C-friendly 6502, for the following important reasons:
1. The 65816 still has the 6502's problems with paucity of registers. Yet it uses about as many transistors as the ARM1, a contemporary of the 65816. The ARM1 has 25 physical 32-bit registers. Now that actually is a C-friendly design. Practically all modern compilers assume that the target is register-rich. The only modern compilers that do not make this assumption are either compilers for virtual machines, or toys.
2. The 65816 uses a bank-switched memory model, which C compilers have to take a lot of trouble to dance around. There is no support in ANSI C for pointers of different sizes, nor has there ever been. All the near/far pointer business in the 1980s and 1990s was an attempt to work around these limitations; but none of that was ever standardized across compilers. The ARM1, however, has a flat 32-bit memory model, which is much easier to write a compiler for. All ARM1 pointers expand to 32 bits, regardless of how they are encoded in any instruction.
3. The 65816 likes 24-bit pointers, which no modern compiler supports natively... All modern compilers assume that pointers are a power of two in length.
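To make points 2 and 3 concrete, here is a hedged Python sketch of the difference between 16-bit ("near") and 24-bit ("far") pointer arithmetic on a banked address space. It is a simplified model of bank:offset addressing, not a description of how any particular 65816 compiler lays out pointers, and all function names are hypothetical.

```python
def far_address(bank, offset):
    """Combine an 8-bit bank and a 16-bit offset into a 24-bit address."""
    return ((bank & 0xFF) << 16) | (offset & 0xFFFF)


def near_increment(bank, offset, step):
    """16-bit pointer arithmetic: the offset wraps and the bank is untouched."""
    return bank, (offset + step) & 0xFFFF


def far_increment(bank, offset, step):
    """24-bit pointer arithmetic: a carry out of the offset bumps the bank."""
    linear = (far_address(bank, offset) + step) & 0xFFFFFF
    return linear >> 16, linear & 0xFFFF


print(hex(far_address(0x12, 0x3456)))  # 0x123456
print(near_increment(0x01, 0xFFFF, 1))  # (1, 0): wrapped inside bank 1
print(far_increment(0x01, 0xFFFF, 1))   # (2, 0): crossed into bank 2
```

A compiler has to decide which of the two behaviours a given C pointer gets, and a 3-byte far pointer is not a power-of-two size, which is the awkwardness described in the list above.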
In short, the 65816 design sacrificed too much in the name of backwards compatibility, when bus sizes were growing quickly beyond 16 and 24 bits.
Please don't lecture us any more about how C-friendly the 65816 is, without considering these points in detail.
Last edited by johnwbyrd on Mon Dec 14, 2020 9:23 pm, edited 13 times in total.
Re: A Hypothetical C Friendly 6502
GARTHWILSON wrote:
...and the '816 fulfills pretty much all the desires mentioned above, and a lot more (except it definitely does not fit the low-transistor-count budget, but that shouldn't matter since it's already available).
Quote:
So for a hypothetical C-friendly '02, rather than starting with the '02 and modifying it, why not start with the '816 and see if there would still be any benefit to modifying its design.
Curt J. Sampson - github.com/0cjs
- GARTHWILSON
Re: A Hypothetical C Friendly 6502
johnwbyrd wrote:
GARTHWILSON wrote:
Although anyone is free to explore and implement their own ideas, I think BDD's point is that it's always good to first consider what's already available and well thought out and works well.
I suppose the reason BDD posted repeatedly is that there were no real responses. If part of the recommendation seems invalid, then it would be good to respond as to why, to get some more discussion, and everyone learns, regardless of the outcome. load81 is the original poster, and I had to review where he started. I find that his goals are easily met in the '816. Even the transistor-count goal was not his. Interestingly, load81 has not posted again in this topic.
Quote:
The 65816 is not a C-friendly 6502. It still has the 6502's problems with paucity of registers; yet it uses about as many transistors as an ARM1, which actually is a C-friendly design.
Quote:
The 65816 uses a bank-switched memory model, which C compilers have to take a lot of trouble to dance around.
Quote:
Quote:
Could you say which aspects of the 65816 give you that advantage?
First for 6502:
Code: Select all
LDA (0,X)
PHA
INC 0,X
BNE fet1
INC 1,X
fet1: LDA (0,X)
JMP PUT
; and elsewhere, PUT which is used in so many places is:
PUT: STA 1,X
PLA
STA 0,X
Code: Select all
LDA (0,X)
STA 0,X ; For the '816, PUT is only one 2-byte instruction anyway, so there's no sense in jumping to it.
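As a reading aid, the two listings above appear to replace the 16-bit address held in the cell at 0,X with the 16-bit value stored at that address. The Python sketch below models that behaviour under this interpretation; `memory` is a plain 64 KiB bytearray and both function names are hypothetical.

```python
def fetch_8bit_steps(memory, address):
    """Model of the 6502 listing: two byte loads with a 16-bit increment between."""
    low = memory[address]                # LDA (0,X) / PHA
    address = (address + 1) & 0xFFFF     # INC 0,X / BNE fet1 / INC 1,X
    high = memory[address]               # fet1: LDA (0,X)
    return (high << 8) | low             # PUT: STA 1,X / PLA / STA 0,X


def fetch_16bit_step(memory, address):
    """Model of the '816 listing: one 16-bit load, one 16-bit store."""
    return memory[address] | (memory[(address + 1) & 0xFFFF] << 8)


memory = bytearray(0x10000)
memory[0x12FF] = 0x34   # low byte sits at the end of a page
memory[0x1300] = 0x12   # high byte sits at the start of the next page
print(hex(fetch_8bit_steps(memory, 0x12FF)))  # 0x1234
```

Both functions return the same value; the difference being illustrated is the number of instructions, not the result.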
Quote:
Further, the 65816 likes 24-bit pointers, which no modern compiler supports natively.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: A Hypothetical C Friendly 6502
GARTHWILSON wrote:
I suppose the reason BDD posted repeatedly is that there were no real responses. If part of the recommendation seems invalid, then it would be good to respond as to why, to get some more discussion, and everyone learns, regardless of the outcome.
- In Snial's first post he mentions the 65816 approach and explains why he chose a different one: "The 65816 achieves it in part by adding another 16 or so additional addressing modes....The problem here though is that there's a chronic lack of instruction space for the extra addressing modes...."
- I replied right after BDD's first post on the '816 suggesting that it fails dramatically on one of Snial's criteria (a criterion which is not inconsistent with the lead post). You replied immediately after that confirming that it totally blew off that criterion (10k transistors vs. 3.5k).
- BDD replied to my post basically saying that from his perspective the transistor criterion was a secondary concern these days, which is clearly a personal opinion about his specific goals and doesn't address the goals of others. (Fair enough, but he's kind of coming across as if people are wrong in having other goals.) He also said that, "comparisons to 6502-family parts that are in current production and are in widespread use, both in commerce and hobby designs, are 100 percent appropriate," which is true, but Snial's post did explicitly make such a comparison. BDD's further claim that, "the 65C816 is a good fit to a C compiler" seems disputed at best; both Snial and johnwbyrd (who is actually building a modern C compiler for the 6502) disagree to at least some extent.
- In post 45, Snial said explicitly, "I'm aware of the 65C816 and it's very extensive addressing modes, including stack addressing modes and its DP register," noted that he has future plans to build an '816 SBC, and explained again why he's not following that route in this particular discussion.
- In yet another post I explain again that, "if you're going to go with "what could the original MOS 6502 have done differently without adding features that massively increase the cost," adding half a dozen registers is probably out." (The '816 adds the equivalent of eight 8-bit registers to the 6502 architecture.)
- Before BDD's third mention of the '816, Snial yet again here discusses both a clear difference between his approach and the '816 approach (dropping direct page addressing to focus on better stack-relative addressing) and mentions the '816 explicitly.
- After this BDD makes yet another post apparently suggesting that everybody who wants to think about any design other than the '816 is going terribly wrong; the entire contents of this post are, "That's why we have the 65C816."
If some people here are starting to feel like they're being told that the 65816 is the perfect solution to every problem, no other solutions should ever be considered, and any problems not solved by the '816 are not problems, well, you can see why.
Curt J. Sampson - github.com/0cjs
- GARTHWILSON
Re: A Hypothetical C Friendly 6502
I thought I had reviewed the entire topic before writing my last post; but it's hard to keep all the points in so many pages in mind. Everyone is welcome to post; but although I am in favor of keeping it all together as long as it's one subject, the fact remains that this is load81's topic, not snial's. I think snial is misunderstanding part of the '816, which is ok, but it's why we need to talk about it more. The transistor count is not one of load81's criteria; in fact, since some here are implying an emphasis on the cost of transistors back in 1975, I will point out that there is nothing in load81's post that suggests going back in time and doing the original 6502 design differently. IOW, transistor count is insignificant to load81's post. The fact that he says he would like it to be reasonably backwards compatible confirms that he's not talking about going back in time. He suggests four points. They are all met in the '816; so why does it keep getting discounted?
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: A Hypothetical C Friendly 6502
GARTHWILSON wrote:
I think snial is misunderstanding part of the '816, which is ok, but it's why we need to talk about it more.
Quote:
...I will point out that there is nothing in load81's post that suggests going back in time and doing the original 6502 design differently.
I agree there's a lot of room for interpretation as to what he means by "6502," and given how vague the starting goals and criteria were, I have no problem with discussion of the 65816 as a solution to one particular instance of the problem. But it seems to me that any sort of hardline "If it's different from the 65816 we shouldn't discuss it in this thread," or anything even approaching that, is unwarranted.
Quote:
IOW, transistor count is insignificant to load81's post.
(If all criteria are completely open, the actual answer to this question is, "Redesign the 6502 to be more or less a PDP-11." That's why individuals making proposals are adding their own criteria to maintain whatever they think of as "6502-nature." Some people think it's still a 6502 even with a 16-bit accumulator, and some think it isn't.)
Curt J. Sampson - github.com/0cjs
- BigDumbDinosaur
cjs wrote:
If some people here are starting to feel like they're being told that the 65816 is the perfect solution to every problem, no other solutions should ever be considered, and any problems not solved by the '816 are not problems, well, you can see why.
I don't think anyone is promoting the idea that the 816 is the perfect solution to anything. There is no such processor, so the question becomes one of how you solve your computing problems with the imperfect processors that are available. It so happens that within the 6502 family, the 65C816 is the MPU that is best-equipped to be a C compiler target. Ergo, of all the members of the 6502 family, the 816 is closest to being "perfect" for anyone who wishes to use 6502 hardware with a C compiler.
In passing, let's not forget this is a 6502 forum. I and some others are here because of the forum's normally narrow focus on 6502 hardware. The primary value of the forum is in passing around knowledge of how to build, modify, repair and program 6502 hardware. It goes without saying that there are bound to be "what if the 6502 had..." discussions—which I usually read end-to-end. I have my opinions and you have yours. My opinion is any design that substantially changes the architecture of the 6502 to where core features are crippled, eliminated or modified in a way that breaks compatibility with the genuine article falls outside of the 6502 realm.
In passing, I am reminded of something I learned years ago when I was in the military. The problems at hand are often solvable with what you have right at hand. In this case, what you have right at hand is the 65C816. Consider it first before you start conjuring a Frankenstein processor.
x86? We ain't got no x86. We don't NEED no stinking x86!
Re: A Hypothetical C Friendly 6502
BDD, you are addressing the wrong problem, in this thread. This thread is about designing CPUs, not about choosing them.
You'd do really well if you could try to respond to threads, not to posts. Each post has a context: each poster has a range of interests, each topic sets out some idea to be explored.
This is a 6502 forum, and Mike has very helpfully stepped in once or twice and clarified that this does include 6502-adjacent CPU designs, including new designs. It's abundantly clear what your interests are, and many people surely follow your POC threads with great interest. It's not clear that you've understood that other people's interests may differ from yours. Mike takes a much broader view than you do as to what's on topic here, and for that I am enormously grateful. There's room for breadth here: we have many subforums and any of us can start a thread at any time.
If your interests happen not to include the topic of some thread, that's a thread to leave alone - and that goes for all of us, of course.
Re: A Hypothetical C Friendly 6502
BigDumbDinosaur wrote:
....so the question becomes one of how do you solve your computing problems with the imperfect processors that are available.
Quote:
...let's not forget this is a 6502 forum. I and some others are here because of the forum's normally narrow focus on 6502 hardware.
I've also been learning a lot about the 6502 itself from discussion of this 65T2 idea. It would be somewhat ironic if I learned less about the 6502 because you wanted to keep discussions limited to your idea of what constitutes appropriate 6502 discussion.
Curt J. Sampson - github.com/0cjs