I apologise: I am very sensitive to anything that looks like a demotivating comment. (It might be that if one asked 'is anyone interested' instead of 'what do people think', one would get different responses.)
About assembly programming: there are those who rather enjoy programming ARM. It has a familiarity if you come from the land of 6502. And we found the same with our simple regular OPC processors: it was even fairly easy to transliterate from 6502, and then optimise. See for example this post:
https://anycpu.org/forum/viewtopic.php?p=2654#p2654
YA326502 - Yet Another 32-bit 6502 (but a bit different!)
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
I think ARM is a rather nice CPU to hand-code assembly for. However the systems it ends up in are usually not as amenable to homebrewing device drivers as the 65xx family. Maybe if someone made a hardware clone of the Archimedes…
- BigDumbDinosaur
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
rpiguy2 wrote:
As for why one would take the 6502 all the way to 32 bits rather than use ARM, AVR32 or some other established architecture, I think the only answer, which is admittedly very subjective, is that a lot of people simply don't like programming in assembly on these platforms as much as they enjoy programming assembly on simpler processors. Granted, if you are going for a 32-bit microcontroller, chances are you won't be working in assembly anyway, so the target audience would be very, very narrow indeed.
Had Bill Mensch followed through with the 65C832, we would have had an MPU that would have been even less like a 6502 than the 65C816, but still hobbled by an 8-bit data bus and the same addressing issues as the '816. It could be Mensch had a flash of insight telling him a 32-bit rendition of the 6502 would be straying too far from the basic premise of the 6502, which is its no-nonsense instruction set and straightforward bus design. Also worth considering is that when Motorola started on the design of the 68000 they essentially worked from a clean sheet of paper, rather than try to produce a 16- or 32-bit version of the 6800. Evidently their engineers had a similar flash of insight and realized the 68000 was going to be a whole new MPU, not a 6800 with some extra registers bolted on.
All that, plus the likelihood of a 65C832 being a poor seller (it would have been competing with the likes of the more technically advanced ARM), was sufficient to dissuade Mensch from pursuing it.
————————————————————————
Edit: Misspelled "Hyde."
Last edited by BigDumbDinosaur on Sat Apr 04, 2020 8:23 am, edited 1 time in total.
x86? We ain't got no x86. We don't NEED no stinking x86!
- GARTHWILSON
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
The head post alluded to different bus widths, meaning the OP was not saying it had to stay with an 8-bit data bus. Myself, I'd go for a 32-bit, non-multiplexed data bus, and totally ditch any '02 and '816 emulation modes. All registers would be 32-bit, including DP, DBR, PBR, and of course the stack pointer, all of which would just become offsets that still allowed access to the entire 4-gigaword address space. The exception would be the status register which might do fine with 16 bits. We've been through that before though, in the topic that turned into the 65Org32 topic, at viewtopic.php?f=1&t=1419 .
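As a minimal sketch of the "registers become offsets" idea: assume a design in which the direct-page register and the operand are both full 32-bit quantities and the sum simply wraps modulo 2^32. The function name and the wraparound behaviour are assumptions made for illustration, not something taken from the 65Org32 topic.

```c
#include <stdint.h>

/* Hypothetical direct-page addressing with a 32-bit DP register:
 * the operand is an offset from DP and the sum wraps at 2^32,
 * so any DP value can still reach the whole address space. */
static uint32_t dp_effective_address(uint32_t dp, uint32_t operand)
{
    /* Unsigned arithmetic wraps; the cast keeps the result at 32 bits. */
    return (uint32_t)(dp + operand);
}
```

The same shape would apply to DBR, PBR and the stack pointer under this assumption: each is just a 32-bit base added to the instruction's operand.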
As I recall, there was a good possibility that Apple (or was it a different company?) would want the 65832, so it was designed; but Mensch wasn't going to put it into production without a large order and some promise that there'd be a market for it.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
GARTHWILSON wrote:
The head post alluded to different bus widths, meaning the OP was not saying it had to stay with an 8-bit data bus. Myself, I'd go for a 32-bit, non-multiplexed data bus, and totally ditch any '02 and '816 emulation modes. All registers would be 32-bit, including DP, DBR, PBR, and of course the stack pointer, all of which would just become offsets that still allowed access to the entire 4-gigaword address space. The exception would be the status register which might do fine with 16 bits. We've been through that before though, in the topic that turned into the 65Org32 topic, at viewtopic.php?f=1&t=1419 .
As I recall, there was a good possibility that Apple (or was it a different company?) would want the 65832, so it was designed; but Mensch wasn't going to put it into production without a large order and some promise that there'd be a market for it.
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
BigDumbDinosaur wrote:
Consider that the 65C816 in native mode is less a 6502 than one might think. In particular, the data bus is a bit of a kludge, since it is used to emit bits 16-23 of the address during Ø2 low, followed by data during Ø2 high. Also, a 16-bit load or store involves two address bus cycles paired with two data bus cycles, which can create timing "gotchas" a system built around the 6502 would never encounter.
Quote:
Furthermore, programming the '816 in native mode requires a different way of thinking due to the uniqueness of new instructions and their effects on the registers, differing behaviors with 16-bit registers vs. 8-bit registers, different hardware vectors, the concept of a movable zero page, etc. It's almost a Jekyll and Hyde dichotomy when you think about it.
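The bus multiplexing described in the quote above can be modelled roughly in C as follows. This is only a sketch of the external glue logic's job: the type and function names are invented for illustration, and real hardware does this with a latch gated by Ø2 rather than with function calls.

```c
#include <stdint.h>

/* Hypothetical model of the external bank-address latch on a 65C816 board. */
typedef struct {
    uint8_t bank_latch;   /* holds address bits 16-23 */
} bus_glue_t;

/* While phi2 is low, the data bus carries address bits 16-23: capture them. */
static void phi2_low(bus_glue_t *g, uint8_t data_bus)
{
    g->bank_latch = data_bus;
}

/* While phi2 is high, the data bus carries data; the full 24-bit address is
 * the latched bank combined with the 16-bit address bus. */
static uint32_t phi2_high_address(const bus_glue_t *g, uint16_t addr_bus)
{
    return ((uint32_t)g->bank_latch << 16) | addr_bus;
}
```

A plain 6502 system needs no equivalent of `phi2_low`, which is the extra timing dependency the quote refers to.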
The biggest problem with 8-bit processors in general, the lack of address space, unfortunately is not amenable to any sort of easy solution; extending the address space seems inevitably to involve serious compromise on both the pinout and the software side. So yeah, the 68000 seems to have been the easiest way to go there, though I think even there they added other bits of complexity that could have been dropped to keep life simpler, make interfacing easier, and not slow down certain operations (particularly interrupts).
But this is why I like these discussions, and don't mind them coming back regularly. Tossing around the various ideas helps give me a sense of the tradeoffs and also helps me realize what's great about the 6502.
A thread over on retrocomputingforum.com inspired me to think a bit more about this, too, especially what one might do today to try to bring back some of the essential simplicity of the 6502 while handling modern program and data set sizes. In a series of posts on that thread I propose basically moving to a much, much larger word size (36-64 bits) with (compared to microcomputers through the '90s) quite massive memory sizes, but trying to keep the rest more or less as simple as the 8-bit world. That moves the hardware a ways away from the ease of building an 8-bit SBC from typical '80s parts, but could preserve a lot of the simplicity of the software side.
Curt J. Sampson - github.com/0cjs
Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!
I've been working on a 6502 backend for LLVM, and I'm familiar with gcc as well, so I might be able to speak from a compiler perspective.
In very important senses, both the ARM and the AVR instruction set architectures ARE 32-bit 6502s. ARM was heavily inspired by the 6502 architecture. The flags are extremely similar from 6502 to ARM.
Both gcc and LLVM assume broadly, but not absolutely, that they are targeting processors that have multiple registers within a register type. For example, ARM has 16 32-bit registers, all of which could be addressed equivalently in any instruction that accesses a register.
This does not mean that these compilers cannot target 6502 style registers; it just means that additional effort is required per instruction to target them.
Any compiler or OS maker will probably treat your 8-bit addressable pages as virtual register banks, in the ARM style, where one or more banks would be user mode and another bank would be privileged mode. The designers of the 6502 were well aware of the limited number of registers in their designs, and so they provided plenty of ways to do 8-bit addressing, both indirectly and directly.
Your stacks are too small for most embedded work. 4KB stacks were the absolute minimum by the early 1990s. If you have a sufficient amount of 8-bit addressable memory generally, there's no convincing reason to make your stacks addressable in a byte. Link-time optimization takes away stack pushes and pops, if you have sufficient register space. Stack space is one of those things that each project will have its own needs, so I suggest that you don't burn stack locations into hardware.
I know of zero embedded applications that are gated by the speed of string handling. You are optimizing the wrong thing. Far more useful would be a built in DMA engine to handle fast memory copies.
God protect all programmers from "interesting" hardware. We LOVE boring. We LOVE industry standard behavior.
Keep in mind that all C compiler makers will treat whatever 8-bit addressable pages you give us as (virtual) registers. I suggest you think more carefully through the cases where we do. For example, it would be extremely useful to have an instruction that takes a four-byte value at an 8-bit location, treats it as a 32-bit pointer, and loads the contents of that 32-bit location into 8-bit memory. Please think through all the cases of 8-bit, 16-bit, and 32-bit indirect loads and stores, with your 8-bit addressable page as a virtual register file. You might even alias A, X, Y, SR, SP and PC to the first and/or last few bytes of those 8-bit addressable pages.
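Here is a rough C model of the kind of instruction I mean. Everything in it is an assumption for illustration: the names `zp`, `mem` and `ld32_indirect`, the little-endian byte order, and the toy 64 KiB memory standing in for a 32-bit address space.

```c
#include <stdint.h>
#include <stddef.h>

#define MEM_SIZE 65536u          /* toy stand-in for the full 32-bit space */

static uint8_t zp[256];          /* 8-bit addressable page = virtual register file */
static uint8_t mem[MEM_SIZE];    /* main memory (toy size, assumption) */

/* Read a little-endian 32-bit value from four consecutive bytes. */
static uint32_t read32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical "LD32 (ptr),dst": treat the four bytes at zp[ptr] as a 32-bit
 * pointer and copy the 32-bit value it points to into zp[dst..dst+3].
 * Indexes wrap within the page, and addresses wrap within the toy memory. */
static void ld32_indirect(uint8_t ptr, uint8_t dst)
{
    uint8_t tmp[4];
    for (size_t i = 0; i < 4; i++)
        tmp[i] = zp[(uint8_t)(ptr + i)];
    uint32_t addr = read32(tmp);
    for (size_t i = 0; i < 4; i++)
        zp[(uint8_t)(dst + i)] = mem[(addr + i) % MEM_SIZE];
}
```

A compiler would emit something like this for every `*p` where `p` lives in the virtual register file, so having it as one instruction rather than a four-iteration byte loop matters a great deal.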
There has never been a mass-produced CPU that did not have a C compiler. Perhaps what you are trying to create is a purely artistic expression, rather than a CPU that might be used on someone else's project.
Again, I think you are optimizing the wrong thing. As a hardware designer, you have no control over how many tasks or threads an application designer will want; so it might be better to focus on user vs. protected modes, or MMU versus no MMU, and permit software to make deeper choices.
OK, but you simply cannot run any of those modern operating systems you've listed without an MMU. I've researched this issue. All of them assume that an MMU exists. If you don't have one, we'll have to emulate it.
If you're adding features onto a 32-bit 6502 die, the single most important feature you'll want is a multiplier that operates on your 8-bit addressable memory. The longer the better. Almost all modern applications assume that 32-bit multiplies are fast. The 6502 takes forever to do even an 8-bit multiply. For bonus points, put several multipliers in parallel and have them all able to hit 8-bit addressable memory at once. Most modern embedded designs for machine learning depend on this. See in particular the dgemm() operation in BLAS, and think about how fast you can get that to be on your design.
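To make the cost concrete, here is a sketch in C of the shift-and-add loop a 6502 routine has to run in software for an 8x8 multiply, since the chip has no multiply instruction. The function name `mul8_shift_add` is made up for illustration, and real 6502 routines vary (some are table-driven instead).

```c
#include <stdint.h>

/* 8x8 -> 16-bit unsigned multiply by shift-and-add:
 * one conditional add and one shift per multiplier bit. */
static uint16_t mul8_shift_add(uint8_t a, uint8_t b)
{
    uint16_t product = 0;
    uint16_t addend = a;                 /* a shifted left by the current bit index */
    for (int bit = 0; bit < 8; bit++) {
        if (b & 1u)
            product = (uint16_t)(product + addend);
        addend = (uint16_t)(addend << 1);
        b >>= 1;
    }
    return product;
}
```

Each of the eight iterations costs several instructions on a 6502, and a 32-bit product needs four times as many iterations on wider operands. A hardware multiplier collapses the whole loop into a single operation.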
When you are designing new hardware, you are designing something for a market that already has certain expectations about how hardware should be. It is almost always the case that new hardware designs should follow software requirements solely; quirky or "opinionated" hardware often gets binned in history. Programmers like flat memory models. Operating systems like MMUs. Compilers like register banks. DSP applications like vector multipliers.
Don't start from hardware features. Think about applications, and then think about the hardware features that those applications need.
I hope some of this is helpful and inspires your designs.
Quote:
Byte addressing is very important to the embedded market and makes string handling easier.
Quote:
Why not have a single, flat 32-bit stack? First of all, this is boring.
Quote:
Having 4 “fast pages” makes implementing a C compiler much easier.
Quote:
I know folks love their registers, and their C-compilers, but it is not for me.
Quote:
Having 4 small stacks and 4 “fast pages” also makes “small multitasking” easy to implement, allowing you to run a couple of tasks concurrently without having to swap in and out of memory.
Quote:
Of course an MMU could be added later which could isolate stacks between kernel and user space programs. It could also remap the 4 stacks and "fast pages" anywhere in memory, but then they aren’t as fast anymore. I really dislike it when the 65XX starts to look too much like just another “large system” processor.