YA326502 - Yet Another 32-bit 6502 (but a bit different!)

BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: YA326502 - Yet Another 32-bit 6502 (but a bit different!)

Post by BigEd »

I apologise: I am very sensitive to anything which looks like demotivating comments. (It might be that if one asked 'is anyone interested' instead of 'what do people think', one would get different responses.)

About assembly programming: there are those who rather enjoy programming ARM. It has a familiarity if you come from the land of 6502. And we found the same with our simple regular OPC processors: it was even fairly easy to transliterate from 6502, and then optimise. See for example this post:
https://anycpu.org/forum/viewtopic.php?p=2654#p2654
Chromatix
Posts: 1462
Joined: 21 May 2018

Post by Chromatix »

I think ARM is a rather nice CPU to hand-code assembly for. However, the systems it ends up in are usually not as amenable to homebrewing device drivers as the 65xx family. Maybe if someone made a hardware clone of the Archimedes…
BigDumbDinosaur
Posts: 9425
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Post by BigDumbDinosaur »

rpiguy2 wrote:
Insofar as why take 6502 all the way to 32-bits and why not use ARM or AVR32 or some other established architecture I think the only answer, which is admittedly very subjective, is that a lot of people simply don't like programming in assembly on these platforms as much as they enjoy programming assembly on simpler processors. Granted if you are going for a 32-bit micro controller chances are you won't be working in assembly anyway, so the target audience would be very, very narrow indeed.
I seem to recall at one time the point was brought up that a 32-bit 6502 wouldn't be a 6502 as we know it. Consider that the 65C816 in native mode is less a 6502 than one might think. In particular, the data bus is a bit of a kludge, since it is used to emit bits 16-23 of the address during Ø2 low, followed by data during Ø2 high. Also, a 16-bit load or store involves two address bus cycles paired with two data bus cycles, which can create timing "gotchas" a system built around the 6502 would never encounter. Furthermore, programming the '816 in native mode requires a different way of thinking due to the uniqueness of new instructions and their effects on the registers, differing behaviors with 16-bit registers vs. 8-bit registers, different hardware vectors, the concept of a movable zero page, etc. It's almost a Jekyll and Hyde dichotomy when you think about it.
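To make the multiplexing concrete, here is a toy Python model of the external latch a native-mode '816 system needs. It is only a sketch under stated assumptions: `BankLatch` and `full_address` are hypothetical names, and real hardware would use a transparent latch with proper setup and hold margins rather than a method call.

```python
class BankLatch:
    """Toy model of a latch that captures A16-A23 from the '816 data bus."""

    def __init__(self):
        self.bank = 0

    def sample(self, phi2, data_bus):
        # While Ø2 is low the data bus carries address bits 16-23,
        # so the latch follows the bus. While Ø2 is high it holds.
        if not phi2:
            self.bank = data_bus & 0xFF
        return self.bank


def full_address(bank, addr16):
    """Combine the latched bank byte with the 16-bit address bus."""
    return ((bank & 0xFF) << 16) | (addr16 & 0xFFFF)


latch = BankLatch()
latch.sample(phi2=0, data_bus=0x12)  # Ø2 low: bus carries the bank byte
latch.sample(phi2=1, data_bus=0xEA)  # Ø2 high: bus carries data, latch holds
addr = full_address(latch.bank, 0x3456)
```

A plain 6502 system needs none of this, since its data bus only ever carries data.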

Had Bill Mensch followed through with the 65C832, we would have had an MPU that would have been even less like a 6502 than the 65C816, but still hobbled by an 8-bit data bus and the same addressing issues as the '816. It could be Mensch had a flash of insight telling him a 32-bit rendition of the 6502 would be straying too far from the basic premise of the 6502, which is its no-nonsense instruction set and straightforward bus design. Also worth considering is that when Motorola started on the design of the 68000 they essentially worked from a clean sheet of paper, rather than try to produce a 16- or 32-bit version of the 6800. Evidently their engineers had a similar flash of insight and realized the 68000 was going to be a whole new MPU, not a 6800 with some extra registers bolted on.

All that, plus the likelihood of a 65C832 being a poor seller (it would have been competing with the likes of the more technically advanced ARM), was sufficient to dissuade Mensch from pursuing it.

————————————————————————
Edit: Misspelled "Hyde."
Last edited by BigDumbDinosaur on Sat Apr 04, 2020 8:23 am, edited 1 time in total.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California

Post by GARTHWILSON »

The head post alluded to different bus widths, meaning the OP was not saying it had to stay with an 8-bit data bus. Myself, I'd go for a 32-bit, non-multiplexed data bus, and totally ditch any '02 and '816 emulation modes. All registers would be 32-bit, including DP, DBR, PBR, and of course the stack pointer, all of which would just become offsets that still allowed access to the entire 4-gigaword address space. The exception would be the status register which might do fine with 16 bits. We've been through that before though, in the topic that turned into the 65Org32 topic, at viewtopic.php?f=1&t=1419 .
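One possible reading of "registers become offsets" is that every base register is simply added to the operand, modulo the 32-bit address space. The sketch below assumes that reading; `MASK32` and `effective_address` are illustrative names, not anything taken from the 65Org32 topic.

```python
MASK32 = 0xFFFFFFFF  # 32-bit address space, wraps at 2**32


def effective_address(base, operand):
    """Add a 32-bit base register (DP, DBR, PBR, SP...) to an operand."""
    return (base + operand) & MASK32


# Direct-page style access with DP anywhere in memory (assumed values).
dp = 0x00010000
ea = effective_address(dp, 0x20)
```

With full-width base registers there is no bank boundary left to cross, which is the appeal of the idea.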

As I recall, there was a good possibility that Apple (or was it a different company?) would want the 65832, so it was designed; but Mensch wasn't going to put it into production without a large order and some promise that there'd be a market for it.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
rpiguy2
Posts: 94
Joined: 06 Apr 2018

Post by rpiguy2 »

GARTHWILSON wrote:
The head post alluded to different bus widths, meaning the OP was not saying it had to stay with an 8-bit data bus. Myself, I'd go for a 32-bit, non-multiplexed data bus, and totally ditch any '02 and '816 emulation modes. All registers would be 32-bit, including DP, DBR, PBR, and of course the stack pointer, all of which would just become offsets that still allowed access to the entire 4-gigaword address space. The exception would be the status register which might do fine with 16 bits. We've been through that before though, in the topic that turned into the 65Org32 topic, at viewtopic.php?f=1&t=1419 .

As I recall, there was a good possibility that Apple (or was it a different company?) would want the 65832, so it was designed; but Mensch wasn't going to put it into production without a large order and some promise that there'd be a market for it.
I really like the 65Org32 philosophy. I was a little surprised at the number of people implying "use ARM" or "can't compete with ARM" in this thread. The sentiment was greatly different here if you look back at threads from 2009-2013. Of course that was before ARM took over to the extent it has today, powering everyone's pocket computer on the planet. Time marches on I guess!
cjs
Posts: 759
Joined: 01 Dec 2018
Location: Tokyo, Japan

Post by cjs »

BigDumbDinosaur wrote:
Consider that the 65C816 in native mode is less a 6502 than one might think. In particular, the data bus is a bit of a kludge, since it is used to emit bits 16-23 of the address during Ø2 low, followed by data during Ø2 high. Also, a 16-bit load or store involves two address bus cycles paired with two data bus cycles, which can create timing "gotchas" a system built around the 6502 would never encounter.
Yes. One of the great features of the 6502 (and most 8-bit CPUs, actually) is the simplicity of the hardware interface; it's quite easy to build up a small system (often single-board) with basic RAM, ROM and I/O, and interface a lot more I/O to it. The 65816 broke this by multiplexing the data bus and top 8 bits of the address bus, and still didn't really fix the "have to work in 64K chunks" problem.
Quote:
Furthermore, programming the '816 in native mode requires a different way of thinking due to the uniqueness of new instructions and their effects on the registers, differing behaviors with 16-bit registers vs. 8-bit registers, different hardware vectors, the concept of a movable zero page, etc. It's almost a Jekyll and Hyde dichotomy when you think about it.
Some of that, though, is really more about unnecessary (except perhaps for cost) problems introduced into the 6502 design in the first place, rather than "8-bit" issues in general. The world would have been a nicer (if possibly more expensive) place had the 6502 gone with 16-bit index registers, like the 8080 and the 6800 before it. A movable direct page is also quite a simple thing, and works well in both the 6809 and the 65816 (as far as the latter goes; I don't know why they didn't allow it to be anywhere in RAM), though probably just adding another index register or two could do the job just as well.

The biggest problem with 8-bit processors in general, the lack of address space, unfortunately is not amenable to any sort of easy solution; extending the address space seems inevitably to involve serious compromise both with pinouts and on the software side. So yeah, the 68000 seems to have been the easiest way to go there, though I think even there they added other bits of complexity that could have been dropped to keep life simpler, make interfacing easier, and not slow down certain operations (particularly interrupts).

But this is why I like these discussions, and don't mind them coming back regularly. Tossing around the various ideas helps give me a sense of the tradeoffs and also helps me realize what's great about the 6502.

A thread over on retrocomputingforum.com inspired me to think a bit more about this, too, especially what one might do today to try to bring back some of the essential simplicity of the 6502 while handling modern program and data set sizes. In a series of posts on that thread I propose basically moving to a much, much larger word size (36-64 bits) with (compared to microcomputers through the '90s) quite massive memory sizes, but trying to keep the rest more or less as simple as the 8-bit world. That moves the hardware a ways away from the ease of building an 8-bit SBC from typical '80s parts, but could preserve a lot of the simplicity of the software side.
Curt J. Sampson - github.com/0cjs
johnwbyrd
Posts: 89
Joined: 01 May 2017

Post by johnwbyrd »

I've been working on a 6502 backend for LLVM, and I'm familiar with gcc as well, so I might be able to speak from a compiler perspective.

In very important senses, both the ARM and the AVR instruction set architectures ARE 32-bit 6502s. ARM was heavily inspired by the 6502 architecture. The flags are extremely similar from 6502 to ARM.

Both gcc and LLVM assume broadly, but not absolutely, that they are targeting processors that have multiple registers within a register type. For example, ARM has 16 32-bit registers, all of which could be addressed equivalently in any instruction that accesses a register.

This does not mean that these compilers cannot target 6502 style registers; it just means that additional effort is required per instruction to target them.

Any compiler or OS maker will probably treat your 8-bit addressable pages as virtual register banks, in the ARM style, where one or more banks would be user mode and another bank would be privileged mode. The designers of the 6502 were well aware of the limited number of registers in their designs, and so they provided plenty of ways to do 8-bit addressing, both indirectly and directly.

Your stacks are too small for most embedded work. 4KB stacks were the absolute minimum by the early 1990s. If you have a sufficient amount of 8-bit addressable memory generally, there's no convincing reason to make your stacks addressable in a byte. Link-time optimization takes away stack pushes and pops, if you have sufficient register space. Stack space is one of those things for which each project will have its own needs, so I suggest that you don't burn stack locations into hardware.
Quote:
Byte addressing is very important to the embedded market and makes string handling easier.
I know of zero embedded applications that are gated by the speed of string handling. You are optimizing the wrong thing. Far more useful would be a built-in DMA engine to handle fast memory copies.
Quote:
Why not have a single, flat 32-bit stack? First of all, this is boring.
God protect all programmers from "interesting" hardware. We LOVE boring. We LOVE industry standard behavior.
Quote:
Having 4 “fast pages” makes implementing a C compiler much easier.
Keep in mind that all C compiler makers will treat whatever 8-bit addressable pages you give us as (virtual) registers. I suggest you think more carefully through the cases where we do. For example, it would be extremely useful to have an instruction that takes a four-byte value at an 8-bit location, treats it as a 32-bit pointer, and loads the contents of that 32-bit location into 8-bit addressable memory. Please think through all the cases of 8-bit, 16-bit, and 32-bit indirect loads and stores, with your 8-bit addressable page as a virtual register file. You might even alias A, X, Y, SR, SP and PC to the first and/or last few bytes of those 8-bit addressable pages.
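As a rough illustration of the 32-bit case, the Python sketch below models such an indirect load with the fast page acting as a virtual register file. Everything here is an assumption made for the example: the helper names (`read32`, `write32`, `load_indirect32`), little-endian byte order in the 6502 tradition, and a sparse dict standing in for memory. It is not a proposed instruction encoding.

```python
def read32(mem, addr):
    """Read a little-endian 32-bit value; unwritten bytes read as 0."""
    raw = bytes(mem.get((addr + i) & 0xFFFFFFFF, 0) for i in range(4))
    return int.from_bytes(raw, "little")


def write32(mem, addr, value):
    """Write a little-endian 32-bit value."""
    for i, b in enumerate((value & 0xFFFFFFFF).to_bytes(4, "little")):
        mem[(addr + i) & 0xFFFFFFFF] = b


def load_indirect32(mem, page_base, ptr_loc, dst_loc):
    """Fetch a pointer from the fast page, load through it, store the result."""
    pointer = read32(mem, page_base + (ptr_loc & 0xFF))   # virtual register holding the pointer
    value = read32(mem, pointer)                          # full 32-bit access
    write32(mem, page_base + (dst_loc & 0xFF), value)     # result back into the fast page
    return value


mem = {}
write32(mem, 0x10, 0x00123400)        # fast-page slot 0x10 holds a pointer
write32(mem, 0x00123400, 0xDEADBEEF)  # the data it points at
loaded = load_indirect32(mem, page_base=0, ptr_loc=0x10, dst_loc=0x20)
```

The 8-bit and 16-bit variants would differ only in how many bytes the second read and the final write move.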
Quote:
I know folks love their registers, and their C-compilers, but it is not for me.
There has never been a mass-produced CPU that did not have a C compiler. Perhaps what you are trying to create is a purely artistic expression, rather than a CPU that might be used on someone else's project.
Quote:
Having 4 small stacks and 4 “fast pages” also makes “small multitasking” easy to implement, allowing you to run a couple of tasks concurrently without having to swap in and out of memory.
Again, I think you are optimizing the wrong thing. As a hardware designer you have no control over how many tasks or threads an application designer will want; so it might be better to focus on user vs. protected modes, or MMU versus no MMU, and permit software to make deeper choices.
Quote:
Of course an MMU could be added later which could isolate stacks between kernel and user space programs. It could also remap the 4 stacks and "fast pages" anywhere in memory, but then they aren’t as fast anymore. I really dislike it when the 65XX starts to look too much like just another “large system” processor.
OK, but you simply cannot run any of those modern operating systems you've listed without an MMU. I've researched this issue. All of them assume that an MMU exists. If you don't have one, we'll have to emulate it.

If you're adding features onto a 32-bit 6502 die, the single most important feature you'll want is a multiplier that operates on your 8-bit addressable memory. The longer the better. Almost all modern applications assume that 32-bit multiplies are fast. The 6502 takes forever to do even an 8-bit multiply. For bonus points, put several multipliers in parallel and have them all able to hit 8-bit addressable memory at once. Most modern embedded designs for machine learning depend on this. See in particular the dgemm() operation in BLAS, and think about how fast you can get that to be on your design.
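For a sense of why the 8-bit case is slow: the 6502 has no multiply instruction, so a product is normally built in software by shifting and adding, one pass per multiplier bit. The Python sketch below models that loop. `mul8_shift_add` is a hypothetical name, and the step count is only the number of loop passes, not a 6502 cycle count.

```python
def mul8_shift_add(a, b):
    """Multiply two 8-bit values by shift-and-add; return (product, passes)."""
    a &= 0xFF
    b &= 0xFF
    product = 0
    passes = 0
    for _ in range(8):        # one pass per bit of the multiplier
        if b & 1:
            product += a      # add the shifted multiplicand when the bit is set
        a <<= 1
        b >>= 1
        passes += 1
    return product & 0xFFFF, passes


product, passes = mul8_shift_add(200, 123)
```

On a real 6502 each pass costs several instructions, which is the gap a hardware multiplier would close.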

When you are designing new hardware, you are designing something for a market that already has certain expectations about how hardware should be. It is almost always the case that new hardware designs should follow software requirements solely; quirky or "opinionated" hardware often gets binned in history. Programmers like flat memory models. Operating systems like MMUs. Compilers like register banks. DSP applications like vector multipliers.

Don't start from hardware features. Think about applications, and then think about the hardware features that those applications need.

I hope some of this is helpful and inspires your designs.