6502.org Forum

All times are UTC
 Post subject: "Improved" 6502
PostPosted: Mon Mar 04, 2013 2:52 pm 
Joined: Sat Mar 27, 2010 7:50 pm
Posts: 149
Location: Chexbres, VD, Switzerland
Just out of curiosity, has anyone ever tried to build (in HDL or whatever) a 6502 that is compatible with the original but more efficient, using modern processor techniques?
1) Remove the dummy fetches in instructions that don't need them (such as the ROL and ROR memory instructions, etc.)
2) Have a separate data and address cache, so that the last read/write cycle of an instruction can be done while the next one is being fetched (with some kind of pipeline)

I think it would be cool.
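
To put a rough number on point 1, here is a minimal sketch in Python. The cycle counts are the documented NMOS 6502 timings for read-modify-write instructions such as ROL and ROR; the `DEAD_CYCLES` table and both function names are assumptions made for illustration, describing an idealised core that drops every bus cycle doing no useful work.

```python
# Documented NMOS 6502 cycle counts for read-modify-write instructions
# (ASL, LSR, ROL, ROR, INC, DEC), keyed by addressing mode.
RMW_CYCLES = {"zp": 5, "zp,X": 6, "abs": 6, "abs,X": 7}

# Assumption: bus cycles per mode that do no useful work. One is the
# dummy access during the modify step; indexed modes spend one more
# while the index is added to the address.
DEAD_CYCLES = {"zp": 1, "zp,X": 2, "abs": 1, "abs,X": 2}


def trimmed_cycles(mode):
    """Cycles left if an idealised core drops every dead cycle."""
    return RMW_CYCLES[mode] - DEAD_CYCLES[mode]


def speedup(mode):
    """Ratio of original to trimmed cycle count for one instruction."""
    return RMW_CYCLES[mode] / trimmed_cycles(mode)


if __name__ == "__main__":
    for mode in RMW_CYCLES:
        print(mode, RMW_CYCLES[mode], "->", trimmed_cycles(mode))
```

This only covers one instruction class; the gain for a whole program depends on its instruction mix.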


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 3:27 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I'm not aware of an aggressive core. The caching would be the big win, not merely because of having better bandwidth over multiple busses to memory, but crucially because the FPGA cores are all able to run a lot faster than the typical off-chip RAM you'd use. So a faster core is going to be spinning its wheels until you can rig up a faster route to RAM.
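
The point about a fast core waiting on slow RAM can be sketched with a simple average-access-time model. This is an illustration only: it assumes a cache hit completes in one core clock, a miss costs one full cycle of the slower external RAM, and the 100 MHz and 10 MHz figures in the usage lines are made-up inputs. The function name `effective_mhz` is hypothetical.

```python
def effective_mhz(core_mhz, ram_mhz, hit_rate):
    """Average bus-cycle rate for a core in front of slower RAM.

    Assumes hits take one core clock and misses take one RAM clock.
    """
    core_ns = 1000.0 / core_mhz
    ram_ns = 1000.0 / ram_mhz
    avg_ns = hit_rate * core_ns + (1.0 - hit_rate) * ram_ns
    return 1000.0 / avg_ns


if __name__ == "__main__":
    # With no cache at all, the core runs at RAM speed.
    print(effective_mhz(100, 10, 0.0))
    # A 90% hit rate recovers roughly half of the core's raw speed.
    print(effective_mhz(100, 10, 0.9))
```

Since the 6502 touches memory on every cycle, the miss term dominates quickly, which is why the hit rate matters more than shaving cycles off individual instructions.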

There's a list of cores at http://6502.org/homebuilt#HDL and a discussion at viewtopic.php?t=1673

Cheers
Ed


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 4:12 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
A nice idea would be to keep the entire zero page in local memory, with dual byte access, so you can do both ZP and ZP+1 fetches for (ZP),Y modes at the same time. If you keep them in LUT RAM, it can even be done in the same clock cycle.
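
A behavioural sketch of the dual-access zero page, in Python rather than HDL. The class and function names are hypothetical; the only 6502 facts relied on are that the pointer for (ZP),Y is read from ZP and ZP+1, and that ZP+1 wraps from $FF back to $00.

```python
class ZeroPage:
    """256 bytes of local memory with a two-byte read port."""

    def __init__(self):
        self.ram = bytearray(256)

    def read_pointer(self, zp_addr):
        """Return the 16-bit pointer at zp_addr in one access.

        In hardware both bytes would come out of the two read ports at
        the same time; here they are simply read back to back.
        """
        lo = self.ram[zp_addr & 0xFF]
        hi = self.ram[(zp_addr + 1) & 0xFF]  # wraps $FF -> $00
        return lo | (hi << 8)


def indirect_y_address(zp, operand, y):
    """Effective address for an (operand),Y access."""
    return (zp.read_pointer(operand) + y) & 0xFFFF


if __name__ == "__main__":
    zp = ZeroPage()
    zp.ram[0x10] = 0x34
    zp.ram[0x11] = 0x12
    print(hex(indirect_y_address(zp, 0x10, 0x05)))
```

On a standard 6502 the two pointer bytes cost one bus cycle each, so fetching them together saves at least one cycle per (ZP),Y access.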

One thing to keep in mind, though, is that most of these improvements will add extra logic, and therefore impact max clock speed.


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 4:20 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Same for page 1, I think. Not quite such a boost, but I think it would help.

You're right, of course, about the potential for slowdown, but whether it actually happens depends on what's actually critical (and whether the present critical path could be assisted in any way).


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 5:00 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
Ideas are nice, but the big problem is of course finding someone with enough time and motivation to actually sit down and do the hard work :)

Personally, I have little motivation to work on anything like this. Running a plain 6502 at 100 MHz is good enough for nostalgic reasons. And for cases where compatibility with '80s designs is not an issue, much better results can be had by throwing away the whole design and starting from scratch with a 16- or 32-bit RISC.


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 5:02 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
It's all true!


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 6:03 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Welcome!

Quote:
1) Remove the dummy fetches in instructions that don't need it (such as ROL, ROR memory instructions, etc..)

The Commodore 65CE02 from 20+ years ago eliminated almost all the dead bus cycles, even having over 30 op codes that took only one clock instead of the normal minimum of two; so without re-writing code to take advantage of its new instructions, it still gave a speed-up of about 25%. It was only 10MHz though, whereas the current production ones are conservatively spec'ed for at least 14MHz and usually top out at 25MHz if the supporting parts can keep up.
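
As a quick worked comparison of those figures, the sketch below reads "a speed-up of about 25%" as 1.25 times the throughput at the same clock; that reading, and the function name, are assumptions.

```python
def equivalent_mhz(clock_mhz, speedup_fraction):
    """Clock a standard part would need to match the faster core."""
    return clock_mhz * (1.0 + speedup_fraction)


if __name__ == "__main__":
    # A 10 MHz 65CE02 with ~25% fewer wasted cycles, expressed as the
    # clock rate of an ordinary 65C02 doing the same work.
    print(equivalent_mhz(10, 0.25))
```

On that reading, the 10MHz part lands below the 14MHz that current production parts are spec'ed for.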

The next step of course is to re-write the code to take advantage of the new instructions, or go with the 65816. I have a post on the huge difference you can get if you're constantly using 16-bit numbers (as in a higher-level language) at viewtopic.php?f=9&t=1505&p=9705#p9705. There's an example shown there where the '816 does in two instructions what the 6502 takes ten to do.

There was the 65GZ032 project with its own Yahoo forum which was for a modern 32-bit processor that could still run old 6502 code. It had a ton of registers, deep pipelining, branch prediction, onboard cache, etc., and ended up with something that has little resemblance to the 6502, but, after a lot of progress and even some working hardware, still fizzled out before it was done. I kind of lost interest when they went in directions that abandoned the 6502 flavor (outside of the 6502 emulation mode).

We were discussing an all-32-bit 65-family processor (the 65Org32), but as Arlet pointed out the problem is a shortage of time and motivation to do the hard work. ElEctric_EyE here is working toward the 65Org32 in steps, first to do a 16-bit NMOS 6502 equivalent. He's working on a video-chip project at the moment though.

A standard part of the program-structure words in Forth is DO...LOOP, with 16-bit (if you implement it on 6502) loop counter, index, and limit which are normally kept on the hardware stack in page 1. I did an equivalent 32-bit set of words (DO, ?DO, LOOP, +LOOP, I, BOUNDS, LEAVE, ?LEAVE, UNLOOP) for 6502, and the number of instructions it took was incredible. DO, which sets up the loop, took about 30 instructions (not cycles, but instructions), and LOOP which does the incrementing of the loop counter and compares it to the limit to see if it's time to exit the loop, took about 44 instructions (again, not cycles). With a 65Org32, it would be trivial, like doing a loop on 6502 with an 8-bit counter-- not even a half-dozen instructions total (plus whatever you actually do in the loop).
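
To illustrate why a 32-bit loop counter is so expensive on an 8-bit machine, here is a Python sketch that handles the counter one byte at a time, the way a 6502 has to. It is not the Forth code described above; the function names and the little-endian layout are assumptions for the example.

```python
def inc32_bytewise(counter):
    """Increment a little-endian 4-byte counter in place.

    Each pass through the loop stands in for one increment-and-test
    step on a single byte.
    """
    for i in range(4):
        counter[i] = (counter[i] + 1) & 0xFF
        if counter[i] != 0:  # no carry out of this byte, so stop
            break
    return counter


def equal32_bytewise(a, b):
    """Compare two 4-byte values one byte at a time."""
    for i in range(4):
        if a[i] != b[i]:
            return False
    return True


if __name__ == "__main__":
    index = bytearray([0xFF, 0xFF, 0x00, 0x00])
    limit = bytearray([0x00, 0x00, 0x01, 0x00])
    inc32_bytewise(index)
    print(equal32_bytewise(index, limit))
```

A processor with 32-bit registers would do the increment and the comparison in one instruction each.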

Additionally of course there would be multiply and divide instructions that would replace the long routines the 6502 requires, shown at viewtopic.php?f=9&t=689 and http://6502.org/source/integers/ummodfix/ummodfix.htm.
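
For readers who have not seen what such a routine does, below is a Python sketch of the classic shift-and-add method for an unsigned 8x8 multiply with a 16-bit result. It is a generic illustration, not a transcription of the routines at the links above, and the function name is made up.

```python
def mul8x8(a, b):
    """Unsigned 8x8 -> 16-bit multiply by shift-and-add."""
    a &= 0xFF
    b &= 0xFF
    product = 0
    for _ in range(8):       # one pass per multiplier bit
        if b & 1:            # low bit set: add the shifted multiplicand
            product += a
        a <<= 1              # next bit is worth twice as much
        b >>= 1
    return product & 0xFFFF


if __name__ == "__main__":
    print(mul8x8(12, 13))
```

The loop always runs eight times, and on a 6502 each pass is several instructions, which is the cost a hardware multiply instruction would remove.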

There are other things that can be done to get better performance with even old technology though, like the 16-bit look-up tables for accurately getting math functions, hundreds of times as fast as actually having to calculate them. These tables take a lot of memory, but the cost and size of memory has come way, way down to where it's somewhat practical now.
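
A small sketch of the look-up-table idea, using sine as the function. The table layout is an assumption made for the example: 65,536 entries, a 16-bit angle index covering one full circle, and results scaled to signed 16-bit. The names `SINE_TABLE` and `sin16` are hypothetical.

```python
import math

# Assumed layout: one entry per 16-bit angle, full circle = 65536 steps,
# values scaled so that 1.0 maps to 32767.
TABLE_SIZE = 1 << 16
SINE_TABLE = [
    round(32767 * math.sin(2 * math.pi * i / TABLE_SIZE))
    for i in range(TABLE_SIZE)
]


def sin16(angle):
    """Return the scaled sine of a 16-bit angle with one table read."""
    return SINE_TABLE[angle & 0xFFFF]


if __name__ == "__main__":
    print(sin16(0x4000))  # a quarter circle
```

At two bytes per entry such a table occupies 128 KiB, which is the memory cost being traded for speed: the look-up is a single indexed read instead of a series calculation.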

Although Arlet is not wrong about throwing the whole thing out and going with a newer processor, my point is that huge, dramatic improvements in performance could still be gained with a true 65-family processor, and some of those can be had even with existing, off-the-shelf current production 65c02's and 65816's.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 6:12 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Oh yes, welcome!

Aside from the cache and the clock speed, the other big win, as Garth points out, could come from a multiply instruction. As with most 6502-related activities, it would be a labour of love: someone might yet do it, even if it isn't the shortest path to some practical goal.

Cheers
Ed


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 6:39 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
I think it would be good to make a distinction between staying 100% compatible with the original 65(C)02 while reducing the clock cycles, and adding additional instructions. I think the first question is more interesting, because it would directly speed up all existing software, without having to rewrite any of it. Also, it has a much more limited (and down to earth) scope. Coming up with additional instructions is trivial, especially if you don't have to do the work. Coming up with interesting (yet practical) ways to speed up existing code is more challenging.

Quote:
Although Arlet is not wrong about throwing the whole thing out and going with a newer processor, my point is that huge, dramatic improvements in performance could still be gained with a true 65-family processor, and some of those can be had even with existing, off-the-shelf current production 65c02's and 65816's.

My comment should be seen in the context of FPGA implementation. I agree that the 65816 offers a nice improvement, but it's not particularly FPGA-friendly in its design. To properly implement a 65816 on an FPGA would take more time than to design something better from scratch. Also, using the 65816 only really pays off if you're willing to rewrite the code.


 Post subject: Re: "Improved" 6502
PostPosted: Mon Mar 04, 2013 7:35 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Quote:
Also, using the 65816 only really pays off if you're willing to rewrite the code.

My (possibly incomplete) perception of the reason people want 100% compatibility is primarily so they can run vintage games on vintage computers (and they often even want the illegal op codes); but those games often resort to software timing loops anyway, meaning a big speed-up might not give a desirable net effect. Instead of giving smoother lines and movement, it would be just as blocky and jumpy, just faster.
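
The timing-loop problem comes down to simple arithmetic, sketched below. The 5 cycles per iteration used in the usage lines corresponds to a typical DEX / BNE countdown (2 cycles plus 3 for a taken branch with no page crossing); the iteration count, the clock rates, and the function name are made-up inputs for illustration.

```python
def delay_seconds(iterations, cycles_per_iteration, clock_hz):
    """Wall-clock time spent in a software delay loop."""
    return iterations * cycles_per_iteration / clock_hz


if __name__ == "__main__":
    # Same loop, same code, two clock rates.
    print(delay_seconds(200, 5, 1_000_000))  # original machine
    print(delay_seconds(200, 5, 4_000_000))  # core running 4x faster
```

The delay shrinks in direct proportion to the speed-up, so anything paced by such loops simply runs faster rather than smoother.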

There are plenty of new applications being written though, and the benefit of sticking with the same processor family, even if you have new instructions or changed op codes, is the familiarity that makes the programmer more productive. I've been designing PIC16 microcontrollers into commercial products for 15 years, and yet last week I discovered another caveat that is not spelled out in the data books. It's the same reason I don't take it lightly when someone in the company wants to change op amps, switching-regulator ICs, etc. in one of our circuits. The ones I've worked with for years may not be the best, but through experience we have discovered secrets that are nebulous or non-existent in the data sheets to getting the performance we need without going through another long learning curve.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


 Post subject: Re: "Improved" 6502
PostPosted: Tue Mar 05, 2013 1:40 am 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
GARTHWILSON wrote:
My (possibly incomplete) perception of the reason people want 100% compatibility is primarily so they can run vintage games on vintage computers (and they often even want the illegal op codes); but those games often resort to software timing loops anyway, meaning a big speed-up might not give a desirable net effect. Instead of giving smoother lines and movement, it would be just as blocky and jumpy, just faster.

There are plenty of new applications being written though, and the benefit of sticking with the same processor family, even if you have new instructions or changed op codes, is the familiarity that makes the programmer more productive...

I agree 110%, Garth! Developing a core requires more than one person's effort. We've seen that in the first iteration of the 65016 core; Bitwise and TeamTempest have skills in the assembler arena.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: "Improved" 6502
PostPosted: Wed Mar 06, 2013 8:38 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
(I still stand by multiply as a small and easy addition which gives a big performance win where applicable, but I completely agree that spooling out ideas for improvements is enormously easier than implementing them - especially as they also need to be implemented in a toolchain, and of course should also be documented!)


 Post subject: Re: "Improved" 6502
PostPosted: Wed Mar 06, 2013 11:32 am 

Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
I agree with that too Ed. I am almost at the point where I am going to make the .d core which is the .b core plus multiply opcodes. I started to explain it here.

I'm currently writing a plot routine to plot characters with the .b core in my parallel video board(s) project. It makes sense that after this is done, I'll have something to compare speeds and a reasonable base with which to experiment and prove that the multiplication opcodes work as expected.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


 Post subject: Re: "Improved" 6502
PostPosted: Wed Mar 06, 2013 12:25 pm 

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
This overview mentions a "65CX8 with eleven additional instructions such as MPY (multiply)", but the link returns an error.


 Post subject: Re: "Improved" 6502
PostPosted: Wed Mar 06, 2013 1:11 pm 

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
Arlet wrote:
this overview mentions a "65CX8 with eleven additional instructions such as MPY (multiply)", but the link returns an error.

Try using the Wayback Machine on the URL at http://archive.org/.

I can't from work; it's blocked from here.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs

