
All times are UTC




Posted: Tue Sep 22, 2020 8:38 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
BigEd wrote:
Any idea if careful HDL (re)coding could improve fmax?

That has been the main focus of recent optimizations (cycle counts lowered only a bit, here and there).

In particular, at least for Stratix V (compiles for other FPGAs may very well react differently), separating write enables and data for the instructions that write A/X/Y/P reduced multiplexer complexity quite a bit, and gained roughly 30 MHz of fmax. The current critical path reported by the timing analyzer looks like the end of the line (i.e. it cannot be optimized further).


Posted: Tue Sep 22, 2020 8:40 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
There's always a brick wall eventually! What does the critical path look like - is it more or less expected?


Posted: Tue Sep 22, 2020 9:03 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
BigEd wrote:
There's always a brick wall eventually! What does the critical path look like - is it more or less expected?

Kind of, yes. It always requires quite a bit of lateral thinking to interpret these paths, but it looks like it's the +X/+Y bypass (which may also conspire with the additional +1 needed to address the top byte of a 2-byte ABS,X or ZPG,X read). And this is all on top of an incoming instruction read. Removing the bypass is almost certainly the only possible relief there, but all X and Y changing instructions would incur an extra cycle, and it is very doubtful that the combination would pay off. Another way would be to drop from 2 to 1 byte for ABS,X and ZPG,X (of which the second byte isn't used very often), but I don't think that that would pay off either.
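
For readers unfamiliar with the term: a bypass lets an instruction use an index value that the previous instruction is still writing, instead of waiting a cycle for the register to settle. The toy Python model below is only an illustration of that idea, not the core's actual logic (which is Verilog and not shown in this thread). The function name and the `pending_index` parameter are hypothetical.

```python
def indexed_address(base, index_reg, pending_index=None, offset=0):
    """Toy model of an ABS,X / ABS,Y effective address with an index bypass.

    base          : 16-bit absolute operand taken from the instruction
    index_reg     : value currently stored in X or Y
    pending_index : value the previous instruction is still writing to that
                    register, or None if no write is in flight (hypothetical)
    offset        : 0 for the first byte, 1 for the top byte of a 2-byte read
    """
    # The bypass: prefer the in-flight value over the stale register contents.
    index = index_reg if pending_index is None else pending_index
    # Wrap to 16 bits, as a 6502 address does.
    return (base + index + offset) & 0xFFFF


# INX (X: 4 -> 5) immediately followed by LDA $1000,X.
low = indexed_address(0x1000, index_reg=4, pending_index=5)
high = indexed_address(0x1000, index_reg=4, pending_index=5, offset=1)
print(hex(low), hex(high))
```

In hardware, the selection, the addition and the extra +1 all sit behind the instruction read in the same cycle, which is consistent with this being the path the timing analyzer reports.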


Posted: Tue Sep 29, 2020 11:07 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
200 MHz now. But that must truly be the (far) end of the line. 420 MHz benchmark. Take that, emulator guys (just kidding, 370 is damn respectable, and the underlying hardware is far more practical).


Posted: Wed Sep 30, 2020 10:20 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
So, the IPC (instructions per clock) is about double the 6502's? That's a good stake in the ground for what can be done.

200MHz is a nice round number too.
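
The "about double" estimate can be checked with a line of arithmetic. The sketch below assumes that the "420 MHz benchmark" figure means the core benchmarks like a stock 6502 clocked at 420 MHz; the variable and function names are made up for the illustration.

```python
# Figures quoted in the thread.
core_clock_mhz = 200.0        # clock of the FPGA core
benchmark_equiv_mhz = 420.0   # benchmark result as an equivalent 6502 clock (assumed meaning)


def throughput_per_clock(equiv_mhz, clock_mhz):
    """Work done per clock relative to a stock 6502, which scores 1.0 by definition."""
    return equiv_mhz / clock_mhz


ratio = throughput_per_clock(benchmark_equiv_mhz, core_clock_mhz)
print(f"per-clock throughput vs. a stock 6502: {ratio:.2f}x")
```

Strictly speaking this is a ratio of benchmark throughput per clock, so it only equals the IPC ratio if the instruction mix of the benchmark is the same on both machines.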


Posted: Wed Sep 30, 2020 11:41 am

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
BigEd wrote:
So, the IPC (instructions per clock) is about double the 6502's? That's a good stake in the ground for what can be done.

It probably is. At least with this specific / impractical (for new designs) / FPGA bound architecture. But a good pipelined architecture, whose only problems would be inter-instruction dependencies and pipeline flushes, could probably get close to 1 cycle per instruction. Or beyond, if you go really crazy with the memory bandwidth and execution units. A fun puzzle for sure.


Posted: Wed Sep 30, 2020 12:27 pm

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 690
Location: North Tejas
Windfall wrote:
200 MHz now. But that must truly be the (far) end of the line. 420 MHz benchmark. Take that, emulator guys (just kidding, 370 is damn respectable, and the underlying hardware is far more practical).


As a reference point, my 6502 debugging simulator cranks out about 230 MHz while running on a netbook containing a 1 GHz AMD C60 processor. Today's fastest machines should do about four times better, so around 1 GHz.


Posted: Wed Sep 30, 2020 12:33 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
(I suspect John is referring to PiTubeDirect, which runs on the super-cheap Raspberry Pi, also an approx 1GHz machine, but with the 6502 emulation written in super-tight ARM code, mostly by dp11 here. On a Pi Zero the latest code runs at approx 290MHz; on a less-cheap 1.5GHz Pi 4 it runs at approx 370MHz. It's not a clock-for-clock accurate kind of emulator; it's an as-fast-as-you-can kind.)
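
To put the emulator figures quoted so far side by side, here is a rough sketch that divides each host clock by the emulated speed. It assumes the emulated "MHz" numbers are 6502-equivalent speeds and takes the Pi Zero as a 1 GHz machine, per the "approx 1GHz" remark; the labels and the helper name are made up for the illustration.

```python
# (label, host clock in MHz, emulated 6502-equivalent speed in MHz), as quoted in the thread.
results = [
    ("debugging simulator on AMD C60 netbook", 1000.0, 230.0),
    ("PiTubeDirect on Pi Zero (clock assumed 1 GHz)", 1000.0, 290.0),
    ("PiTubeDirect on Pi 4", 1500.0, 370.0),
]


def host_clocks_per_emulated_cycle(host_mhz, emulated_mhz):
    """How many host clock cycles are spent per emulated 6502 cycle."""
    return host_mhz / emulated_mhz


for label, host_mhz, emulated_mhz in results:
    cost = host_clocks_per_emulated_cycle(host_mhz, emulated_mhz)
    print(f"{label}: {cost:.2f} host clocks per emulated cycle")
```

This is a crude measure, since host clock rate says little about memory latency or how many instructions the host retires per cycle, but it shows that all three land at roughly 3.5 to 4.5 host clocks per emulated cycle.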


Posted: Wed Sep 30, 2020 12:40 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
BigEd wrote:
(I suspect John is referring to PiTubeDirect, which runs on the super-cheap Raspberry Pi, also an approx 1GHz machine, but with the 6502 emulation written in super-tight ARM code, mostly by dp11 here.)

Sort of, but the 370 is for a 1.5 GHz RPi 4 (as I understand it, I have a 'PiTubeDirect' but I haven't tried it yet on my RPi 4). It is really quite amazing how well ARM can emulate 6502. Not exactly a coincidence, of course (considering that the ARM instruction set was inspired by the 6502).


Posted: Thu Oct 01, 2020 12:19 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
It is also quite interesting to see, right here, that emulation on dedicated hardware can compete on speed with implementations on flexible hardware (FPGAs). The price you pay for having all the flexibility, in terms of coins and performance, is pretty high. It will be interesting to see if this changes, especially since clock speed on contemporary processors has been flattening out.


Posted: Thu Oct 15, 2020 7:55 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
Another interesting tidbit is that a reduced RAM version of the core, using 2.0 instead of 3.5 Mb (both resulting in 64 KB of usable memory), is slower by only 5% (benchmarked). This involves moving absolute operand reads (+0, +X, +Y) from the first to the second execution cycle, and then multiplexing all absolute and indirect addresses into one (so only one true dual ported RAM block is used for all six addressing modes). Of course, the speed reduction depends ultimately on the frequency with which absolute addressing is used in code, but on average, an extra cycle for all instructions using absolute addressing seems relatively inconsequential for speed.
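
The tradeoff can be put into numbers with a small sketch. It assumes "Mb" means 2**20 bits, and it models the slowdown in the simplest possible way: every instruction that uses absolute addressing costs exactly one extra cycle. The `abs_fraction` and `base_cpi` inputs in the example call are hypothetical values, not measurements from the core.

```python
BITS_PER_MB = 1 << 20  # assumption: "Mb" means 2**20 bits

usable_bits = 64 * 1024 * 8        # 64 KB of usable memory
full_bits = 3.5 * BITS_PER_MB      # original core
reduced_bits = 2.0 * BITS_PER_MB   # reduced RAM version

print(f"original core: {full_bits / usable_bits:.1f}x the usable memory")
print(f"reduced core:  {reduced_bits / usable_bits:.1f}x the usable memory")
print(f"block RAM saved: {1 - reduced_bits / full_bits:.1%}")


def relative_slowdown(abs_fraction, base_cpi):
    """Extra run time if each absolute-addressing instruction takes one more cycle.

    abs_fraction : fraction of executed instructions using absolute addressing
    base_cpi     : average cycles per instruction before the change
    """
    return abs_fraction / base_cpi


# Hypothetical inputs, chosen only to show the shape of the tradeoff.
example = relative_slowdown(abs_fraction=0.10, base_cpi=2.0)
print(f"example slowdown: {example:.1%}")
```

Under the stated assumption the two versions use 7 and 4 times the usable 64 KB, a saving of about 43% in block RAM, and the model shows why the penalty stays small as long as absolute addressing is a modest share of the executed instructions.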


Posted: Thu Oct 15, 2020 7:59 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Is that almost saying that you've managed to halve the size and lose only 5% performance? Not a bad tradeoff, I'd say!


Posted: Thu Oct 29, 2020 8:14 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
This core is now part of my 'soft' Acorn 6502 Second Processor for hardware development boards. If you have an Acorn BBC and one of the supported development boards, you may want to have a look here:

http://www.zeridajh.org/hardware/soft6502secondprocessor/index.htm


Posted: Fri Jan 14, 2022 4:11 pm

Joined: Sun Nov 27, 2011 12:03 pm
Posts: 229
Location: Amsterdam, Netherlands
I've now put the source for this 65C02 core on my website. See https://www.zeridajh.org/articles/me_various_sources/ under 'Verilog HDL'.


Posted: Fri Jan 14, 2022 4:50 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Thanks for sharing your sources!

