
All times are UTC




PostPosted: Tue Oct 19, 2021 10:18 am 
Joined: Wed Jul 01, 2020 6:15 pm
Posts: 79
Location: Germany
After a bit of a hiatus, I am back into 6502 emulation... I am thinking about a new revision of my 65F02 accelerator. Rough design goals would be:

  • Same concept as the original 65F02: DIP-40 board to plug into the CPU socket; knows the host's memory map (RAM/ROM vs. I/O), loads all host RAM and ROM into fast on-chip RAM; executes programs fast from internal RAM but accesses the original host peripherals at host clock speed.
  • More on-chip RAM to properly support bank-switched host systems with more than 64k total RAM+ROM.
  • Bi-directional level converters on all DIP pins, to support 6510, Atari's Sally, WDC 6502, 65816.

The Spartan-7 FPGA would fit the bill nicely, since it is available in a 225-ball BGA package which I could fit onto the PCB, in versions with up to 180 kByte RAM.

But I was wondering whether maybe a microcontroller is the better platform here. Should certainly be more cost-effective than a 50€ FPGA, and require fewer than three different supply voltages -- and it might well be able to get beyond the 100 MHz emulation speed I can currently squeeze out of the Spartan-6? It remains to be seen how tightly I could sync it to the host's clock for external bus cycles, but I would hope to keep up with the relevant host clocks up to 5 MHz or so.

I have not followed the software emulation projects lately, and would appreciate an update and recommendations. What would be the microcontroller platform of choice, and what emulation speed could I expect from it? Some constraints:

  • Small package; 13*13 mm² incl. pins/pads is the maximum I can fit onto the PCB.
  • 45 free I/O pins, plus whatever it takes for infrastructure (power, programming, ...)
  • 3.3V operation is fine, I don't think there are any competitive 5V processors out there and am prepared to provide level translators.
  • 128 kByte on-chip RAM (or more) would be great.
  • As fast as possible within the above constraints! 8)

I am looking for processor recommendations as well as efficient/fast emulation cores. Many thanks for your ideas!


PostPosted: Tue Oct 19, 2021 11:04 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Quick link back to your previous thread:

A couple of considerations spring to mind:
- some of your potential audience will like an FPGA as being real hardware but not like a microcontroller solution. (This is a matter of personal judgement, and no-one has ever changed their mind.)
- whether or not you (also) want a cycle accurate mode might also affect the answer - the fastest software solutions might not readily deliver such a mode.

Also, I would guess handling of more complex memory maps is likely to have more of a performance impact on a software approach compared to hardware.

Having said that, perhaps the fastest and certainly for me the most exciting recent developments are the JIT emulators for ARM. (See Dominic aka dp11 here and on stardot)

One of the most interesting recent microcontroller platforms is the RP2040, the microcontroller also used on the Raspberry Pi Pico but now available as a component.

These two threads might be of interest:

It might also be instructive to have a look through the latest PiTubeDirect branch (hognose-dev) which is nearing release. PiTubeDirect runs on the Raspberry Pi, all models of which use a bigger and more complex ARM than you'd find in a microcontroller, but it still might be informative. The project contains three generations of 6502 emulator: lib6502 in C (from here), a fast 6502 in hand-crafted ARM, and a JIT 6502 also hand-crafted.


PostPosted: Tue Oct 19, 2021 11:29 am 
Joined: Wed Jul 01, 2020 6:15 pm
Posts: 79
Location: Germany
Perfect -- thank you so much, Ed! That will give me plenty to read up and chew on...

I can certainly relate to the "I want real hardware, FPGA is better/cooler" viewpoint, since I am coming from that side too. Let's see whether or not I end up disproving your "no-one has ever changed their mind" observation. :wink:

Agree that complex memory maps will likely incur a speed penalty in a µC-based design. It's easy to get spoiled by the FPGA approach where I can just add more logic cells to do address comparisons in parallel -- until routing delays sneak up on me... Hopefully this can be mitigated somewhat in practice since the memory maps are not totally arbitrary: A coarse (and fast) pre-check for addresses that might potentially need special treatment should be possible in many cases, such that code execution in the bulk of the RAM/ROM space remains fast.

And you raise a good point regarding the trade-off between cycle-exact vs. fastest possible emulation. Since cycle-exact emulation would only be of interest in non-accelerated mode (or during non-accelerated code sequences for time-critical peripheral control), maybe I could combine two separate simulators which work on the same data and registers? A fast JIT compiler for accelerated operation, and a cycle-exact interpreter which kicks in during the external bus cycles?
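
A minimal Python sketch of how the two ideas above might fit together, under stated assumptions: a 256-entry page table acts as the coarse pre-check, and a fast path and a cycle-counting path share one set of registers and memory. The I/O page at $C0xx, the three-opcode subset and all names (`CpuState`, `step`, `IO_PAGES`) are hypothetical illustrations, not code from the 65F02 or from any project mentioned in this thread.

```python
from dataclasses import dataclass, field

# Coarse pre-check: one flag per 256-byte page. Only flagged pages need
# the slow, host-synchronised treatment. Page 0xC0 is an assumed example.
IO_PAGES = bytearray(256)
IO_PAGES[0xC0] = 1


@dataclass
class CpuState:
    a: int = 0
    pc: int = 0
    host_cycles: int = 0  # advanced only by the cycle-exact path
    mem: bytearray = field(default_factory=lambda: bytearray(0x10000))


def step(cpu: CpuState) -> str:
    """Execute one instruction and report which path handled it."""
    op = cpu.mem[cpu.pc]
    if op == 0xA9:  # LDA #imm: no data access outside the code stream
        cpu.a = cpu.mem[cpu.pc + 1]
        cpu.pc += 2
        return "fast"
    if op == 0x8D:  # STA abs
        addr = cpu.mem[cpu.pc + 1] | (cpu.mem[cpu.pc + 2] << 8)
        cpu.pc += 3
        if IO_PAGES[addr >> 8]:
            # Cycle-exact path: a real design would wait for host clock
            # edges here; this sketch only counts the 4 cycles of STA abs.
            cpu.host_cycles += 4
            cpu.mem[addr] = cpu.a  # stand-in for the external bus write
            return "cycle-exact"
        cpu.mem[addr] = cpu.a
        return "fast"
    if op == 0xEA:  # NOP
        cpu.pc += 1
        return "fast"
    raise NotImplementedError(f"opcode {op:#04x} is not part of this sketch")


cpu = CpuState()
# LDA #$42 ; STA $0200 ; STA $C030 ; NOP
cpu.mem[0:9] = bytes([0xA9, 0x42, 0x8D, 0x00, 0x02, 0x8D, 0x30, 0xC0, 0xEA])
paths = [step(cpu) for _ in range(4)]
print(paths, cpu.host_cycles)
```

Only the store to the flagged page takes the slow path; everything else costs a single table lookup on top of the fast path. Instruction fetches from I/O pages are ignored here for brevity.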

Ah, much to read and think about... Many thanks again for the starting points!
Juergen


PostPosted: Tue Oct 19, 2021 1:15 pm 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
Ed has given some great suggestions for higher level systems. If you want something that is a more traditional microcontroller, you might want to check out the new ARM Cortex-M7 chips. Many companies make them but here are the ones from ST:

STM32H7 Series

They go up to 550MHz and some have an additional 240MHz M4 core on the chip.

Another option is the PIC32 which is a MIPS core, though the fastest parts only go up to about 250MHz. One caveat to these is that you only get up to O2 level optimizations if you're compiling C unless you pay for the compiler:

PIC32MZ EF
PIC32MZ DA

If you want to leverage any of the existing projects mentioned above, keep in mind that the Raspberry Pi Pico and M7 mentioned above are Cortex-M, which only supports the Thumb/Thumb-2 instruction encodings (mostly 16 bits wide, with some 32-bit ones), whereas most (or all?) of the Raspberry Pis other than Pico are Cortex-A, which supports Thumb-2 as well as the full-width 32-bit ARM instructions. (Both instruction sets operate on the full 32 or 64 bit registers even if the instructions are smaller.) This may matter when it comes to reusing hand-written assembly and JITs. PIC32 being MIPS won't be compatible at all with any existing ARM assembly or JITs.


PostPosted: Tue Oct 19, 2021 1:23 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
(crossed in post...)

Whatever you do it will be an interesting adventure and I hope you keep us up to date on developments!

A couple of other resources:
In that linked thread, it seems that getting emulation to run at 10% of the microcontroller clock speed is pretty good, and running at 20% of the microcontroller clock speed is very good. I'm not sure how the JIT approach fares but it will be better yet!

So, you might well need a better than 500MHz core to beat your 100MHz benchmark.
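
The arithmetic behind that estimate, as a small Python sketch. It assumes the emulated 6502 speed is simply a fixed fraction of the microcontroller clock; the function name `required_mcu_clock_mhz` is made up for illustration.

```python
def required_mcu_clock_mhz(target_6502_mhz: float, efficiency: float) -> float:
    """MCU clock needed if emulated speed = efficiency * MCU clock."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    return target_6502_mhz / efficiency


# The 100 MHz benchmark at "pretty good" (10%) and "very good" (20%).
for eff in (0.10, 0.20):
    clock = required_mcu_clock_mhz(100, eff)
    print(f"{eff:.0%} efficiency: about {clock:.0f} MHz needed")
```

At 20% the break-even point is the 500MHz quoted above; at 10% it is twice that.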

Edit: it seems the 133MHz RP2040 has been overclocked to 400MHz...


PostPosted: Tue Oct 19, 2021 2:05 pm 
Joined: Fri Apr 06, 2018 4:20 pm
Posts: 94
65f02 wrote:
After a bit of a hiatus, I am back into 6502 emulation... I am thinking about a new revision of my 65F02 accelerator. Rough design goals would be:

  • Same concept as the original 65F02: DIP-40 board to plug into the CPU socket; knows the host's memory map (RAM/ROM vs. I/O), loads all host RAM and ROM into fast on-chip RAM; executes programs fast from internal RAM but accesses the original host peripherals at host clock speed.
  • More on-chip RAM to properly support bank-switched host systems with more than 64k total RAM+ROM.
  • Bi-directional level converters on all DIP pins, to support 6510, Atari's Sally, WDC 6502, 65816.

I am looking for processor recommendations as well as efficient/fast emulation cores. Many thanks for your ideas!


I would recommend looking at this project for ideas - this is a great use of a fast MCU to emulate a 6502:

https://github.com/MicroCoreLabs/Projects


PostPosted: Tue Oct 19, 2021 9:38 pm 
Joined: Wed Jul 01, 2020 6:15 pm
Posts: 79
Location: Germany
Thank you all for the additional ideas and context! I have not gotten around to digging into all of them, but have a few initial comments and follow-up questions:

I am aware of the MCL65+, and actually have been in touch with Ted recently while he was working on language card support for the Apple II. A very nice implementation on the Teensy 4, although I have to object to Ted's claim that it is the fastest Apple II. :wink: The datapoint I have there is 10x acceleration over the Apple's 1 MHz, which I assume to mean more than 10x for code execution, and no acceleration for I/O. (Using a 600 MHz Cortex M7 in the Teensy, although probably not heavily optimized code yet.)

Another datapoint I have is 130 MHz emulation speed on a PocketBeagle (1 GHz Cortex A8). This is from some chess computer enthusiasts, but I don't have any details on the emulation software they use; will try to learn more. The gap between microcontrollers and "proper" processors is apparently still larger than I had realized, probably not least due to the difference in ARM cores which Druzyek had pointed out.

Nevertheless the RP2040 microcontroller looks very tempting. At €1.10 in single quantities you certainly can't fault its bang-per-buck ratio! And the programmable state machines associated with the PIOs look promising: Could these be used to accelerate the response of the external bus interface to clock edges by "hard-coding" the required actions? It is a major pity that the RP2040 does not have quite enough I/Os for me, if I want a flexible platform which can drive all 40 pins. (And hence also can't easily group them and add external latches e.g. for the address bus.)

I could use some handholding regarding beebjit; found it difficult to figure out the current status. Does it run on ARM yet? (Which cores?) Have any sample implementations and performance data been published?

Thank you again for all your hints!
Juergen


PostPosted: Tue Oct 19, 2021 9:57 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I don't think beebjit runs on ARM yet but it looks like it's in active development (on a branch)

Dominic's JITting emulator for PiTubeDirect is in the upcoming release. I believe I may have seen him mention JITting for the Pico/RP2040 but I don't think there's anything public.

The PIOs are indeed very handy and yes I think they are powerful enough to service writes and some kinds of reads in some circumstances. A pity the RP2040 doesn't have enough I/O for your purposes. Maybe if the two halves of the address bus are multiplexed and sent to an external latch for demultiplexing? They are unidirectional! (And the latch doubles as a level shifter, maybe.)
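
A rough Python model of that multiplexing idea, just to count pins. It assumes one 8-bit latch per address half (a 74x573-style part is my assumption, not something specified here) fed from a single shared 8-bit port; `Latch8` and `drive_address` are hypothetical names.

```python
class Latch8:
    """Stand-in for an 8-bit latch whose outputs drive the host bus."""

    def __init__(self) -> None:
        self.q = 0  # value currently held on the latch outputs

    def strobe(self, port_value: int) -> None:
        """Capture whatever the shared port carries right now."""
        self.q = port_value & 0xFF


def drive_address(addr: int, high: Latch8, low: Latch8) -> int:
    """Send a 16-bit address over one shared 8-bit port in two steps."""
    high.strobe(addr >> 8)    # step 1: port carries A15..A8
    low.strobe(addr & 0xFF)   # step 2: port carries A7..A0
    return (high.q << 8) | low.q  # what the host address bus now sees


GPIOS_DIRECT = 16      # one GPIO per address line
GPIOS_MUXED = 8 + 2    # shared port plus one strobe per latch

hi, lo = Latch8(), Latch8()
print(hex(drive_address(0xC030, hi, lo)), GPIOS_DIRECT - GPIOS_MUXED)
```

Under these assumptions the saving is six GPIOs, paid for with an extra output step per address change.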

Edit: wording


Last edited by BigEd on Wed Oct 20, 2021 6:48 am, edited 1 time in total.

PostPosted: Wed Oct 20, 2021 5:53 am 
Joined: Wed Jul 01, 2020 6:15 pm
Posts: 79
Location: Germany
Yes, I have been thinking of an 8-bit latch or two to work around the RP2040's limited number of GPIO pins, and concluded that the address bits would lend themselves best to that.

But I was hoping to give the new PCB full flexibility to also support processors outside of the 65xx family, using bi-directional drivers on all 40 pins. (Plus solder jumpers for GND and Vcc on selected pins, as required by, say, the 65xx, 68xx, Z80. Not unlike the venerable GODIL board, but with a more powerful core and adhering to the DIP-40 form factor.) In which case I no longer know where the address lines are going to be...

Also, the RP2040 won't be fast enough if >= 100 MHz emulation speed is the goal, I think. But it is such a tempting little chip... Maybe worth a separate project, e.g. as a pure replacement for some obsolete chip without much acceleration in mind?


PostPosted: Wed Oct 20, 2021 6:52 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
We will perhaps have to wait for the RP2040 successor, whatever that may be! It's certainly fast enough to act as a retro-speed system or peripheral, and quite a bit more, but almost certainly will struggle to beat 100MHz emulated performance.

Druzyek is of course right to survey the field - I'm not familiar with what's on offer, but microcontrollers surely keep getting cheaper and faster at an encouraging rate. (The RP2040 and its PIOs have the interesting feature of being very deterministic and cycle-countable. A more aggressive higher-performance architecture may not have that, which may make hard real-time bit-banging more of a challenge.)


PostPosted: Wed Oct 20, 2021 1:07 pm 
Joined: Mon May 12, 2014 6:18 pm
Posts: 365
Apparently someone has gotten an RP2040 up to 400MHz (I don't have a link though). As you may know, the RP2040 does not have any internal flash and uses XIP to read instructions from an external serial memory. The instructions are automatically cached, so it's not much of a bottleneck from what I've heard. 133MHz is a common speed for those which is why I would guess that is the max speed. You might be able to read some code into RAM then execute it at a higher speed.


PostPosted: Wed Oct 20, 2021 8:52 pm 
Joined: Sat Nov 11, 2017 1:08 pm
Posts: 33
My RP2040 6502 core referred to earlier aims to be fast but not cycle accurate. On average the 6502 clock rate is 1/4 of the ARM M0 clock rate. My Pi JIT 6502 core runs at 1/2.5 of the ARM clock on a Pi Zero, 1/1.4 on a Pi 3B and roughly 1:1 on a Pi 4. Self-modifying code reduces these figures. The entire JIT core is <15 Kbytes but does require 256k+128k of workspace plus the original 64k of RAM.
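
To put those ratios into numbers, here is a small Python sketch. The clock figures (133 MHz for the RP2040, 1 GHz for the Pi Zero) are the ones quoted elsewhere in this thread; the dictionary keys and helper name are made up for illustration.

```python
# ARM clock cycles spent per emulated 6502 cycle, from the ratios above.
ARM_CYCLES_PER_6502_CYCLE = {
    "RP2040 core": 4.0,
    "Pi Zero JIT core": 2.5,
}

# Clock speeds as quoted elsewhere in this thread (assumed, not measured here).
CLOCK_MHZ = {
    "RP2040 core": 133,
    "Pi Zero JIT core": 1000,
}


def emulated_mhz(name: str) -> float:
    """Average emulated 6502 clock for well-behaved code."""
    return CLOCK_MHZ[name] / ARM_CYCLES_PER_6502_CYCLE[name]


# Workspace plus the 64k 6502 address space, excluding the <15 Kbyte core.
JIT_WORKSPACE_KB = 256 + 128 + 64

for name in CLOCK_MHZ:
    print(f"{name}: {emulated_mhz(name):.2f} MHz emulated")
print(f"JIT data footprint: {JIT_WORKSPACE_KB} KB")
```

The memory total matters for part selection: it is well beyond the 128 kByte of on-chip RAM asked for at the top of the thread.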


PostPosted: Wed Oct 20, 2021 9:06 pm 
Joined: Wed Jul 01, 2020 6:15 pm
Posts: 79
Location: Germany
dp11 wrote:
My Pi JIT 6502 core runs at 1/2.5 of the ARM clock on a Pi Zero, 1/1.4 on a Pi 3B and roughly 1:1 on a Pi 4. Self-modifying code reduces these figures. The entire JIT core is <15 Kbytes but does require 256k+128k of workspace plus the original 64k of RAM.


Wow -- so 400 MHz emulation speed for well-behaved 6502 code on the Pi Zero with its 1 GHz clock? That is impressive indeed! Is the source code for that project available or still in the works?

The Pi Zero SoC might work nicely for my purposes, with its 512 MB of stacked RAM and a compact package. But is that chip available separately at all, or would one have to scavenge Pi Zero boards? (A Pi Zero piggy-backed on an adapter & level translator PCB would be a nice development platform, but eventually I would like a proper DIP-40 form factor.)


PostPosted: Wed Oct 20, 2021 9:15 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
That would need to be scavenging, I believe. (Thanks Dominic for the numbers!)


PostPosted: Wed Oct 20, 2021 10:16 pm 
Joined: Sat Nov 11, 2017 1:08 pm
Posts: 33
I should say the Pi Zero non-JIT core runs at about 290MHz. The code is on GitHub; search for PiTubeDirect. I have the last few bits to do on the JIT. Soon it should be merged into master.

You can't buy the Pi Zero silicon unless you talk to Broadcom directly and want to buy a lot of them, AFAIK.

