Re: 6502 emulation: looking for microcontroller recommendati
Posted: Sat Oct 23, 2021 3:42 pm
by hoglet
On interrupts, yes, the microcontroller can take an interrupt and perhaps do something useful in the ISR, but to communicate that to the emulated 6502, the 6502 probably needs to take an interrupt to go into its ISR to do what it needs to do, and this has to happen on an emulated instruction boundary - and preferably at low cost.
Possibly "needs to" is a bit strong here.
Dominic's new JIT-6502 Co Processor in PiTubeDirect, for example, will happily allow a 6502 instruction to be interrupted mid-execution, and this hasn't yet caused any incompatibilities. It works, I think, because an interrupt service routine rarely cares exactly where the main program is at. As long as the 6502 instructions execute correctly when interrupted, that should be sufficient.
I'm sure there will be counter examples, but it's worth considering this as an option.
Dave
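For contrast, the conventional approach in the quoted text (take the interrupt only on an emulated instruction boundary) can be sketched as below. This is a minimal model, not code from PiTubeDirect or any emulator in this thread: `cpu_t`, `push` and `poll_irq` are hypothetical names, and clearing `irq_pending` when the interrupt is taken is a simplification, since a real 6502 IRQ line is level-sensitive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified emulator state. */
typedef struct {
    uint16_t pc;
    uint8_t  sp;
    uint8_t  p;                  /* status register; bit 2 is the I flag */
    uint8_t  mem[65536];
    volatile bool irq_pending;   /* set by the microcontroller's own ISR */
} cpu_t;

/* Push one byte onto the 6502 stack in page 1. */
static void push(cpu_t *c, uint8_t v) { c->mem[0x0100 + c->sp--] = v; }

/* Call once between emulated instructions. Returns true if the
 * emulated 6502 was redirected to its IRQ handler. */
static bool poll_irq(cpu_t *c) {
    if (!c->irq_pending || (c->p & 0x04))
        return false;                      /* nothing pending, or I flag set */
    c->irq_pending = false;                /* simplification, see lead-in */
    push(c, (uint8_t)(c->pc >> 8));        /* PCH */
    push(c, (uint8_t)(c->pc & 0xFF));      /* PCL */
    push(c, (uint8_t)((c->p | 0x20) & ~0x10)); /* P with B clear, bit 5 set */
    c->p |= 0x04;                          /* mask further IRQs */
    c->pc = (uint16_t)(c->mem[0xFFFE] | (c->mem[0xFFFF] << 8));
    return true;
}
```

The cost of this scheme is one flag test per emulated instruction, which is the "low cost" concern raised in the quote.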
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Sat Oct 23, 2021 3:49 pm
by BigEd
Ah, yes, I had temporarily forgotten this new and clever scheme: is it not, in a sense, that a second 6502 emulation context runs the 6502 ISR code?
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Sat Oct 23, 2021 4:13 pm
by hoglet
Each invocation of the 6502 IRQ or NMI handler is a separate emulation context with undefined A/X/Y register values on entry. There is, however, still a shared stack, and a single stack pointer register.
It ended up like this through necessity rather than choice, and it's obviously a little different from a real 6502.
Dave
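A toy model of that description, for illustration only: it is not the PiTubeDirect implementation, and `regs_t`, `shared_stack_t` and `handler_context` are hypothetical names. Each handler invocation gets its own A/X/Y (filled here with an arbitrary value to stand in for "undefined"), while the stack page and stack pointer are shared with the main program.

```c
#include <stdint.h>

/* Registers private to one emulation context. */
typedef struct { uint8_t a, x, y; } regs_t;

/* State shared by the main program and every handler context. */
typedef struct {
    uint8_t page1[256];   /* the 6502 stack page */
    uint8_t sp;           /* the single stack pointer */
} shared_stack_t;

static void    push(shared_stack_t *s, uint8_t v) { s->page1[s->sp--] = v; }
static uint8_t pull(shared_stack_t *s)            { return s->page1[++s->sp]; }

/* One handler invocation. The handler body mimics a typical
 * PHA ... PLA sequence, which still works because the stack is shared. */
static void handler_context(shared_stack_t *s) {
    regs_t r = { 0xAA, 0xAA, 0xAA };  /* arbitrary: values are undefined on entry */
    push(s, r.a);                     /* PHA */
    r.a = 0x01;                       /* handler work that clobbers A */
    r.a = pull(s);                    /* PLA */
    (void)r;                          /* context is discarded on RTI */
}
```

In this model a handler that saves and restores registers in the usual way behaves as expected, but one that relies on seeing the main program's A, X or Y on entry would not.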
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Sun Oct 24, 2021 5:27 pm
by BigEd
I've just remembered one of the observations about the Teensy vs the Pico: the Teensy's pinout, in the versions we looked at, doesn't make it easy to drive or sample, say, a 16-bit bus in one go. There's more shifting and masking than would be ideal. That said, with a 600 MHz processor overclockable to 900 MHz, perhaps a few extra instructions aren't a big deal.
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Sun Oct 24, 2021 6:37 pm
by MicroCoreLabs
Yes, I used this shifting and masking on the MCL65+ to improve the Teensy 4.1's IO timing:
Code:
// -------------------------------------------------
// Drive the 6502 Address pins
// -------------------------------------------------
inline void send_address(uint32_t local_address) {
    register uint32_t writeback_data=0;

    writeback_data = (0x6DFFFFF3 & GPIO6_DR);                          // Read in current GPIOx register value and clear the bits we intend to update
    writeback_data = writeback_data | (local_address & 0x8000)<<10 ;   // 6502_Address[15]  TEENSY_PIN23  GPIO6_DR[25]
    writeback_data = writeback_data | (local_address & 0x2000)>>10 ;   // 6502_Address[13]  TEENSY_PIN0   GPIO6_DR[3]
    writeback_data = writeback_data | (local_address & 0x1000)>>10 ;   // 6502_Address[12]  TEENSY_PIN1   GPIO6_DR[2]
    writeback_data = writeback_data | (local_address & 0x0002)<<27 ;   // 6502_Address[1]   TEENSY_PIN38  GPIO6_DR[28]
    GPIO6_DR       = writeback_data | (local_address & 0x0001)<<31 ;   // 6502_Address[0]   TEENSY_PIN27  GPIO6_DR[31]

    writeback_data = (0xCFF3EFFF & GPIO7_DR);                          // Read in current GPIOx register value and clear the bits we intend to update
    writeback_data = writeback_data | (local_address & 0x0400)<<2  ;   // 6502_Address[10]  TEENSY_PIN32  GPIO7_DR[12]
    writeback_data = writeback_data | (local_address & 0x0200)<<20 ;   // 6502_Address[9]   TEENSY_PIN34  GPIO7_DR[29]
    writeback_data = writeback_data | (local_address & 0x0080)<<21 ;   // 6502_Address[7]   TEENSY_PIN35  GPIO7_DR[28]
    writeback_data = writeback_data | (local_address & 0x0020)<<13 ;   // 6502_Address[5]   TEENSY_PIN36  GPIO7_DR[18]
    GPIO7_DR       = writeback_data | (local_address & 0x0008)<<16 ;   // 6502_Address[3]   TEENSY_PIN37  GPIO7_DR[19]

    writeback_data = (0xFF3BFFFF & GPIO8_DR);                          // Read in current GPIOx register value and clear the bits we intend to update
    writeback_data = writeback_data | (local_address & 0x0100)<<14 ;   // 6502_Address[8]   TEENSY_PIN31  GPIO8_DR[22]
    writeback_data = writeback_data | (local_address & 0x0040)<<17 ;   // 6502_Address[6]   TEENSY_PIN30  GPIO8_DR[23]
    GPIO8_DR       = writeback_data | (local_address & 0x0004)<<16 ;   // 6502_Address[2]   TEENSY_PIN28  GPIO8_DR[18]

    writeback_data = (0x7FFFFF6F & GPIO9_DR);                          // Read in current GPIOx register value and clear the bits we intend to update
    writeback_data = writeback_data | (local_address & 0x4000)>>10 ;   // 6502_Address[14]  TEENSY_PIN2   GPIO9_DR[4]
    writeback_data = writeback_data | (local_address & 0x0800)>>4  ;   // 6502_Address[11]  TEENSY_PIN33  GPIO9_DR[7]
    GPIO9_DR       = writeback_data | (local_address & 0x0010)<<27 ;   // 6502_Address[4]   TEENSY_PIN29  GPIO9_DR[31]

    return;
}
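The bit scatter in the routine above can be checked on a host PC by separating the pure computation from the register writes. The sketch below reuses the masks and shift counts from the posted code; `gpio_words_t` and `scatter_address` are hypothetical names introduced here and are not part of the MCL65+ source.

```c
#include <stdint.h>

/* The bits to OR into GPIO6..GPIO9 for one 16-bit address. */
typedef struct { uint32_t g6, g7, g8, g9; } gpio_words_t;

/* Pure function: no hardware access, so it runs anywhere. */
static gpio_words_t scatter_address(uint32_t a) {
    gpio_words_t w;
    w.g6 = ((a & 0x8000u) << 10) | ((a & 0x2000u) >> 10) | ((a & 0x1000u) >> 10)
         | ((a & 0x0002u) << 27) | ((a & 0x0001u) << 31);
    w.g7 = ((a & 0x0400u) << 2)  | ((a & 0x0200u) << 20) | ((a & 0x0080u) << 21)
         | ((a & 0x0020u) << 13) | ((a & 0x0008u) << 16);
    w.g8 = ((a & 0x0100u) << 14) | ((a & 0x0040u) << 17) | ((a & 0x0004u) << 16);
    w.g9 = ((a & 0x4000u) >> 10) | ((a & 0x0800u) >> 4)  | ((a & 0x0010u) << 27);
    return w;
}
```

A useful sanity check is that address 0xFFFF sets exactly the bits that the four clear masks (0x6DFFFFF3, 0xCFF3EFFF, 0xFF3BFFFF, 0x7FFFFF6F) leave at zero, which confirms that the masks and shifts agree with each other.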
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 5:49 am
by 65f02
Thanks for pointing that out, Ed and Ted. Seeing Ted's code is actually encouraging for me: If one can get away with that, then it should not be a problem to use the Teensy CPU in an accelerator design which is meant to support different emulation targets with different pinouts.
Ted, I looked at your code for the MCL65+ on GitHub and noticed that you even do all the address shuffling after having detected the positive clock edge. For write cycles you then send the data byte as eight separate bits too. And that still works nicely with the Apple's 1 MHz clock, right?
That is quite reassuring. I would trust that with some optimization (preparing as much as possible before waiting for the clock edge, and using assembler code if required) I should be able to keep up with the faster host clocks used e.g. in many chess computers -- 5 MHz 65C02s were commonly used in the late 1980s.
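To put rough numbers on that, the budget per host bus cycle is just the ratio of the two clocks. The helper below is a back-of-the-envelope sketch; `cycles_per_bus_cycle` is a hypothetical name, and the 600 MHz figure is the stock Teensy clock mentioned earlier in the thread.

```c
#include <stdint.h>

/* Microcontroller clock cycles available per host bus cycle (rounded down). */
static uint32_t cycles_per_bus_cycle(uint32_t cpu_hz, uint32_t host_hz) {
    return cpu_hz / host_hz;
}
```

With a 600 MHz core this gives 600 cycles per bus cycle at 1 MHz, but only 120 at 5 MHz, and the work after the clock edge has to fit in a fraction of that. This is why moving the address preparation ahead of the edge matters more at the higher host clocks.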
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 6:29 am
by 65f02
Yes, when not running cycle-accurate the acceleration is for code execution and not for the I/O's.
How fast does the MCL65+ run for purely internal code execution? Say, just a loop with many iterations in BASIC, with some computations but no I/O?
Ted, would you have a data point on the above? When you run a program on the Apple which does internal computation only, at what effective speed does the MCL65+ execute it? Thanks!
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 4:45 pm
by MicroCoreLabs
Hard to say how much faster the MCL65+ is compared to the stock 6502 at 1 MHz when running in an Apple II, VIC-20, or a C64... Eventually some on-motherboard resource such as video needs to be accessed, which will slow things down, but 6502 code which only accesses mirrored memory inside of the microcontroller runs extremely fast.
Here is a video of the difference running Brian's Theme:
https://microcorelabs.wordpress.com/202 ... -apple-ii/
And here is another one which is not a 6502 emulation and only uses the Apple's video and keyboard:
https://microcorelabs.wordpress.com/202 ... cl65-fast/
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 6:11 pm
by 65f02
Thanks, Ted. Brian's Theme will of course access the video RAM quite a bit. (Although the effect is not too bad. With the 65F02 I found that, when mirroring the video RAM on-chip so that reads from the video RAM are fast, I still get more than 50x acceleration on Brian's Theme, compared to the "raw" acceleration of 100x.)
But if you time a simple FOR/NEXT loop in BASIC (without any output in the loop), there will be hardly any I/O overhead -- besides the interpreter polling the keyboard occasionally to check whether you have typed Ctrl-C, which is negligible. That allows you to measure the "raw" acceleration of the emulated 6502 over the original.
It would be great if you could do that little test; it would help in getting an idea how fast the Teensy 4 processor is. Thanks!
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 6:41 pm
by MicroCoreLabs
With the 65F02 I found that, when mirroring the video RAM on-chip so that reads from the video RAM are fast, I still get more than 50x acceleration on Brian's Theme, compared to the "raw" acceleration of 100x
I would be interested to see a video of your 65F02 running Brian's Theme.
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Mon Oct 25, 2021 7:15 pm
by 65f02
It's not pleasant to look at, flickering at 3 Hz.
I need to get the 65F02 configured for the Apple again; it's in a chess computer now. (I split the VHDL into two separate configurations, since the routing paths were getting congested in the logic which monitors the address pattern for the different I/O addresses.) And I will need to figure out how to best share a video; I don't have a YouTube account and don't want one. Not sure when I will get around to that, but over the weekend at the latest.
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Tue Oct 26, 2021 7:33 pm
by 65f02
Alright, here is a quick clip of "Brian's Theme", the 1979 Apple II demo, being loaded from disk and run with the 65F02 in the CPU socket. For comparison, here is the original speed on a 1 MHz Apple II (someone else's video).
As stated before, the acceleration factor is >50 even with the graphics output. Apparently drawing an angled line still involves quite a bit of internal computation which gets accelerated. And as stated before, accelerating the reads from video RAM (by caching the RAM on-chip) has a surprisingly large benefit. Thanks again to Ed, by the way, who originally suggested caching in last year's thread on the 65F02!
But the thing I am most proud of is actually that the floppy disk still works with the accelerator on, since the 65F02 automatically finds all of Woz's cycle-counted code to drive the disk and time the nibbles.

Re: 6502 emulation: looking for microcontroller recommendati
Posted: Tue Oct 26, 2021 8:48 pm
by MicroCoreLabs
I guess when you run at 10 ns per instruction you will get a blazingly fast result! That's quite fast indeed...
If we assume there are at least 10 ARM instructions for every emulated opcode, then even the overclocked Teensy would be no faster than the equivalent of 80 MHz... and it's likely there are many more than 10 instructions that the C code actually compiles to...
Mirroring everything except the writes to the video memory ranges and eliminating the original 6502's over-fetches and double-writes helps, but not by a lot.
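The mirroring scheme described here can be sketched as follows: all reads are served from an on-chip copy, and only writes to the video ranges are passed through to the slow motherboard bus. This is an illustration under assumptions, not MCL65+ or 65F02 code. `bus_write`, `is_video`, `mem_read` and `mem_write` are hypothetical names, the ranges shown are just Apple II text page 1 and hi-res page 1, and the I/O space, which must always go to the real bus, is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

static uint8_t  mirror[65536];   /* on-chip copy of the 6502 address space */
static uint32_t bus_writes;      /* counts slow external bus cycles */

/* Stand-in for a real (slow) write cycle on the host bus. */
static void bus_write(uint16_t addr, uint8_t v) {
    (void)addr; (void)v;
    bus_writes++;
}

/* Assumed video ranges: text page 1 and hi-res page 1 only. */
static bool is_video(uint16_t addr) {
    return (addr >= 0x0400 && addr <= 0x07FF) ||
           (addr >= 0x2000 && addr <= 0x3FFF);
}

/* Reads never touch the bus, including reads of video RAM. */
static uint8_t mem_read(uint16_t addr) { return mirror[addr]; }

/* Writes always update the mirror; video writes also go to the bus. */
static void mem_write(uint16_t addr, uint8_t v) {
    mirror[addr] = v;
    if (is_video(addr))
        bus_write(addr, v);
}
```

Because the mirror is updated on every write, read-modify-write sequences on the display (as in line drawing) pay for one external cycle instead of two.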
Re: 6502 emulation: looking for microcontroller recommendati
Posted: Tue Oct 26, 2021 9:31 pm
by 65f02
I would be more optimistic regarding the performance. If you look at dp11's post earlier in this thread, he mentions an emulated 6502 clock of 1/4 the emulator clock for his Pi Pico (RP2040) emulator.
That's for a tightly optimized emulator where the core parts are written in assembler, running on a not-so-powerful Cortex-M0+. Even neglecting the more efficient architecture of the Teensy's Cortex-M7, that should translate to a 200 MHz 6502 clock on a Teensy overclocked to 800 MHz.
There is probably a bit more overhead involved in emulating all 6502 signals on the DIP socket vs. driving the BBC computer's "tube" interface. (Just guessing here; I have not looked into the "tube" in any detail.) Let's assume that eats up the performance benefit of the M7 vs. M0 cores, then we would still expect an emulated speed of 200 MHz.
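Both estimates in this exchange come down to dividing the microcontroller clock by a cost ratio. The sketch below just makes that arithmetic explicit; `emulated_mhz` is a hypothetical helper, and treating one ARM instruction as one clock cycle is an assumption made here for the first estimate.

```c
#include <stdint.h>

/* Effective emulated rate in MHz, given the host clock in MHz and the
 * number of host cycles spent per emulated unit (opcode or 6502 cycle). */
static uint32_t emulated_mhz(uint32_t host_mhz, uint32_t host_cycles_per_unit) {
    return host_mhz / host_cycles_per_unit;
}
```

With 800 MHz and 10 host cycles per emulated opcode this gives 80 (million opcodes per second), while 800 MHz and 4 host cycles per emulated 6502 clock gives 200. The two figures are not measured in the same unit: a 6502 opcode takes between 2 and 7 clock cycles, so 80 million opcodes per second would correspond to a considerably higher equivalent clock rate than 80 MHz.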
A few assumptions in the above... I intend to try and port Dominic's emulator to a Teensy 4.1 board to hopefully get confirmation. That will take a while: there are some nice features in the i.MX RT1062 processor when you program it low-level (e.g. a very flexible crossbar and basic logic unit, so you can hard-wire the output clocks Phi1 and Phi2 to Phi0 and don't need to give them any attention in software). But that comes at the cost of a 3400-page reference manual for the processor, and a heavyweight programming environment.

Re: 6502 emulation: looking for microcontroller recommendati
Posted: Tue Oct 26, 2021 11:12 pm
by dp11
Please do play with my stuff. But I will warn you that it is heavily optimised for the Tube. If you spot any performance improvements, do let me know. Decimal ADC feels as if a few extra cycles could be removed, but I haven't put any time into it.