emulator performance on embedded cpu

Topics pertaining to the emulation or simulation of the 65xx microprocessors and their peripheral chips.
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)

Re: emulator performance on embedded cpu

Post by BigDumbDinosaur »

BigEd wrote:
be quick to catch this Kickstarter: from $22 for 48MHz ARM dev board
(will be available commercially in due course.)

Teensy it sure is! :o I'm not sure an old dinosaur with claws instead of fingers could even pick up this thing, let alone connect it to anything. :lol:
x86?  We ain't got no x86.  We don't NEED no stinking x86!
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

(As I understand it, there are more GPIOs on the underside. Also, overclocking to 96MHz is possible if you want to live dangerously.)
Now, you're not saying you can't confidently handle a 28-pin DIP are you??!
Cheers
Ed
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

Or, $19 for 72MHz ARM dev board in DIP format (this fellow may have less of a track record, I'm not sure)
Kickstarter open for 13 more days: http://www.kickstarter.com/projects/kuy ... ngs-better
Commercial site (price is $29 here): http://outbreak.co/galago

Edit: only 8kByte RAM
Last edited by BigEd on Sat Sep 15, 2012 5:51 pm, edited 1 time in total.
BitWise
In Memoriam
Posts: 996
Joined: 02 Mar 2004
Location: Berkshire, UK

Re: emulator performance on embedded cpu

Post by BitWise »

I like the ARM CPU but all the NXP and Atmel devices come with such tiny amounts of RAM on chip.
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

Good point: I've updated the recent posts.
BitWise
In Memoriam
Posts: 996
Joined: 02 Mar 2004
Location: Berkshire, UK

Re: emulator performance on embedded cpu

Post by BitWise »

Over Xmas I went back to my 65C02 emulator on a Microchip 24F device and got it to pass Klaus' test suite, including the BCD arithmetic. The performance looks pretty good but I haven't run a 6502 benchmark yet to figure out its equivalent speed.

Currently I'm using a 56K serial link to download code to its monitor via a USB-Serial cable. I have access to a friend's USB debugger and want to use it to add CDC support to the PIC's firmware.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

Hi Andrew,
with those larger parts available, can we suppose that your approach could now deliver a 48K RAM environment at about 6MHz 65C02 performance? (From a 70MHz clock)

I notice ST now offer ARM chips with 96K or so of RAM, clocked at 70 or 84 MHz, on dev boards for £10 or so. There are four chips on the boards available from Mouser, but I haven't chased down their specs:
NUCLEO-F401RE
NUCLEO-L152RE
NUCLEO-F103RB
NUCLEO-F030R8

(And for a little more, the Teensy offers up to 64K RAM in a DIP form factor and 5V-tolerant I/O.)
https://www.pjrc.com/teensy/teensy31.html#specs

Cheers
Ed
BitWise
In Memoriam
Posts: 996
Joined: 02 Mar 2004
Location: Berkshire, UK

Re: emulator performance on embedded cpu

Post by BitWise »

BigEd wrote:
Hi Andrew,
with those larger parts available, can we suppose that your approach could now deliver a 48K RAM environment at about 6MHz 65C02 performance? (From a 70MHz clock)

I notice ST now offer ARM chips with 96K or so of RAM, clocked at 70 or 84 MHz, on dev boards for £10 or so. There are four chips on the boards available from Mouser, but I haven't chased down their specs:
NUCLEO-F401RE
NUCLEO-L152RE
NUCLEO-F103RB
NUCLEO-F030R8

(And for a little more, the Teensy offers up to 64K RAM in a DIP form factor and 5V-tolerant I/O.)
https://www.pjrc.com/teensy/teensy31.html#specs

Cheers
Ed
My code only uses 32K of the 48K for emulator RAM. I could probably use more but it would take more instructions to select the correct section and that would adversely affect the speed.

The $8000-$FFFF address range is always mapped into the 512K flash. I was thinking of making it bank switched so I could emulate a BBC B with multiple sideways ROMs. The only catch is I would have to repeat the BBC operating system emulation in every 32K bank, but there is room for 15 such banks in the latest 33EP512G202 chip.
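
(A rough sketch of the memory map described above, in C. This is not the actual PIC firmware: `ram`, `rom_bank`, `attach_bank`, `select_bank`, `mem_read` and `mem_write` are hypothetical names, and the bank images are assumed to be plain 32K arrays. It only illustrates why a 32K/32K split is cheap: a single test of address bit 15 chooses between RAM and the current flash bank.)

```c
#include <stdint.h>

/* All names and sizes below are illustrative assumptions, not the original firmware. */
#define RAM_SIZE  0x8000u   /* 32K of emulated RAM at $0000-$7FFF */
#define NUM_BANKS 15u       /* number of 32K banks mentioned in the post */

static uint8_t ram[RAM_SIZE];
static const uint8_t *rom_bank[NUM_BANKS]; /* each entry points at a 32K image */
static uint8_t current_bank;

/* Register a 32K ROM image for one bank (hypothetical helper). */
void attach_bank(uint8_t bank, const uint8_t *image)
{
    if (bank < NUM_BANKS)
        rom_bank[bank] = image;
}

/* Switch which bank appears at $8000-$FFFF; out-of-range requests are ignored. */
void select_bank(uint8_t bank)
{
    if (bank < NUM_BANKS)
        current_bank = bank;
}

/* One test of address bit 15 picks RAM or the current ROM bank. */
uint8_t mem_read(uint16_t addr)
{
    if (addr & 0x8000u) {
        const uint8_t *rom = rom_bank[current_bank];
        return rom ? rom[addr & 0x7FFFu] : 0xFFu; /* unpopulated bank reads as $FF */
    }
    return ram[addr];
}

/* Writes land in RAM only; writes to the ROM half are dropped. */
void mem_write(uint16_t addr, uint8_t value)
{
    if (!(addr & 0x8000u))
        ram[addr] = value;
}
```

Using more than 32K of RAM would need a second comparison in `mem_read` and `mem_write`, which is the extra per-access cost mentioned above.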

For other processor emulators I've been using PIC32MX795F512H chips with 128K RAM and 512K+12K of flash (not that I need that much) at 80 DMIPS but haven't got to the point of benchmarking them yet. I suspect they will be a little faster than the dsPIC.

And then of course there's the new PIC32MZ which has 2MB of flash, 512K RAM and 330 DMIPS. Should make a 25-35MHz 65C816 SBC on a single chip feasible. I'm just waiting for someone to start making a reasonably priced module. Meanwhile I think I'll have to get one of these http://www.microchip.com/Developmenttoo ... O=MA320012 for £15 for initial testing.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

Now that looks very attractive!
cr1901
Posts: 158
Joined: 05 Feb 2014

Re:

Post by cr1901 »

kc5tja wrote:
MEANING 1: Single-cycle instruction execution. Answer: no. It already achieves this, and in fact, Intel has been executing its instructions in a single cycle as far back as the 80486.
That's only because of the '486's RISC-like pipeline though. WDC claims that the '816 is pipelined like a RISC, but the examples I see in their '816 manual are more akin to the Execution Unit and Bus Interface Unit separation of the 8086/8.

If the '816 has a RISC-like pipeline, there could be potentially significant speed-ups to an extent, and WDC also claims that cache can be implemented with VDA/VPA signals on the '816, but... I feel that at higher clock speeds, the '816 will be starved for data more than a '486 at the same clock speed because the '816 can do a memory access in one clock cycle, and memory speed has NOT kept up with processor speed.

I love x86, but I don't know anyone who doesn't think it's a mess. Why WDC hasn't made a SIMPLE '816 extension with more registers (32-bit Z and 32-bit W, as 32-bit X and 32-bit Y alternatives, D accumulator for 32-bit operation), SIMPLE interface to cache, a SIMPLE pipeline, and SIMPLE branch predictor, and SIMPLE scoreboarding... who knows? All those concepts in principle aren't too difficult, except maybe implementing the pipeline and scoreboarding*. The era of being able to know exactly how many clock cycles your program will take died by the 8088.

*Full disclosure: I don't actually fully understand scoreboarding other than it permits dynamically rearranging instructions already in the pipeline.
Dr Jefyll
Posts: 3526
Joined: 11 Dec 2009
Location: Ontario, Canada

Re: Re:

Post by Dr Jefyll »

cr1901 wrote:
WDC claims that the '816 is pipelined like a RISC, but the examples I see in their '816 manual are more akin to the Execution Unit and Bus Interface Unit separation of the 8086/8.
Both interpretations are rather generous. The '816 may occasionally overlap a bus access with an internal operation, but that accomplishment seems modest indeed when you consider it's common for a RISC pipeline to overlap three or more operations (eg: fetch-decode-execute), and sustain that almost continuously. As for the x86 BIU, the 8088 or 8086 can prefetch 4 or 6 bytes respectively from the code stream, whereas the '816 can prefetch only one byte.
cr1901 wrote:
I love x86, but I don't know anyone who doesn't think it's a mess.
From the coding perspective I like x86 alright. (I find Intel's industry dominance off-putting, but that's an entirely separate matter.) As for "a mess," the only downright daft aspect of x86 that comes to my mind is the fact that logical operations such as AND and XOR etc clear the Carry Flag. What were they thinking??

My guess is, it's for backward compatibility from 8086 -> 8080 -> 8008... and probably right back to the 4004! And I bet it originated not as a conscious decision but as a hardware quirk, one that was subsequently tolerated (and later became entrenched).
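
(For anyone writing an emulator, the difference can be sketched in C as below. The function names `and_6502` and `and_x86` are made up for illustration, and this is a simplified model rather than code from any real emulator: the x86 parity flag, which the real instruction also updates, is left out to keep it short.)

```c
#include <stdint.h>

/* 6502 status register bits */
#define P_C 0x01u
#define P_Z 0x02u
#define P_N 0x80u

/* x86 FLAGS register bits */
#define X86_CF 0x0001u
#define X86_ZF 0x0040u
#define X86_SF 0x0080u
#define X86_OF 0x0800u

/* 6502 AND: updates N and Z from the result and leaves Carry untouched. */
uint8_t and_6502(uint8_t a, uint8_t operand, uint8_t *p)
{
    uint8_t r = a & operand;
    *p &= (uint8_t)~(P_N | P_Z);
    if (r == 0)
        *p |= P_Z;
    if (r & 0x80u)
        *p |= P_N;
    return r;
}

/* x86 8-bit AND: clears CF and OF, sets ZF and SF from the result.
 * PF is omitted here for brevity (a simplification of this sketch). */
uint8_t and_x86(uint8_t a, uint8_t operand, uint16_t *flags)
{
    uint8_t r = a & operand;
    *flags &= (uint16_t)~(X86_CF | X86_OF | X86_ZF | X86_SF);
    if (r == 0)
        *flags |= X86_ZF;
    if (r & 0x80u)
        *flags |= X86_SF;
    return r;
}
```

With Carry set beforehand, `and_6502` returns with Carry still set, while `and_x86` returns with CF cleared, so a multi-byte add interleaved with masking survives on the 6502 but not on x86.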

-- Jeff
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Re:

Post by barrym95838 »

Dr Jefyll wrote:
... the only downright daft aspect of x86 that comes to my mind is the fact that logical operations such as AND and XOR etc clear the Carry Flag. What were they thinking??
...
Let's face it, the 65xx method for condition code side-effects is the only one that makes sense. I did some 68xx programming a while ago, and was highly annoyed at not being able to perform an easy optimization because all of the store instructions messed with the flags! Having to TST an accumulator immediately after a PUL was equally annoying.

There are some processors that give you explicit control over flag effects, but the 65xx defaults just "feel right" to me, and I wouldn't change a thing, even if I could.

Mike
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

For reference, a large collection of ARM dev boards with brief specs (but not prices - they are a couple of clicks away)
http://mbed.org/platforms/ (Edit: now redirects to https://os.mbed.com/platforms/)
Last edited by BigEd on Thu Feb 08, 2018 2:43 pm, edited 1 time in total.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: emulator performance on embedded cpu

Post by BigEd »

Another one which looks attractive - $24 for a 168MHz ARM with 2 MB of Flash memory, 256 KB of RAM onboard and also 8MByte of external SDRAM and an LCD display:
http://www.st.com/web/catalog/tools/FM1 ... 9/PF259090
JimDrew
Posts: 107
Joined: 14 Oct 2012

Re: emulator performance on embedded cpu

Post by JimDrew »

My 6502 emulation on a PIC24 takes 8-27 (~14 average) instruction cycles per 6502 instruction. At 40 MIPS, a 2-cycle 6502 instruction (NOP, DEY, INY, etc.) at 1 MHz leaves 80 PIC instruction cycles to work with, so running flat out comes to about 6 MHz. At 70 MIPS, I am around 10 MHz. But I don't use my emulation for speed. I actually run the instruction fetch through a timer interrupt and set the compare value to the instruction time. This gives me a cycle-exact CPU emulation and tons of time left to handle the emulation of other chips, update the LCD, etc.
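
(The arithmetic behind those figures can be sketched in C as below. The function names are hypothetical and this is only a back-of-envelope model, assuming the ~14 host cycles per emulated instruction and the 2-cycle 6502 instructions quoted above. It does not touch any real PIC24 timer registers.)

```c
#include <stdint.h>

/* Host instruction cycles that elapse during one 6502 instruction
 * when the emulated CPU is paced at target_hz (the timer compare value). */
uint32_t host_cycles_available(uint32_t host_ips, uint32_t target_hz,
                               uint32_t cycles_6502)
{
    return (uint32_t)(((uint64_t)host_ips * cycles_6502) / target_hz);
}

/* Host cycles left over for other work after emulating one instruction. */
int32_t spare_cycles(uint32_t available, uint32_t used)
{
    return (int32_t)available - (int32_t)used;
}

/* Equivalent 6502 clock in MHz when the emulator runs flat out. */
double equivalent_mhz(double host_mips, double cycles_6502,
                      double host_cycles_per_insn)
{
    return host_mips * cycles_6502 / host_cycles_per_insn;
}
```

For example, `host_cycles_available(40000000, 1000000, 2)` gives the 80 cycles mentioned above, and `spare_cycles(80, 14)` shows how much of each paced instruction slot is free for emulating other chips or driving the LCD.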