
Re: emulator performance on embedded cpu

Posted: Wed Sep 12, 2012 6:15 pm
by BigDumbDinosaur
BigEd wrote:
be quick to catch this Kickstarter: from $22 for 48MHz ARM dev board
(will be available commercially in due course.)

Teensy it sure is! :o I'm not sure an old dinosaur with claws instead of fingers could even pick up this thing, let alone connect it to anything. :lol:

Re: emulator performance on embedded cpu

Posted: Wed Sep 12, 2012 6:20 pm
by BigEd
(As I understand it, there are more GPIOs on the underside. Also, overclocking to 96MHz is possible if you want to live dangerously.)
Now, you're not saying you can't confidently handle a 28-pin DIP are you??!
Cheers
Ed

Re: emulator performance on embedded cpu

Posted: Fri Sep 14, 2012 4:13 pm
by BigEd
Or, $19 for 72MHz ARM dev board in DIP format (this fellow may have less of a track record, I'm not sure)
Kickstarter open for 13 more days: http://www.kickstarter.com/projects/kuy ... ngs-better
Commercial site (price is $29 here): http://outbreak.co/galago

Edit: only 8kByte RAM

Re: emulator performance on embedded cpu

Posted: Fri Sep 14, 2012 10:27 pm
by BitWise
I like the ARM CPU but all the NXP and Atmel devices come with such tiny amounts of RAM on chip.

Re: emulator performance on embedded cpu

Posted: Sat Sep 15, 2012 5:54 pm
by BigEd
Good point: I've updated the recent posts.

Re: emulator performance on embedded cpu

Posted: Sun Jan 06, 2013 1:49 am
by BitWise
Over Xmas I went back to my 65C02 emulator on a Microchip 24F device and got it to pass Klaus' test suite, including the BCD arithmetic. The performance looks pretty good but I haven't run a 6502 benchmark yet to figure out its equivalent speed.

Currently I'm using a 56K serial link to download code to its monitor via a USB-Serial cable. I have access to a friend's USB debugger and want to use it to add CDC support to the PIC's firmware.

Re: emulator performance on embedded cpu

Posted: Thu May 15, 2014 4:53 pm
by BigEd
Hi Andrew,
with those larger parts available, can we suppose that your approach could now deliver a 48K RAM environment at about 6MHz 65C02 performance? (From a 70MHz clock)

I notice ST now offer ARM chips with 96K or so of RAM, clocked at 70 or 84 MHz, on dev boards for £10 or so. There are four chips on the boards available from Mouser, but I haven't chased down their specs:
NUCLEO-F401RE
NUCLEO-L152RE
NUCLEO-F103RB
NUCLEO-F030R8

(And for a little more, the Teensy offers up to 64K RAM in a DIP form factor and 5V-tolerant I/O:
https://www.pjrc.com/teensy/teensy31.html#specs )

Cheers
Ed

Re: emulator performance on embedded cpu

Posted: Thu May 15, 2014 6:36 pm
by BitWise
BigEd wrote:
Hi Andrew,
with those larger parts available, can we suppose that your approach could now deliver a 48K RAM environment at about 6MHz 65C02 performance? (From a 70MHz clock)

I notice ST now offer ARM chips with 96K or so of RAM, clocked at 70 or 84 MHz, on dev boards for £10 or so. There are four chips on the boards available from Mouser, but I haven't chased down their specs:
NUCLEO-F401RE
NUCLEO-L152RE
NUCLEO-F103RB
NUCLEO-F030R8

(And for a little more, the Teensy offers up to 64K RAM in a DIP form factor and 5V-tolerant I/O:
https://www.pjrc.com/teensy/teensy31.html#specs )

Cheers
Ed
My code only uses 32K of the 48K for emulator RAM. I could probably use more but it would take more instructions to select the correct section and that would adversely affect the speed.

The $8000-$FFFF address range is always mapped into the 512K flash. I was thinking of making it bank switched so I could emulate a BBC B with multiple sideways ROMs. The only catch is that I would have to repeat the BBC operating system emulation in every 32K bank, but there is room for 15 such banks in the latest 33EP512G202 chip.
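
As a rough illustration of the memory map described above (not the actual emulator code, which runs on a dsPIC), here is a minimal Python sketch. The class name `BankedMemory` and its methods are hypothetical; the only facts taken from the post are 32K of RAM at $0000-$7FFF, a 32K flash window at $8000-$FFFF, and room for 15 banks. Ignoring writes to the flash window is an assumption.

```python
RAM_SIZE = 0x8000    # 32K of emulated RAM at $0000-$7FFF
BANK_SIZE = 0x8000   # each flash bank fills the $8000-$FFFF window
MAX_BANKS = 15       # figure quoted in the post


class BankedMemory:
    """Hypothetical memory map: RAM below $8000, one selectable flash bank above."""

    def __init__(self, banks):
        # banks: list of bytes objects, each exactly BANK_SIZE long
        if not 1 <= len(banks) <= MAX_BANKS:
            raise ValueError("expected between 1 and 15 banks")
        if any(len(b) != BANK_SIZE for b in banks):
            raise ValueError("every bank must be 32K")
        self.ram = bytearray(RAM_SIZE)
        self.banks = banks
        self.current = 0

    def select_bank(self, n):
        if not 0 <= n < len(self.banks):
            raise ValueError("no such bank")
        self.current = n

    def read(self, addr):
        addr &= 0xFFFF
        if addr & 0x8000:                      # top bit set: flash window
            return self.banks[self.current][addr & 0x7FFF]
        return self.ram[addr]                  # otherwise: RAM

    def write(self, addr, value):
        addr &= 0xFFFF
        if not addr & 0x8000:
            self.ram[addr] = value & 0xFF
        # assumption: writes to the flash window are silently ignored
```

Because the whole upper 32K switches at once in this scheme, anything that must stay visible at fixed addresses (such as the OS) has to be present in every bank, which is the catch mentioned above.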

For other processor emulators I've been using PIC32MX795F512H chips with 128K RAM and 512K+12K of flash (not that I need that much) at 80 DMIPS, but I haven't got to the point of benchmarking them yet. I suspect they will be a little faster than the dsPIC.

And then of course there's the new PIC32MZ, which has 2MB of flash, 512K RAM and 330 DMIPS. That should make a 25-35MHz 65C816 SBC on a single chip feasible. I'm just waiting for someone to start making a reasonably priced module; meanwhile I think I'll have to get one of these http://www.microchip.com/Developmenttoo ... O=MA320012 for £15 for initial testing.

Re: emulator performance on embedded cpu

Posted: Thu May 15, 2014 7:02 pm
by BigEd
Now that looks very attractive!

Re:

Posted: Sat May 31, 2014 8:24 pm
by cr1901
kc5tja wrote:
MEANING 1: Single-cycle instruction execution. Answer: no. It already achieves this, and in fact, Intel has been executing its instructions in a single cycle as far back as the 80486.
That's only because of the '486's RISC-like pipeline though. WDC claims that the '816 is pipelined like a RISC, but the examples I see in their '816 manual are more akin to the Execution Unit and Bus Interface Unit separation of the 8086/8.

If the '816 has a RISC-like pipeline, there could be potentially significant speed-ups to an extent, and WDC also claims that cache can be implemented with VDA/VPA signals on the '816, but... I feel that at higher clock speeds, the '816 will be starved for data more than a '486 at the same clock speed because the '816 can do a memory access in one clock cycle, and memory speed has NOT kept up with processor speed.

I love x86, but I don't know anyone who doesn't think it's a mess. Why WDC hasn't made a SIMPLE '816 extension with more registers (32-bit Z and 32-bit W, as 32-bit X and 32-bit Y alternatives, D accumulator for 32-bit operation), SIMPLE interface to cache, a SIMPLE pipeline, and SIMPLE branch predictor, and SIMPLE scoreboarding... who knows? All those concepts in principle aren't too difficult, except maybe implementing the pipeline and scoreboarding*. The era of being able to know exactly how many clock cycles your program will take died by the 8088.

*Full disclosure: I don't actually fully understand scoreboarding other than it permits dynamically rearranging instructions already in the pipeline.

Re: Re:

Posted: Tue Jun 03, 2014 2:20 am
by Dr Jefyll
cr1901 wrote:
WDC claims that the '816 is pipelined like a RISC, but the examples I see in their '816 manual are more akin to the Execution Unit and Bus Interface Unit separation of the 8086/8.
Both interpretations are rather generous. The '816 may occasionally overlap a bus access with an internal operation, but that accomplishment seems modest indeed when you consider it's common for a RISC pipeline to overlap three or more operations (eg: fetch-decode-execute), and sustain that almost continuously. As for the x86 BIU, the 8088 or 8086 can prefetch 4 or 6 bytes respectively from the code stream, whereas the '816 can prefetch only one byte.
cr1901 wrote:
I love x86, but I don't know anyone who doesn't think it's a mess.
From the coding perspective I like x86 alright. (I find Intel's industry dominance off-putting, but that's an entirely separate matter.) As for "a mess," the only downright daft aspect of x86 that comes to my mind is the fact that logical operations such as AND and XOR etc clear the Carry Flag. What were they thinking??

My guess is, it's for backward compatibility from 8086 -> 8080 -> 8008... and probably right back to the 4004! And I bet it originated not as a conscious decision but as a hardware quirk, one that was subsequently tolerated (and later became entrenched).
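
To make the flag difference concrete, here is a minimal Python sketch of an 8-bit AND under the two conventions. The function names and the dictionary of flags are made up for illustration, and only a subset of the x86 flags is modelled; the behaviour shown is that x86 AND always clears Carry, while the 65xx AND changes only N and Z and leaves Carry alone.

```python
def x86_and8(a, b, carry_in):
    """8-bit AND, x86 style: the Carry flag is always cleared."""
    r = a & b & 0xFF
    # carry_in is accepted only to show that it is discarded
    return r, {"Z": r == 0, "S": bool(r & 0x80), "C": False}


def mos65xx_and(a, b, carry_in):
    """8-bit AND, 65xx style: only N and Z change, Carry is preserved."""
    r = a & b & 0xFF
    return r, {"Z": r == 0, "N": bool(r & 0x80), "C": carry_in}


# Carry set by an earlier operation, then a mask is applied:
_, x86_flags = x86_and8(0xF3, 0x0F, carry_in=True)
_, mos_flags = mos65xx_and(0xF3, 0x0F, carry_in=True)
print(x86_flags["C"], mos_flags["C"])
```

In a multi-byte add or shift sequence, the 65xx version lets a mask sit between two carry-dependent steps, whereas the x86 version forces the carry to be saved first.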

-- Jeff

Re: Re:

Posted: Tue Jun 03, 2014 5:43 am
by barrym95838
Dr Jefyll wrote:
... the only downright daft aspect of x86 that comes to my mind is the fact that logical operations such as AND and XOR etc clear the Carry Flag. What were they thinking??
...
Let's face it, the 65xx method for condition code side-effects is the only one that makes sense. I did some 68xx programming a while ago, and was highly annoyed at not being able to perform an easy optimization because all of the store instructions messed with the flags! Having to TST an accumulator immediately after a PUL was equally annoying.

There are some processors that give you explicit control over flag effects, but the 65xx defaults just "feel right" to me, and I wouldn't change a thing, even if I could.

Mike

Re: emulator performance on embedded cpu

Posted: Fri Jun 06, 2014 8:20 am
by BigEd
For reference, a large collection of ARM dev boards with brief specs (but not prices - they are a couple of clicks away)
http://mbed.org/platforms/ (Edit: now redirects to https://os.mbed.com/platforms/)

Re: emulator performance on embedded cpu

Posted: Fri Jun 06, 2014 11:37 am
by BigEd
Another one which looks attractive - $24 for a 168MHz ARM with 2 MB of Flash memory, 256 KB of RAM onboard and also 8MByte of external SDRAM and an LCD display:
http://www.st.com/web/catalog/tools/FM1 ... 9/PF259090

Re: emulator performance on embedded cpu

Posted: Sat Aug 02, 2014 6:27 am
by JimDrew
It takes 8-27 (~14 average) instruction cycles per 6502 instruction with my 6502 emulation using a PIC24. At 40 MIPS, that's 80 instruction cycles for a 2-cycle 6502 instruction (NOP, DEY, INY, etc.), which is about 6 MHz in speed. At 70 MIPS, I am around 10 MHz. But I don't use my emulation for speed. I actually run the instruction fetch through a timer interrupt and set the compare value to the instruction time. This gives me a cycle-exact CPU emulation and tons of time left to handle the emulation of other chips, update the LCD, etc.
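
For anyone wanting to check the speed figures, here is a small Python sketch of the arithmetic. It is one reading of the numbers, not the poster's own calculation: it assumes the ~14 average host cycles per emulated instruction and counts each emulated instruction as 2 cycles of a real 6502. The function name `equivalent_mhz` is made up for this example.

```python
def equivalent_mhz(host_mips, host_cycles_per_insn, cycles_6502_per_insn=2):
    """Estimate the clock of a real 6502 that would match the emulator.

    host_mips            -- host instruction rate, in millions per second
    host_cycles_per_insn -- host instruction cycles spent per emulated instruction
    cycles_6502_per_insn -- cycles a real 6502 would need (assumed 2 here)
    """
    emulated_minsn_per_sec = host_mips / host_cycles_per_insn
    return emulated_minsn_per_sec * cycles_6502_per_insn


print(round(equivalent_mhz(40, 14), 1))  # 5.7, i.e. "about 6 MHz"
print(round(equivalent_mhz(70, 14), 1))  # 10.0
```

Under those assumptions the estimate lands on the quoted "about 6 MHz" at 40 MIPS and "around 10 MHz" at 70 MIPS.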