A MUST-READ for anyone attempting 'mulation

Posted: Wed Dec 04, 2019 8:56 pm
by enso
Darek Mihocka, who is responsible for Atari emulators and much more, has written a lot of stuff you absolutely need to read, but most importantly:

http://emulators.com/docs/nx25_nostradamus.htm

The article works through branch misprediction issues created by opcode interpreters, which jump around willy-nilly and cause enormous penalties in the inner loop. There are ways to minimize the issue, making 10x improvements in emulation speed possible. After reading the article, you will probably enjoy hopping around his website mining for bits of wisdom and nostalgia.
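
For readers who have not written one, the "obvious" interpreter the article starts from looks roughly like the sketch below. The four-opcode instruction set and the name run_switch are invented here for illustration and are not taken from the article. The point is that every guest instruction passes through the same switch, which compilers commonly (though not necessarily) turn into a single indirect jump that the host's branch predictor has to guess.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum { OP_HALT, OP_LOAD, OP_DEC, OP_JNZ };

/* Runs `code` until OP_HALT (or an unknown opcode), stores the final
   accumulator in *acc_out, and returns the number of guest
   instructions executed, including the halt. */
long run_switch(const uint8_t *code, int *acc_out)
{
    size_t pc = 0;
    int acc = 0;
    long executed = 0;

    for (;;) {
        uint8_t op = code[pc++];
        executed++;
        switch (op) {               /* the one shared dispatch point */
        case OP_LOAD:               /* acc = immediate operand */
            acc = code[pc++];
            break;
        case OP_DEC:
            acc--;
            break;
        case OP_JNZ: {              /* jump to operand if acc != 0 */
            uint8_t target = code[pc++];
            if (acc != 0)
                pc = target;
            break;
        }
        case OP_HALT:
        default:
            *acc_out = acc;
            return executed;
        }
    }
}
```

Given the program { OP_LOAD, 3, OP_DEC, OP_JNZ, 2, OP_HALT }, this counts the accumulator down from 3 to 0 and reports 8 guest instructions, each of which went through the same switch.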

Re: A MUST-READ for anyone attempting emulation

Posted: Wed Dec 04, 2019 10:50 pm
by BigDumbDinosaur
enso wrote:
Darek Mihocka, who is responsible for Atari emulators and much more...
It appears from reading the reference article that he is talking about simulation, not emulation. The latter is generally done in hardware that can be configured to behave like the original system. Although some may think this is hair-splitting, there is a fundamental difference.

Re: A MUST-READ for anyone attempting emulation

Posted: Wed Dec 04, 2019 11:52 pm
by enso
You are indeed correct. I see no way to edit the topic name to make it right, however.
[EDIT] Topic name edited.
[EDIT] After much flip-flopping, I feel defeated. Far be it from me to tell an author of software that they used a wrong term to describe it.

Re: A MUST-READ for anyone attempting emulation

Posted: Thu Dec 05, 2019 8:53 am
by BigEd
Thanks for the link, enso, to what seems to be a long series of interesting articles. I haven't yet started to read it.

(About the nitpick: language changes according to usage. Edit: see Garth below.)

Re: A MUST-READ for anyone attempting simulation

Posted: Thu Dec 05, 2019 9:29 am
by GARTHWILSON
Please see our topic "Terminology: Simulator vs. Emulator." I'll leave it at that.

Re: A MUST-READ for anyone attempting simulation

Posted: Fri Dec 06, 2019 1:30 am
by enso
I do prefer the old-school engineering terminology, although I've never heard of anyone admitting to developing a 'simulator' for a CPU or a game in the past 20 years or so...

Re: A MUST-READ for anyone attempting simulation

Posted: Fri Dec 06, 2019 8:40 am
by BigEd
I'm happy to call visual6502 a simulator... and anyone designing in HDL is likely to run a simulation, but that's a slightly different thing. (HDLs can be regarded as languages which describe a system's behaviour in a form that's amenable to, and intended for, simulation. As a side-effect they are languages for which synthesisers can be built, which infer a hardware design that should implement the behaviour described. Most often we ignore all that and think of the HDL as describing the hardware we want to have, and sometimes we even ignore simulation and go straight to synthesis and implementation...)

As I call visual6502 a simulator, I'll also call perfect6502 a simulator: it's a C model which, like the JavaScript original, simulates the transistor-level behaviour.

(Oh, and I did once run a circuit-level simulation using SPICE - that's a simulator too!)

Re: A MUST-READ for anyone attempting simulation

Posted: Fri Dec 06, 2019 6:53 pm
by hmn
The article claims that computed goto does not work as expected in GCC; this article from 2012 claims otherwise.

Also I am wondering how you would implement stuff like pausing and single stepping without a dispatch loop, which to me would be much more important features for an *mulator than execution speed.
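
One possible answer, sketched under assumptions: keep all guest state in a struct and fold an instruction budget into the dispatch that each handler performs, so that pausing is "budget ran out" and single-stepping is "budget of 1". The Vm struct, vm_run, and the toy opcodes below are hypothetical names made up for this sketch, and the code relies on the GCC/Clang "labels as values" extension (computed goto), which is not standard C. The extra check costs a well-predicted branch per guest instruction; whether that is acceptable depends on the emulator.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum { OP_HALT, OP_LOAD, OP_DEC, OP_JNZ, OP_COUNT };

/* All guest state lives here, so a paused run can be inspected or resumed. */
typedef struct {
    const uint8_t *code;
    size_t pc;
    int acc;
    int halted;
} Vm;

/* Executes at most `budget` guest instructions and returns how many ran.
   Assumes every opcode byte is a valid index below OP_COUNT. */
long vm_run(Vm *vm, long budget)
{
    static void *const handlers[OP_COUNT] = {
        [OP_HALT] = &&do_halt, [OP_LOAD] = &&do_load,
        [OP_DEC]  = &&do_dec,  [OP_JNZ]  = &&do_jnz,
    };
    long left = budget;

/* Each handler ends with its own copy: budget check, then indirect jump. */
#define DISPATCH()                                  \
    do {                                            \
        if (vm->halted || left <= 0) goto pause;    \
        left--;                                     \
        goto *handlers[vm->code[vm->pc++]];         \
    } while (0)

    DISPATCH();

do_load:
    vm->acc = vm->code[vm->pc++];
    DISPATCH();
do_dec:
    vm->acc--;
    DISPATCH();
do_jnz: {
        uint8_t target = vm->code[vm->pc++];
        if (vm->acc != 0)
            vm->pc = target;
    }
    DISPATCH();
do_halt:
    vm->halted = 1;
pause:
#undef DISPATCH
    return budget - left;
}
```

A debugger front end could then call vm_run(&vm, 1) for a single step and look at vm.pc and vm.acc in between, while a "run" command would pass a large budget and come back to the UI whenever it expires.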

Re: A MUST-READ for anyone attempting simulation

Posted: Fri Dec 06, 2019 7:55 pm
by BigEd
Depends - in the case of the article, it's emulation for the purpose of running applications, not for the purpose of debugging software (or a system.) It's similar for the likes of MAME - it needs to run at the full speed of the target system, and that usually means being very cycle-efficient, because the host system might be underpowered. (And the target system might be really hard to emulate - multiple independent chips and cycle-accurate emulation can be a tough one.)

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 7:52 am
by DerTrueForce
I have to say I don't understand much of it (if any), especially this "nostradamus distributor", which looks like a completely incomprehensible wall of code to me.

I get that he's saying that his construct is faster than the obvious switch-based interpreter, but he lost me around the point where he started talking about multiple dispatch points.

Speaking of which, he's based on x86 (understandably), and makes brief mention of getting similar results on PowerPC, but he makes no mention of ARM. If he wrote that article in 2008 (as suggested by the copyright date), that seems to be a pretty big omission. I'd assume it'd work similarly on one of the big ARMs, such as you'd find in a smartphone or a Raspberry Pi, but I genuinely don't know about the low-end ones you find in microcontrollers.

On a different note, his PREDECODE idea seems to be in a similar vein to what I've seen some people here say, about informing the compiler/CPU which side of a branch is more likely to be taken.

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 1:04 pm
by BigEd
I think there are a couple of things going on: the article series is aiming to emulate x86 (a complex instruction set) on modern x86 (a variety of performance-enhancing machinery in place.)

It turns out, contrary to what I thought, it's not a long series about emulation, it's a long series of newsletters with occasional bits about emulation. Possibly it needs an emulation-specific index.

In the case of emulating CISC, there is the idea of emulating micro-ops. And in the case of caching predigested snippets for emulation, there is an idea of doing that with micro-ops instead of opcodes. I think.

Even without varied and sophisticated branch prediction - one of the main points of the linked article - it's worth replicating the dispatch logic. To have every opcode snippet jump back, only in order to jump forward again, is unnecessary. It's easier to sort this with macros in assembly language, perhaps, than in a high level language. Perhaps preprocessor macros in C help. Well, they do, but it can look pretty odd:
https://www.piumarta.com/software/lib65 ... /lib6502.c
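
To make the macro idea concrete, here is a minimal sketch in standard C of what "every snippet carries its own dispatch" can look like. It is not taken from lib6502.c; the NEXT macro, run_chained, and the four toy opcodes are invented for illustration. Each handler ends by pasting in its own switch, so control goes from one handler directly to the next instead of returning to a shared loop top.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum { OP_HALT, OP_LOAD, OP_DEC, OP_JNZ };

/* Pasted at the end of every handler: count, fetch the next opcode,
   and jump straight to its handler. Unknown opcodes are treated as halt. */
#define NEXT()                          \
    executed++;                         \
    switch (code[pc++]) {               \
    case OP_LOAD: goto do_load;         \
    case OP_DEC:  goto do_dec;          \
    case OP_JNZ:  goto do_jnz;          \
    default:      goto do_halt;         \
    }

/* Same contract as a plain switch loop: returns the number of guest
   instructions executed and stores the accumulator in *acc_out. */
long run_chained(const uint8_t *code, int *acc_out)
{
    size_t pc = 0;
    int acc = 0;
    long executed = 0;

    NEXT()

do_load:
    acc = code[pc++];
    NEXT()
do_dec:
    acc--;
    NEXT()
do_jnz: {
        uint8_t target = code[pc++];
        if (acc != 0)
            pc = target;
    }
    NEXT()
do_halt:
    *acc_out = acc;
    return executed;
}
```

One caveat, as far as I understand it: an optimizing compiler is free to merge the identical switch copies back into a single dispatch, so the generated assembly needs checking. That is presumably part of why interpreters in C often reach for the GCC computed-goto extension instead, and why this is easier to control in assembly language.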

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 2:37 pm
by Dr Jefyll
BigEd wrote:
[...] it's worth replicating the dispatch logic. To have every opcode snippet jump back, only in order to jump forward again, is unnecessary.
I'll elaborate if I may, Ed. The jump back to a single copy of the dispatcher is not only unnecessary; it also sacrifices an advantage that's enjoyed in the contrasting situation -- i.e., when each opcode snippet concludes by falling through to its own private copy of the dispatcher. This was an eye-opener for me. :shock:
Quote:
The nice thing about handler chaining is that it has a beneficial side-effect! Not only does it eliminate a jump back to the top of a loop, by spreading out the indirect jumps from one central point and into each of the handlers the host CPU now has dozens if not hundreds of places that it dispatches from. You might say to yourself this is bad, I mean, this bloats the size of the interpreter's code and puts an extra strain on the host CPU's branch predictor, no?

Yes! But, here is the catch. Machine language opcodes tend to follow patterns. Stack pushes are usually followed by a call instruction. Pops are usually followed by a return instruction. A memory load instruction is usually followed by a memory store instruction. A compare is followed by a conditional jump (usually a Jump If Zero). Especially with compiled code, you will see patterns of instructions repeating over and over again. That means that if you are executing the handler for the compare instruction, chances are very good that the next guest instruction is a conditional jump. Patterns like this will no doubt make up a huge portion of the guest code being interpreted, and so what happens is that the host CPU's branch predictor will start to correctly predict the jump targets from one handler to another.

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 3:05 pm
by BigEd
Good point - it's an even better tactic when there's branch prediction in the air.

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 6:03 pm
by commodorejohn
BigEd wrote:
Good point - it's an even better tactic when there's branch prediction in the air.
I was a little confused by this, but I think it's mostly because I'm not up on the details of modern branch prediction. Is the idea that, with multiple copies of the dispatch routine out there, a sufficiently advanced predictor would tend to remember that copy A tends to go to handler Z more often than handler Q, or something along those lines? Or was he talking about hand-tuning the copies of the dispatch routines to favor common routes?

Re: A MUST-READ for anyone attempting 'mulation

Posted: Sat Dec 07, 2019 6:23 pm
by BigEd
It's the first: with (say) 256 copies of the dispatcher, one for each opcode, each one (potentially) has its own branch history which captures where it's likely to go next. So, perhaps, the dispatcher following DEX might predict a jump to BNE.
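
A toy model may help show the effect. It assumes a deliberately crude predictor in which each indirect-jump site simply predicts the target it jumped to last time; real predictors are considerably more sophisticated, so treat the numbers as an illustration only. The function names and the four-opcode guest trace are made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical guest opcodes appearing in the trace. */
enum { LDA, STA, DEX, BNE, NUM_OPS };

/* One shared dispatch site: it predicts "same handler as last time". */
long mispredicts_central(const int *trace, size_t n)
{
    int last = -1;                  /* -1 means no history yet */
    long miss = 0;
    for (size_t i = 0; i < n; i++) {
        if (trace[i] != last)
            miss++;
        last = trace[i];
    }
    return miss;
}

/* One dispatch site per handler: the site in use is selected by the
   opcode that has just finished, and each site keeps its own history. */
long mispredicts_per_handler(const int *trace, size_t n)
{
    int last[NUM_OPS + 1];          /* extra slot for the initial dispatch */
    long miss = 0;
    int prev = NUM_OPS;             /* nothing has executed yet */

    for (int i = 0; i <= NUM_OPS; i++)
        last[i] = -1;
    for (size_t i = 0; i < n; i++) {
        if (trace[i] != last[prev])
            miss++;
        last[prev] = trace[i];      /* this site learns its new target */
        prev = trace[i];
    }
    return miss;
}
```

For a guest loop that repeats LDA, STA, DEX, BNE 100 times, this model gives 400 misses for the shared site (the next handler is never the same as the previous one) and 5 for the per-handler sites (one warm-up miss per site, after which the copy following DEX always predicts BNE correctly).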