BigEd wrote:
[...] it's worth replicating the dispatch logic. To have every opcode snippet jump back, only in order to jump forward again, is unnecessary.
I'll elaborate if I may, Ed. The jump back to a single copy of the dispatcher is not only unnecessary; it also sacrifices an advantage that's enjoyed in the contrasting situation -- ie, when each opcode snippet concludes by falling through to its own private copy of the dispatcher. This was an eye-opener for me.
Quote:
The nice thing about handler chaining is that it has a beneficial side-effect! Not only does it eliminate a jump back to the top of a loop, by spreading out the indirect jumps from one central point and into each of the handlers the host CPU how has dozens if not hundreds of places that is it dispatch from. You might say to yourself this is bad, I mean, this bloats the size of the interpreter's code and puts an extra strain on the host CPU's branch predictor, no?
Yes! But, here is the catch. Machine language opcodes tend to follow patterns. Stack pushes are usually followed by a call instruction. Pops are usually followed by a return instruction. A memory load instruction is usually followed by a memory store instruction. A compare is followed by a conditional jump (usually a Jump If Zero). Especially with compiled code, you will see patterns of instructions repeating over and over again. That means that if you are executing the handler for the compare instruction, chances are very good that they next guest instruction is a conditional jump. Patterns like this will no doubt make up a huge portion of the guest code being interpreted, and so what happens is that the host CPU's branch predictor will start to correctly predict the jump targets from one handler to another.
_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html