fachat wrote:
They also say that the asynchronous version of the 6502 used too many CLBs in the FPGA, so they could not "add translation drivers from conventional to flank-triggered dual-rail logic, and therefore no benchmark between synchronous and asynchronous variant is available" (my own crude translation).
Hogwash. Write a program. Load it onto a synchronous 6502 design. Observe the rate at which _SYNC is asserted per unit of wall-clock time. That is your CPU's instruction rate.
Do the same with the asynchronous implementation (I'm sure that, in the absence of _SYNC, they had some alternative signal they could have monitored), for the same amount of wall-time.
Divide one by the other to get a ratio of performance differences.
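The arithmetic at the end is trivial, but to be explicit (all numbers below are invented for illustration; only the method is from the paragraphs above):

```python
# Hypothetical sketch of the proposed benchmark: count SYNC (opcode-fetch)
# assertions on each design over the same wall-clock window, then take the
# ratio. The pulse counts here are made up; only the method is real.

def instruction_rate(sync_count, seconds):
    """Instructions retired per second, inferred from opcode-fetch pulses."""
    return sync_count / seconds

sync_clocked = 2_000_000   # pulses counted on the synchronous core (assumed)
sync_async   = 3_100_000   # pulses on the async core's equivalent signal (assumed)
window = 1.0               # identical wall-clock window, in seconds

ratio = instruction_rate(sync_async, window) / instruction_rate(sync_clocked, window)
print(f"async/sync performance ratio: {ratio:.2f}")  # 1.55
```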
HOWEVER, . . .
Quote:
Also they write that "For reasons of [FPGA] space the much more interesting Z80 was out of the question".
Who would want to make an asynchronous Z80? If you're going to go through that kind of effort, you might as well go with an 8086, which is not too far from the Z80 capability-wise, and which I claim is a more powerful processor anyway. (OTOH, the Z80 has the advantage of the EX AF,AF' and EXX register-exchange instructions for super-fast interrupt response.)
Quote:
They say they have combined the small scale of the "level-controlled dual-rail" and the speed of the "flank-triggered dual-rail". Whatever that means in asynchronous logic.
My interpretation of this is that they combined the small size permitted by a purely combinatorial circuit (which, by definition, is level-triggered; I haven't seen an edge-triggered NAND gate recently) with the high speeds permitted by dual-edge-triggered (rising and falling) circuits. This is possible because you no longer have to worry about clock skew and clock-distribution trees throughout the whole circuit.
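To make my reading concrete (this is purely my own toy model, not anything from the article), here's why capturing on both clock edges doubles throughput at the same clock frequency:

```python
# Toy model: count how many times a register would capture data over the same
# clock waveform, once for a rising-edge-only design and once for a dual-edge
# design. Twice the capture events per period means twice the throughput.

def single_edge_samples(clock):
    """Captures on rising edges only (conventional single-edge flip-flop)."""
    return sum(1 for prev, cur in zip(clock, clock[1:]) if (prev, cur) == (0, 1))

def dual_edge_samples(clock):
    """Captures on every transition, rising AND falling (dual-edge flip-flop)."""
    return sum(1 for prev, cur in zip(clock, clock[1:]) if prev != cur)

clock = [0, 1, 0, 1, 0, 1, 0, 1, 0]  # four full clock periods
print(single_edge_samples(clock))    # 4
print(dual_edge_samples(clock))      # 8
```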
Quote:
"But quickly the main problem of asynchronous CPUs showed itself: the execution steps are much too short, and there are far too many accesses to external components (mostly RAM). Even though the "DEP-Format" [obviously their technique] provided a technical way to increase RAM access speed (at the expense of memory efficiency [Speicherausnutzung]), a CPU of the size of a 6502, in asynchronous logic in a standard environment, can barely achieve any speed increase. With current processors, which have considerably more complex internal logic and longer computations between memory accesses, the situation is surely different."
The problem I have with this conclusion is that it's wrong, on several levels.
(1) An asynchronous CPU, by definition, cannot run in "a standard environment," for the latter's performance is always dictated by the clock. You need an asynchronous means of accessing RAM, for starters (a quadrature interlocking protocol works great for this, and I claim is much simpler than the 68000-style AS/DTACK approach). Then, knowing that long lines to outside resources will incur a major performance hit, you MUST cache. In fact, even with a synchronous design, caching starts to look awfully appealing at CPU clock speeds in excess of 8MHz (which requires memory capable of keeping up with a 16MHz clock, since you're hitting it only during half a clock period). This is why most high-speed 65xx designs do NOT expose their buses through expansion ports.
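As an aside on the quadrature/interlocking idea: a common asynchronous form of it is the four-phase (return-to-zero) request/acknowledge handshake. Below is a purely hypothetical software model of such a RAM read; the signal names and structure are mine, not anything from the article:

```python
# Minimal software model of a four-phase (return-to-zero) REQ/ACK handshake,
# the kind of interlocking protocol an asynchronous CPU could use to read RAM
# without a shared clock. Signal names (req, ack) are generic placeholders.

def respond(memory, bus):
    """Responder (RAM) side: mirror REQ on ACK, driving data while REQ is high."""
    if bus["req"] and not bus["ack"]:
        bus["data"] = memory[bus["addr"]]
        bus["ack"] = 1
    elif not bus["req"] and bus["ack"]:
        bus["ack"] = 0

def four_phase_read(memory, addr, bus):
    """Requester side: raise REQ, wait for ACK, sample data, return to zero."""
    bus["addr"] = addr
    bus["req"] = 1            # phase 1: request asserted
    while not bus["ack"]:     # phase 2: responder asserts ACK with valid data
        respond(memory, bus)  # (in hardware this "wait" is just wire delay)
    data = bus["data"]
    bus["req"] = 0            # phase 3: request released
    while bus["ack"]:         # phase 4: responder releases ACK; cycle complete
        respond(memory, bus)
    return data

mem = {0x1000: 0xEA}
bus = {"req": 0, "ack": 0, "addr": 0, "data": 0}
print(hex(four_phase_read(mem, 0x1000, bus)))  # 0xea
```

Note that each transfer ends with both wires back at zero, so neither side ever advances until the other has explicitly caught up; that interlock is what replaces the clock.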
(2) The conclusion exposes the author's complete misunderstanding of the purpose for using asynchronous logic. Even if you use an asynchronous CPU in an otherwise synchronous environment, you get the benefit of a chip which draws significantly less power (if implemented correctly; I'm not sure this would work in an FPGA environment), is much quieter in terms of RF hash, and which produces substantially less heat. These all can be extremely valuable qualities. For example, on an Intellasys SEAforth 24A chip that I have, I had all 24 cores running full blast at an estimated 650 MIPS each, and that chip only became two degrees warmer. ONLY two degrees! And, if my ham rig is any indication, it had no observable hash emissions; everything it put out was well below nature's own noise floor. Contrast this against my desktop PC, whose emissions are so strong I can hear it in my headphones when the sound card is silent.
(3) The conclusion leads the reader to believe that asynchronous logic is a worthless pursuit. As you've translated it, and assuming I had no prior experience with asynchronous designs, I would never want to consider asynchronous logic again after reading it. A conclusion should recite the constraints under which it was reached, which is clearly not done here, with the exception of disclosing how they're interfacing to RAM, which appears to be synchronous based on the context provided in your translation.
My conclusions could well be wrong, but since I don't read German, there's no way I can really respond from reading the primary source.