orac wrote:
At 4Mhz an 8085 executes the shortest instruction in one usecond.
At 1Mhz (similar bus speed as 4Mhz 8085) a 6502 executes the
shortest instruction in two useconds.
How fortunate it is that the shortest operations performed in 6502 assembly language also happen to NOT be the most frequently executed. Nearly all 2-byte operations that do not involve auxiliary memory accesses (typically instructions which take an immediate value) ALSO execute in only two microseconds. Those that DO perform auxiliary bus transactions do so with (usually) one microsecond per byte transferred (e.g., an LDA Absolute instruction takes four cycles: three to fetch the opcode and its two address bytes, and one for the byte loaded). Thus, the cycle times for most of the commonly used 6502 instructions end up being SHORTER than their 8085/Z-80 equivalents. The *ONLY* time the Z-80 is faster is when executing the LDIR/LDDR instructions (block move). I don't know how the 65816's MVP/MVN instructions compete with LDIR, but I suspect they're on par, if not faster, due to the 65816's single-clock memory cycle.
All assuming a 1MHz clock, of course.
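To put numbers on the one-cycle-per-bus-access rule described above, here is a minimal sketch. The cycle counts are the standard documented ones for the 6502 (no page-crossing penalty); the table and function names are made up for illustration.

```python
# Documented 6502 cycle counts (no page-crossing penalty).
# CYCLES and exec_time_us are illustrative names, not from any library.
CYCLES = {
    "NOP": 2,       # 1-byte instruction, still two cycles
    "LDA #imm": 2,  # 2 bytes fetched, no extra memory access
    "LDA zp": 3,    # 2 bytes fetched + 1 data read
    "LDA abs": 4,   # 3 bytes fetched + 1 data read
    "STA abs": 4,   # 3 bytes fetched + 1 data write
}


def exec_time_us(cycles, clock_mhz=1.0):
    """Execution time in microseconds for a cycle count at a given clock."""
    return cycles / clock_mhz


if __name__ == "__main__":
    for name, cycles in CYCLES.items():
        print(f"{name:10s} {cycles} cycles = {exec_time_us(cycles):.1f} us at 1MHz")
```

Passing a different clock_mhz scales the times accordingly, e.g. exec_time_us(4, 4.0) for LDA Absolute at 4MHz.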
This is why an 8MHz 65816 is actually quite competitive with an 8MHz 68000, and utterly decimates the 8085/8086 at those speeds. It takes at least an 80286 before you start coming into the ballpark.
According to my programmer's book on the 65816, the January 1983 issue of BYTE magazine had a set of benchmarks posted for the 65816 in comparison to other processors: a 4MHz 65816 completed the sieve in 1.56 seconds. A 5MHz 8088 took 4.0 seconds (!!), while an 8MHz 8086 took 1.90 seconds.
That being said, note how the 65816 soundly kicks the 808x series' butts. According to the article, the 65816 version ran twice as fast as the equivalent 6502 program. Thus, the 6502 at 4MHz (having taken roughly 3 seconds to complete) clearly defeated the 5MHz 8088 (having taken 4 seconds), and an 8088 at 4MHz would have taken longer still. An 8-bit CPU defeated a 16-bit CPU! How do you explain this?
Looking at it a different way:
Code:
Estimated coefficient of "work" involved in completing a task:
68000  8MHz * 0.49 seconds =  3.92
65816  4MHz * 1.53 seconds =  6.12
6502   4MHz * 3.06 seconds = 12.24
8086   8MHz * 1.90 seconds = 15.20
8088   5MHz * 4.00 seconds = 20.00
Normalized performances:
68000  8MHz = 0.49 seconds
65816  8MHz = 0.77 seconds
6502   8MHz = 1.53 seconds
8086   8MHz = 1.90 seconds
8088   8MHz = 2.50 seconds
Looking at the numbers, it's clear that, clock for clock, even the 6502 is better than the 8086: the 4MHz 6502 needed fewer clock cycles to finish the task than the 8MHz 8086 did.
The only processor it couldn't match was an 8MHz 68000, which completed its task in only 0.49 seconds. That being said, note that a relatively choked 8MHz 65816 would come awfully close to that 68000, completing the task in only about 0.77 seconds.
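The arithmetic behind the two tables can be recomputed with a short sketch. It uses the clock rates and times from the "work" table above and assumes run time scales inversely with clock speed (no added wait states at the higher clock), which is a simplification. The names RESULTS, work and normalized_seconds are made up for illustration.

```python
# (clock in MHz, sieve time in seconds), taken from the "work" table above.
RESULTS = {
    "68000": (8, 0.49),
    "65816": (4, 1.53),
    "6502": (4, 3.06),
    "8086": (8, 1.90),
    "8088": (5, 4.00),
}


def work(clock_mhz, seconds):
    """Millions of clock cycles spent on the task (MHz * seconds)."""
    return clock_mhz * seconds


def normalized_seconds(clock_mhz, seconds, target_mhz=8):
    """Projected run time at target_mhz.

    Assumption: time scales inversely with clock rate, i.e. memory keeps up.
    """
    return work(clock_mhz, seconds) / target_mhz


if __name__ == "__main__":
    for cpu, (mhz, secs) in RESULTS.items():
        print(f"{cpu:6s} work = {work(mhz, secs):6.2f}   "
              f"at 8MHz = {normalized_seconds(mhz, secs):.3f} s")
```

A lower "work" figure means fewer clock cycles to finish the same job, which is the per-clock comparison being made here.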
Now before you argue that the 65816 is a totally different processor, let me remind you it's just a 16-bit extended 6502, complete with an 8-bit data bus. If the 65816 weren't so choked by its data bus, and all 16-bit quantities stored in memory sat at even addresses, it would equal or exceed the 68000 for most tasks.
The clocks-per-memory-cycle ratio is critical, and the closer to unity it is (and you really can't get much better than the 6502 unless you go full-bore RISC), the better. That, along with its pipelining and reduced gate counts, is why the 6502 is fast.