slightly OT: a simple Benchmark
Re: slightly OT: a simple Benchmark
Hmm. The factor of 1.33 could be explained if BeebEm is actually running the Second Processor at 3MHz instead of 4. The fifth line does look odd however. This is the first case that spends significant time in 24-bit mode (mainly because the 16/8 bit division routine doesn't *quite* work with >7-bit divisors).
I'm running it again on the BBC Master's main CPU, which should definitely be at 2MHz.
Re: slightly OT: a simple Benchmark
That's fairly likely - the 3MHz hypothesis - as the original external 'cheese wedge' second processor was 3MHz. The later internally fitted Turbo second processor in the Master was (is) 4MHz.
I'm wondering - should I spin up my Matchbox second processor, which runs at 64MHz? It's an FPGA so maybe not felt to be relevant.
Re: slightly OT: a simple Benchmark
That would be amusing if nothing else. I'm slightly more interested in the 16MHz "Jaguar".
Re: slightly OT: a simple Benchmark
Meanwhile I found a disassembly of the second 68K assembler version of this bench:
I remember that I hacked in this code with the one-line assembler that was part of the monitor. That is the reason for the NOPs here and there - I had to estimate the displacements.
But as the result was so much faster than everything I had tested so far, there was no need to tweak anything. And I remember it was amazing to me to translate an integer BASIC program on the fly into machine code. That was definitely a benefit of these CISC machines.
Code:
D2.W <== MinDiff then GO 3020
on exit: D0.W = HiPrim, D5.W = LoPrim !
--> this version verifies quotient and remainder
003000 7E03 MOVEQ.L #3,D7 ;counter 3..
003002 2200 MOVE.L D0,D1 ;HiPrim -> scr
003004 82C7 DIVU.W D7,D1 ;scr <- remainder/quotient
003006 4841 SWAP.W D1
003008 4A41 TST.W D1 ;remainder == 0 ?
00300A 670E BEQ $00301A
00300C 4841 SWAP.W D1
00300E B247 CMP.W D7,D1 ;quotient < counter
003010 650A BCS $00301C
003012 5487 ADDQ.L #2,D7 ;inc(counter,2)
003014 60EC BT $003002 ;loop
003016 4E71 NOP
003018 4E71 NOP
00301A 7E00 MOVEQ.L #0,D7 ;counter == 0 means noprim
00301C 4E75 RTS
00301E 4E71 NOP
003020 7003 MOVEQ.L #3,D0 ;prep Hiprim
003022 7803 MOVEQ.L #3,D4 ;prep Loprim
003024 61DA BSR $003000 ;Hiprim = prim?
003026 670A BEQ $003032 ;take branch if not
003028 3A04 MOVE.W D4,D5 ;save Loprim
00302A D842 ADD.W D2,D4 ;Loprim+MinDiff...
00302C B840 CMP.W D0,D4 ;... <= Hiprim ? ...
00302E 6306 BLS $003036 ;branch if true !
003030 3800 MOVE.W D0,D4 ;new Loprim
003032 5480 ADDQ.L #2,D0 ;next Hiprim
003034 60EE BT $003024 ;try again
003036 4E75 RTS
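For readers who do not read 68K assembly, the listing above can be paraphrased in Python. This is only a sketch of what the disassembly appears to do, not the original program: the names `is_prime_68k_style` and `find_gap` are made up for illustration, and the 16-bit limits of DIVU.W are ignored.

```python
def is_prime_68k_style(n):
    """Trial division as in the routine at $3000 (sketch, no 16-bit limits).

    Note: like the listing, this starts with divisor 3, so it reports 3
    itself as "not prime" and only makes sense for odd n.
    """
    counter = 3
    while True:
        quotient, remainder = divmod(n, counter)  # DIVU.W D7,D1
        if remainder == 0:
            return False                          # counter == 0 means noprim
        if quotient < counter:
            return True                           # BCS: no divisor can remain
        counter += 2                              # ADDQ.L #2,D7


def find_gap(min_diff):
    """Search loop as in the routine at $3020.

    Returns (hi_prim, lo_prim) for the first pair of consecutive odd
    primes that are at least min_diff apart.
    """
    hi_prim = 3
    lo_prim = 3
    while True:
        if is_prime_68k_style(hi_prim):
            saved_lo = lo_prim                    # MOVE.W D4,D5
            if lo_prim + min_diff <= hi_prim:     # ADD.W / CMP.W / BLS
                return hi_prim, saved_lo
            lo_prim = hi_prim                     # MOVE.W D0,D4
        hi_prim += 2                              # ADDQ.L #2,D0


if __name__ == "__main__":
    for diff in (2, 4, 6, 8):
        print(diff, find_gap(diff))
```

The early exit on `quotient < counter` is what lets the routine avoid both a square root and a multiply: once the quotient drops below the divisor, the divisor has passed the square root of `n`.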
Re: slightly OT: a simple Benchmark
BigEd wrote:
I'm wondering - should I spin up my Matchbox second processor, which runs at 64MHz? It's an FPGA so maybe not felt to be relevant.
Re: slightly OT: a simple Benchmark
Chromatix wrote:
Hmm. The factor of 1.33 could be explained if BeebEm is actually running the Second Processor at 3MHz instead of 4. The fifth line does look odd however. This is the first case that spends significant time in 24-bit mode (mainly because the 16/8 bit division routine doesn't *quite* work with >7-bit divisors).
I'm running it again on the BBC Master's main CPU, which should definitely be at 2MHz.
At first I thought about the transition from 16/8 to 24/16 bit division. Could that be so significant? But then, why are the last number and the corresponding ratio well in line again? The transition should be even more prominent there, I assume.
If you could verify that again: fantastic!
Cheers,
Arne
Re: slightly OT: a simple Benchmark
(Hmm, BBC Micro launched at the end of 1981, the Atari ST 1985... the Beeb wasn't outdated at launch, especially with the 2MHz CPU which made it about the fastest 6502 machine available, IIRC. But you're right to mention the NS32k as that was one of the second processors eventually available for the Beeb, making it a 32 bit scientific workstation.)
Re: slightly OT: a simple Benchmark
Somehow the BBC marketing machine didn't work in Germany. The only British producer that was "present" was Clive Sinclair with his machines (one was Z-80 based, I don't recall the name; the second was the "QL" with its (in)famous tape drive!). Commodore was also prominent, but machines like the CoCo weren't as "present" as they should have been. I am sure I would have bought one (CoCo) if I had known it was available with OS-9 running on it.
Re: slightly OT: a simple Benchmark
The story with the BBC Micro was that the BBC wanted to promote computer literacy in the UK, and needed a standard, inexpensive home computer with a high-quality BASIC to do it - and by "high quality", they meant that none of the ubiquitous M$ derived BASICs would do. As a side-effect, they also wanted a machine that could overlay text and graphics on a broadcast TV signal with minimal added hardware.
Acorn at the time had the Atom, which was a reasonably powerful machine but came with a very simple OS and BASIC. They were however working on a new version, and convinced the BBC that they could do a high-quality BASIC to the BBC's specifications on a 6502, which remained the least expensive CPU with sufficient capabilities on the market. This freed up enough of the parts budget to include a huge amount of expandability, including built-in A/D converters (analogue joystick/paddle interface), sockets for expansion ROMs, a LAN interface (not Ethernet, think more like LocalTalk) and a floppy drive controller.
You might wish to compare the prices of the BBC Micro and the various contemporary Apple machines. The former was always triple digits of pounds sterling; the latter were often quadruple digits of US dollars, despite Woz' genius at minimising circuitry.
With the Atari ST arriving in 1985, it's worth noting that the first ARM CPU was built that year - as a BBC Micro Second Processor, of course, because it was originally an Acorn product! It took a couple more years to reach retail in the Archimedes, but now ARM is *the* most popular CPU in the world - you probably have at least half a dozen of them in your smartphone, and another half-dozen in various inconspicuous parts of your PC. Its original development was funded by the success of the BBC Micro.
But the BBC Master didn't arrive until 1986. Really, it was a stopgap until the Archimedes could be developed properly.
And now the benchmark results:
Code:
A: 0.18
B: 0.31
C: 5.32
D: 16.1
E: 37.05
F: 1898.95
Re: slightly OT: a simple Benchmark
Thank you!
Hmm, now we have a factor of 2.1 between the (old) cycle counts and the given times. Except for line 5, where the result is 2.52.
That must be something else. Either the old count is wrong (typo?) or something else strange is lurking there.
Re: slightly OT: a simple Benchmark
B-Em is giving broadly consistent results with BeebEm, so the most likely answer is that the cycle counts from my emulator are wrong somehow. I'll have to look into that separately.
Re: slightly OT: a simple Benchmark
Then I assume the 4 MHz rating of the second µP (as stated in the table) needs to be corrected to 3 MHz. And could I enter another line with "Asm, ~Pascal V" as a hint as to which algorithm was used?
Re: slightly OT: a simple Benchmark
The algorithm used is closest to the C code I used on my Raspberry Pi - that is, it not only avoids sqrt but also the multiply when updating the divisor limit. Since the 6502 doesn't have hardware multiply, this is a significant win, but it's a valid optimisation even on today's PCs.
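To illustrate the idea (this is not the actual C or 6502 code, which is not shown in this thread), here is a minimal Python sketch of a trial-division test that keeps the divisor limit up to date using additions only. The function name and variable names are made up for illustration. It relies on the identity (d+2)^2 = d^2 + 4d + 4, and on the fact that the increment 4d + 4 itself grows by 8 each time d grows by 2.

```python
def is_prime_no_multiply(n):
    """Trial division by odd numbers; the d*d limit is updated by addition."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    d_squared = 9   # always equals d*d, but is never computed by multiplying
    step = 16       # (d+2)^2 - d^2 = 4*d + 4, which is 16 for d = 3
    while d_squared <= n:
        if n % d == 0:
            return False
        d_squared += step   # now equals (d+2)^2
        step += 8           # the next difference is 8 larger
        d += 2
    return True


if __name__ == "__main__":
    print([p for p in range(2, 60) if is_prime_no_multiply(p)])
```

On a CPU without hardware multiply, such as the 6502, this swaps a software multiply per loop iteration for two additions.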
Re: slightly OT: a simple Benchmark
It is just to have a reasonably workable basis for comparison. Pascal V2 uses "while ( (i*i < x) and (x mod i <> 0))". That seems to me roughly equivalent to your code. Or not?
Re: slightly OT: a simple Benchmark
If we allow for compilers performing automatic strength reduction, then it's probably close enough.
B-Em allows setting the clock speed of the Second Processor explicitly, so now I have real numbers for 4MHz - and for 16MHz and 64MHz, too. It would still be interesting to compare with real 16MHz hardware.
Code:
      4MHz    16MHz    64MHz
A:     .09      .02      .00
B:     .15      .04      .01
C:    2.54      .64      .16
D:    7.70     1.92      .48
E:   17.53     4.38     1.10
F:  856.79   214.20    53.55
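As a quick cross-check, the ratios between the earlier results (presumably the 2MHz main-CPU run) and the 4MHz column can be computed directly. This is just arithmetic on the numbers posted in this thread; the variable names are made up for illustration.

```python
# Times in seconds, copied from the two result listings in this thread.
# Assumption: the earlier listing is the 2MHz BBC Master main-CPU run.
t_2mhz = {"A": 0.18, "B": 0.31, "C": 5.32, "D": 16.1, "E": 37.05, "F": 1898.95}
t_4mhz = {"A": 0.09, "B": 0.15, "C": 2.54, "D": 7.70, "E": 17.53, "F": 856.79}

# Ratio of the earlier time to the 4MHz time for each test.
ratios = {name: t_2mhz[name] / t_4mhz[name] for name in t_2mhz}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}")
```

Rows A and B are too short for the 0.01 s timer resolution to give a meaningful ratio; the longer runs are the ones worth comparing.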