Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 4:13 am
by barrym95838
Code:
APPLE II Rom-Basic BasBenc1 44,8 71,8 ____,_ ____,_ ____,_ ____,_
I think you might be quoting the Apple II+ times here. I got A=22.5, B=38.5, C=526 for a Woz BASIC Apple ][, using a brutally simple-minded translation of the VTL-2 version:
Code:
100 INPUT "RANGE, MIN.DIFF ",A,B
120 Z=3
130 C=1
140 C=C+2
150 D=1
160 D=D+2
170 IF C MOD D=0 THEN 210
180 IF C>=D*D THEN 160
190 IF C>=B+Z THEN 230
200 Z=C
210 IF A>=C THEN 140
220 PRINT "NO SOLUTION": GOTO 260
230 PRINT C;", ";Z
260 PRINT : GOTO 100: END
If you don't use any additional features of Integer BASIC (like string handling, those pesky FOR/NEXT loops and those fancy signed integers) the times for Integer BASIC are consistently 18% slower than VTL02.
Mike B.
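For readers without Integer BASIC to hand, the listing above can be sketched in Python roughly as follows. This is only an illustrative translation: the function name `first_gap` and the return convention (a `(lower, upper)` tuple, or `None` for "NO SOLUTION") are assumptions of this sketch, not part of the original program, which prints the upper prime first.

```python
def first_gap(limit, min_diff):
    """Trial division by odd numbers only, mirroring the BASIC listing."""
    last_prime = 3              # Z in the listing
    c = 1                       # C: current odd candidate
    while True:
        c += 2                  # line 140
        d = 1                   # line 150
        is_prime = True
        while True:
            d += 2              # line 160
            if c % d == 0:      # line 170: divisor found (this also rejects c = 3,
                is_prime = False  # which is why Z starts out at 3)
                break
            if c < d * d:       # line 180 falls through: no divisor up to sqrt(c)
                break
        if is_prime:
            if c >= min_diff + last_prime:   # line 190: gap is large enough
                return last_prime, c
            last_prime = c                   # line 200
        if limit < c:                        # line 210 falls through
            return None                      # "NO SOLUTION"


print(first_gap(1000, 20))
```

Like the BASIC version, the sketch never tests even divisors, so it relies on the candidates themselves always being odd.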
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 7:44 am
by GaBuZoMeu
Isn't that funny? Running your VTL02 re-translation on EhBasic doesn't work. INPUT "blabla",A isn't accepted. And, oddly, line 170 gives no error but doesn't work. EhBasic sadly doesn't have a modulo operator, or at least I didn't find one. Using IF INT(C/D)*D=C THEN 210 works, but then it took roughly 18 s for the first range (on the Badge). I have no idea why this is so slow, or why using FOR..NEXT and SQR() runs that much faster.
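The workaround above relies on the fact that D divides C exactly when INT(C/D)*D equals C. A small Python check of that identity, using floating-point division and floor() to imitate a floating-point BASIC, might look like this (the helper name `divides_without_mod` is made up for this sketch):

```python
import math


def divides_without_mod(c, d):
    """True when d divides c, using only division, INT() and multiplication."""
    return math.floor(c / d) * d == c


# Compare against the real modulo operator over a small range of positive integers.
mismatches = [(c, d)
              for c in range(1, 2000)
              for d in range(1, 60)
              if divides_without_mod(c, d) != (c % d == 0)]
print(len(mismatches))
```

Each test costs one floating-point division, one INT() and one multiplication per trial divisor, where a native MOD would need a single operation.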
Erm ... aren't you adding an extra iteration about 50% of the time with that +1? I can't think of any BASICs that re-calculate the limit on every iteration anyhow ...
Mike B.
I skipped that +1 but it doesn't change much - in fact with a wrist stopwatch I get the same numbers +/- 0.1 s - but still slightly slower than using SQR().
I think you might be quoting the Apple II+ times here. I got A=22.5, B=38.5, C=526 for a Woz BASIC Apple ][, using a brutally simple-minded translation of the VTL-2 version
Mike B.
I only remember that it was a traditional Apple II (with that case with a lid that you can easily open to access the slots). I turned it on, hacked the program in, took my times, and left. (In those days computer shop owners didn't like computer "kids" playing on their machines.)
If you use the "BasBenc1" version can you verify my (old) times?
Arne
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 7:47 am
by GaBuZoMeu
(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)
You may modify the input: skip it and work with constants. Then run it a thousand times or so. That should add up to something detectable.
Good point.
Funnily enough, I also had to modify the long version to use long long because of the ambiguity with C type sizes (long and int are both 32-bit under GCC for VAX, apparently). Here are the results (problems A-E run 1000x apiece, problem F run 100x):
Code:
        A       B       C        D        E         F
int     2.48    4.17    53.61    139.26   262.22    7653.10
long*   13.24   22.61   312.63   826.16   1570.51   46712.30
* (long = long long)
MicroVAX 3100/90, NetBSD 6.1.5 - all times in milliseconds per run.
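The "run it a thousand times and divide" approach suggested above can be sketched in Python like this. It is a minimal illustration, not the C code that produced the table: `first_gap` is a compact stand-in workload and `time_per_run_ms` is a hypothetical helper name.

```python
import time


def first_gap(limit, min_diff):
    """Stand-in workload: first pair of consecutive primes at least min_diff apart."""
    last = 3
    for c in range(5, limit + 1, 2):
        d = 3
        while d * d <= c and c % d:
            d += 2
        if d * d > c:                 # no odd divisor found, so c is prime
            if c - last >= min_diff:
                return last, c
            last = c
    return None


def time_per_run_ms(func, args, repeats=1000):
    """Run func(*args) repeatedly and return (last result, milliseconds per run)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed * 1000.0 / repeats


result, ms = time_per_run_ms(first_gap, (1000, 20))
print(result, f"{ms:.4f} ms per run")
```

Averaging over many runs sidesteps a coarse clock, at the price of also averaging in any loop overhead.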
COOL !!
Thanks very much for these numbers!
That old lady is pretty nimble, isn't she?
Arne
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 8:09 am
by commodorejohn
Heh.
I was actually quite surprised by how responsive it was when I finally got it all properly set up recently. I tried OpenBSD on it once when I got it ages ago, and that was an absolute dog; I can only assume that their focus on security and encryption overtaxed the poor thing.
But it's quite usable on NetBSD, even over a 9600 bps console port (Telnetting into it is a vast improvement, though!).
I'd love to try it on an original 780, but I'm afraid I don't have one of those to hand.
(The 3100/90 weighs in at about 24 VUPS, though, so I guess we can make a rough estimate from that...)
Edit: oh hahahaha I just remembered the Living Computer Museum has a pile of interesting mainframes and minis available to the general public...
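The rough estimate hinted at above could be worked out like this, under the strong assumption that run time scales inversely with the VUPS rating (the VAX-11/780 being the 1 VUPS reference machine). The values in `measured_ms` are the `int` row of the table earlier in the thread; the scaled figures are only estimates, not measurements, and `estimate_780_ms` is a made-up helper name.

```python
VUPS_3100_90 = 24.0   # "about 24 VUPS", as quoted in the post above

# int row from the MicroVAX 3100/90 table, in milliseconds per run
measured_ms = {"A": 2.48, "B": 4.17, "C": 53.61,
               "D": 139.26, "E": 262.22, "F": 7653.10}


def estimate_780_ms(ms_on_3100, vups=VUPS_3100_90):
    """Scale a 3100/90 time to a 1 VUPS machine, assuming linear scaling."""
    return ms_on_3100 * vups


for case, ms in measured_ms.items():
    print(f"{case}: {estimate_780_ms(ms):10.1f} ms (estimated)")
```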
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 2:51 pm
by Chromatix
Still debugging my 65c02 assembly version of the benchmark. Cases A-E actually run correctly already (within my own emulator), but case F produces the wrong result - of course, that's the only one that hits the 24-bit path, which is longer and more complex. Luckily my emulator is already set up to produce detailed traces, which I can pore through to figure this stuff out.
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 3:34 pm
by John West
Well, for all of you who are going to squeeze out every cycle and every bit of their machines: try to find the first gap that spans across 222. The numbers end with xxx969 and xxx747.
I cheated, and used a completely different algorithm. 3 seconds for my work PC. It finds the first gap of 320 in 31 seconds (...2549 to ...2869)
Can you find the first gap of 382 (...4659 to ...5041)?
This will be a good test for my 65020. I'll have a go at implementing both algorithms tonight. The simulator gives an approximate cycle count, so we can do hand-wavy comparisons.
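The post does not say which algorithm was used. One common alternative to trial division is a sieve of Eratosthenes, sketched here in Python purely as an illustration (the function name `first_gap_sieve` is an assumption of this sketch):

```python
def first_gap_sieve(limit, min_diff):
    """Return (p, q) for the first consecutive primes with q - p >= min_diff, or None."""
    if limit < 3:
        return None
    sieve = bytearray([1]) * (limit + 1)      # sieve[n] == 1 means "n may be prime"
    sieve[0] = sieve[1] = 0
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # strike out every multiple of p starting at p*p
            count = len(range(p * p, limit + 1, p))
            sieve[p * p::p] = bytearray(count)
    last = 2
    for n in range(3, limit + 1):
        if sieve[n]:
            if n - last >= min_diff:
                return last, n
            last = n
    return None


print(first_gap_sieve(2000, 30))
```

A flat sieve like this needs one byte per number up to the limit, so the much larger gaps mentioned in this post would call for a segmented variant or a different approach altogether.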
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 3:48 pm
by BillO
A few more data points for you GaBuZoMeu. All use the BasBenc1 code:
Code:
Tandy Coco 3         - 6809 @ 0.89 MHz   - Disk Extended Basic V1.1 - A=60.6, B=97.9, C=1090.1
Tandy Coco 3         - 6809 @ 1.78 MHz   - Disk Extended Basic V1.1 - A=30.4, B=49.0, C=545.1
OSI 600 (Superboard) - 6502 @ 0.9825 MHz - OSI Basic V1.0 (MS)      - A=25.3, B=52.1, C=642.4
Jaguar (homebrew)    - W65C02 @ 16 MHz   - EhBasic V2.22            - A=1.46, B=2.42, C=31.40, D=82.29, E=156.06
Looks like my Jaguar project is the fastest FP BASIC machine. Faster than some compiled C and Pascal implementations too!
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 4:09 pm
by GaBuZoMeu
THX for all your numbers. I will soon copy them into the table.
@BillO: nice little beast your Jaguar
@Chromatix: I never had enough drive to write an assembler version. In the beginning I simply didn't have an assembler, and later I seldom used a cross assembler for the 6502, as there were so many more powerful machines and I only had my old SYM, but no fancy 4+ MHz machines.
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 4:48 pm
by barrym95838
I only remember that it was a traditional Apple II (with that case with a lid that you can easily open to access the slots). I turned it on, hacked the program in, took my times, and left. (In those days computer shop owners didn't like computer "kids" playing on their machines.)
If you use the "BasBenc1" version can you verify my (old) times?
I just confirmed A=44,8 B=71,8 (+/- 0,2) on a(n emulated) 1 MHz Apple ][+ running ROM Applesoft using a direct copy/paste of "BasBenc1". If you want, you can add C=657 to that row.
Mike B.
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 5:14 pm
by GaBuZoMeu
Thank you Mike
657 s = 11 minutes - too long to stay unnoticed in a computer shop.
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 5:40 pm
by BillO
Just tried it on a genuine enhanced Apple //e - R65C02 @ 1.023 MHz - A=44.6, B=71.6. It has the same clock speed as the Apple II and II+.
That pretty well duplicates your and Mike's results. I'll put down the 0.2 s improvement as being due to my being a musician and a dirt bike rider.
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 5:46 pm
by GaBuZoMeu
I'll put down the 0.2 s improvement as being due to my being a musician and a dirt bike rider.

Hmm, what might count more? The musician or the rider ??

Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 6:11 pm
by BillO
I think the rider - faster reflexes mean survival

Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 8:23 pm
by John West
This will be a good test for my 65020. I'll have a go at implementing both algorithms tonight. The simulator gives an approximate cycle count, so we can do hand-wavy comparisons.
That was fun. I've got a 65020 translation of the second Pascal version running. The source is below, although it isn't pretty.
The times are only estimates, and it's very possible it's counting them wrong. I've assumed that mul, div, and mod take the usual cycles to fetch opcode and operands, plus one cycle per bit (that doesn't sound unreasonable for mid-1980s technology). The C640 will probably end up running at 5 MHz (which also doesn't sound unreasonable for an improved Commodore 64), and I've translated the cycle counts with that assumption:
A: 194368 cycles = 39ms
B: 331858 cycles = 66ms
C: 4603325 cycles = 0.92s
D: 12183830 cycles = 2.44s
E: 23183225 cycles = 4.64s
F: 700264537 cycles = 140.05s
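The conversion from cycle counts to times is just a division by the assumed clock rate. A small Python sketch of that arithmetic, using the cycle counts listed above (`CLOCK_HZ` is the 5 MHz assumption from the post; `cycles_to_seconds` is a made-up helper name):

```python
CLOCK_HZ = 5_000_000   # assumed 5 MHz clock, as stated in the post

# simulator cycle counts for cases A-F, copied from the list above
cycles = {"A": 194368, "B": 331858, "C": 4603325,
          "D": 12183830, "E": 23183225, "F": 700264537}


def cycles_to_seconds(n, clock_hz=CLOCK_HZ):
    """Convert a cycle count to seconds at the given clock rate."""
    return n / clock_hz


for case, n in cycles.items():
    print(f"{case}: {n:>10} cycles = {cycles_to_seconds(n):.3f} s")
```

Plugging in a different clock rate rescales all six figures, which makes it easy to redo the comparison if the C640 ends up at another speed.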
Code:
Input file primes.asm:
= 000003e8 1 range = 1000
= 00000014 2 mindiff = 20
3 ;range = 2000
4 ;mindiff = 30
5 ;range = 9999
6 ;mindiff = 35
7 ;range = 32000
8 ;mindiff = 50
9 ;range = 32000
10 ;mindiff = 70
11 ;range = 500000
12 ;mindiff = 100
13
14 * = $c000
15
0000c000: 00a9 0015 16 lda #$15
0000c002: 0085 d018 17 sta $d018
0000c004: 00a9 0006 18 lda #6
0000c006: 0085 d021 19 sta $d021
20
0000c008: 00a9 0020 21 lda #32
0000c00a: 20a9 000e 22 lda a1, #14
0000c00c: 01a2 03e7 23 ldx.w #999
0000c00e: 24 clear
0000c00e: 0095 0400 25 sta $0400,x
0000c010: 2095 d800 26 sta a1, $d800, x
0000c012: 01ca 27 dex.w
0000c013: 0010 00f9 28 bpl clear
29
30
0000c015: 02a5 df00 31 lda.l $df00
0000c017: 0285 c04f 32 sta.l startTime
33
= 00000001 34 prim0 = 1
= 00000002 35 incr = 2
36
0000c019: 02a2 0001 0000 37 ldx.l x0, #prim0 ; x0 = cnt
0000c01c: 22a2 0001 0000 38 ldx.l x1, #prim0 ; x1 = loprim
0000c01f: 42a2 0001 0000 39 ldx.l x2, #prim0 ; x2 = hiprim
40
0000c022: 41 loop
0000c022: 22e0 00e8 0003 42 cpx.l x1, #range
0000c025: 8010 002a 43 bge loopEndFail
0000c027: 428a 44 mov.l a0, x2
0000c028: 96fc 45 sub.l a0, x1 ; a0 = hiprim - loprim
0000c029: 02c9 0014 0000 46 cmp.l a0, #mindiff
0000c02c: 8010 0009 47 bge loopEndPass
48
0000c02e: 22e8 49 inx.l #incr, x0
0000c02f: 90f0 003b 50 bra.l prim
0000c031: 00f0 00ef 51 beq loop
0000c033: 5e9a 52 mov.l x1, x2
0000c034: 129a 53 mov.l x2, x0
0000c035: 80f0 00eb 54 bra loop
55
0000c037: 56 loopEndPass
0000c037: 01a0 040a 57 ldy.w y0, #$0400 + 10
0000c039: 3a9a 58 mov.l x0, x1
0000c03a: 0020 0083 00c0 59 jsr printNum
0000c03d: 01a0 0432 60 ldy.w y0, #$0400 + 40 + 10
0000c03f: 5a9a 61 mov.l x0, x2
0000c040: 0020 0083 00c0 62 jsr printNum
0000c043: 02a6 df00 63 ldx.l x0, $df00
0000c045: 92eb 004f 00c0 64 sbx.l x0, startTime
0000c048: 01a0 045a 65 ldy.w y0, #$0400 + 80 + 10
0000c04a: 0020 0083 00c0 66 jsr printNum
0000c04d: 80f0 001b 67 bra done
68
0000c04f: 69 startTime
0000c04f: 0000 0000 70 .long 0
71
0000c051: 72 loopEndFail
0000c051: 02a9 0058 00c0 73 lda.l a0, #failMessage
0000c054: 90f0 0007 74 bra.l print
0000c056: 80f0 0012 75 bra done
76
0000c058: 77 failMessage
0000c058: 0006 0001 0009 000c 78 .byte 6, 1, 9, 12, 0 ; "FAIL"
0000c05c: 0000
79
0000c05d: 80 print
0000c05d: 00a2 0000 81 ldx #0
0000c05f: 82 printLoop
0000c05f: 30b5 0000 83 lda a1, 0, a0
0000c061: 00f0 0006 84 beq printDone
0000c063: 2095 0400 85 sta a1, $0400, x
0000c065: 00e8 86 inx
0000c066: 12e8 87 inx.l a0
0000c067: 80f0 00f6 88 bra printLoop
0000c069: 89 printDone
0000c069: 0060 90 rts
91
0000c06a: 92 done
0000c06a: 80f0 00fe 93 bra done
94
95 ; input X0
96 ; output A0
0000c06c: 97 prim
0000c06c: 2048 98 pha a1
0000c06d: 20a9 0003 99 lda a1, #3 ; a1 = i
0000c06f: 100 primLoop
0000c06f: 32a8 101 mov.l a0, a1
0000c070: 0283 102 mul.l a0, a0
0000c071: 12dc 103 cmp.l a0, x0
0000c072: 8010 000b 104 bge primEndTrue
0000c074: 028a 105 mov.l a0, x0
0000c075: 06a3 106 mod.l a0, a1
0000c076: 00f0 0003 107 beq primEndFalse
0000c078: 36e8 108 inx.l #2, a1
0000c079: 80f0 00f4 109 bra primLoop
0000c07b: 110 primEndFalse
0000c07b: 2068 111 pla a1
0000c07c: 00a9 0000 112 lda #0
0000c07e: 0060 113 rts
0000c07f: 114 primEndTrue
0000c07f: 2068 115 pla a1
0000c080: 00a9 0001 116 lda #1
0000c082: 0060 117 rts
118
0000c083: 119 printNum
0000c083: 068a 120 mov.l a1, x0
0000c084: 121 printNumLoop
0000c084: 32a8 122 mov.l a0, a1
0000c085: 02a7 000a 0000 123 mod.l a0, #10
0000c088: 8069 0030 124 add a0, #48
0000c08a: 1085 0000 125 sta 0, y0
0000c08c: 0188 126 dey.w y0
0000c08d: 2297 000a 0000 127 div.l a1, #10
0000c090: 00d0 00f2 128 bne printNumLoop
0000c092: 0060 129 rts
130
Re: slightly OT: a simple Benchmark
Posted: Wed Jul 04, 2018 8:46 pm
by barrym95838
Thank you Mike
657 s = 11 minutes - too long to stay unnoticed in a computer shop.
...and also too long for me to pay attention to the clock. The number didn't look like it fit in with the rest, so I re-ran it and discovered that I lost two minutes somewhere. The "real" result is C=777 for Applesoft. Sorry about the error ...
Mike B.