slightly OT: a simple Benchmark

barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: slightly OT: a simple Benchmark

Post by barrym95838 »

GaBuZoMeu wrote:

APPLE II           Rom-Basic         BasBenc1            44,8    71,8  ____,_  ____,_  ____,_   ____,_
I think you might be quoting the Apple II+ times here. I got A=22.5, B=38.5, C=526 for a Woz BASIC Apple ][, using a brutally simple-minded translation of the VTL-2 version:

  100 INPUT "RANGE, MIN.DIFF ",A,B
  120 Z=3
  130 C=1
  140 C=C+2
  150 D=1
  160 D=D+2
  170 IF C MOD D=0 THEN 210
  180 IF C>=D*D THEN 160
  190 IF C>=B+Z THEN 230
  200 Z=C
  210 IF A>=C THEN 140
  220 PRINT "NO SOLUTION": GOTO 260
  230 PRINT C;", ";Z
  260 PRINT : GOTO 100: END
If you don't use any additional features of Integer BASIC (like string handling, those pesky FOR/NEXT loops and those fancy signed integers), the times for Integer BASIC are consistently 18% slower than VTL02. :-)
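
For readers who don't speak Integer BASIC, here is a rough Python transliteration of the listing above. It is only a sketch: the function name `first_prime_gap` and the return convention (a tuple, or `None` for "NO SOLUTION") are made up for illustration, and the comments map each step back to the BASIC line numbers. The example values 1000 and 20 are the `range` and `mindiff` constants from the 65020 listing in this thread.

```python
def first_prime_gap(limit, min_diff):
    """Transliteration of the BASIC listing; returns (c, z) or None."""
    z = 3                          # line 120: last prime seen so far
    c = 1                          # line 130
    while True:
        c += 2                     # line 140: next odd candidate
        d = 1                      # line 150
        is_prime = False
        while True:
            d += 2                 # line 160: next odd trial divisor
            if c % d == 0:         # line 170: divisible, so not prime
                break
            if c >= d * d:         # line 180: more divisors to try
                continue
            is_prime = True        # fell through to line 190
            break
        if is_prime:
            if c >= min_diff + z:  # line 190: gap is big enough
                return c, z        # line 230 prints C, then Z
            z = c                  # line 200
        if limit < c:              # line 210 (condition inverted)
            return None            # line 220: "NO SOLUTION"


if __name__ == "__main__":
    print(first_prime_gap(1000, 20))   # (907, 887)
```

Like the BASIC original, this never accepts 3 as a candidate (3 MOD 3 is 0), which is why Z starts out at 3.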

Mike B.
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

Isn't that funny? Running your VTL02 re-translation on EhBasic doesn't work. INPUT "blabla",A isn't accepted. And - ? - line 170 gives no error but doesn't work. EhBasic sadly doesn't know a modulo operator - at least I didn't find one. Using IF INT(C/D)*D=C THEN 210 works, but then it took roughly 18 s for the first range (on the Badge). I have no idea why this is so badly slow or why using FOR..NEXT and SQR() runs that much faster :?:
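
As a quick sanity check of the INT(C/D)*D=C workaround, the following Python sketch compares it against a real modulo test. The function names are hypothetical, and Python's double-precision division only stands in for EhBasic's own floating point, so this says nothing about EhBasic's speed or rounding.

```python
import math


def divisible_mod(c, d):
    # What line 170 does in Integer BASIC: IF C MOD D=0
    return c % d == 0


def divisible_int(c, d):
    # The workaround: IF INT(C/D)*D=C, using a floating-point quotient
    return math.floor(c / d) * d == c


if __name__ == "__main__":
    same = all(divisible_mod(c, d) == divisible_int(c, d)
               for c in range(1, 2000) for d in range(1, 60))
    print(same)
```

For the small positive integers the benchmark uses, both tests should agree.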
barrym95838 wrote:
Erm ... aren't you adding an extra iteration about 50% of the time with that +1? I can't think of any BASICs that re-calculate the limit on every iteration anyhow ...

Mike B.
I skipped that +1 but it doesn't change much - in fact with a wrist stopwatch I get the same numbers +/- 0.1 s - but still slightly slower than using SQR().
barrym95838 wrote:
I think you might be quoting the Apple II+ times here. I got A=22.5, B=38.5, C=526 for a Woz BASIC Apple ][, using a brutally simple-minded translation of the VTL-2 version
Mike B.
I only remember that it was a traditional Apple II (with that case with a lid that you can easily open to access the slots). I turned it on, hacked the program in, took my times, and left. (In those times computer shop owners didn't like computer "kids" playing on their machines :wink: )

If you use the "BasBenc1" version can you verify my (old) times?


Arne
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

commodorejohn wrote:
GaBuZoMeu wrote:
commodorejohn wrote:
(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)
:lol: :lol: :lol:
You may modify the input. Skip it and work with constants. Then run it a thousand times or so. That should sum up to something detectable.
Good point :D Funnily enough, I also had to modify the long version to use long long because of the ambiguity with C type sizes (long and int are both 32-bit under GCC for VAX, apparently.) Here's the results (problems A-E run 1000x apiece, problem F run 100x):

        A       B       C       D       E       F
int     2.48    4.17    53.61   139.26  262.22  7653.10
long*   13.24   22.61   312.63  826.16  1570.51 46712.30
* (long = long long)
MicroVAX 3100/90, NetBSD 6.1.5 - all times in milliseconds per run.
COOL !!
Thanks very much for these numbers!

That old lady is pretty nimble, isn't she? :D
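
The trick quoted above (run each problem many times with constant inputs and divide the total by the run count) can be sketched like this. Python, the helper name `ms_per_run` and the dummy workload are all assumptions for illustration; the numbers in the table came from a C program built with GCC on NetBSD/vax.

```python
import time


def ms_per_run(func, runs):
    """Average wall-clock time of func() in milliseconds over `runs` calls."""
    start = time.perf_counter()
    for _ in range(runs):
        func()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


if __name__ == "__main__":
    # Dummy workload standing in for one benchmark problem.
    print(ms_per_run(lambda: sum(range(1000)), 1000))
```

The point is only that a clock with coarse resolution still gives a usable figure when the measured interval is stretched by repetition.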


Arne
commodorejohn
Posts: 299
Joined: 21 Jan 2016
Location: Placerville, CA

Re: slightly OT: a simple Benchmark

Post by commodorejohn »

Heh :D I was actually quite surprised by how responsive it was when I finally got it all properly set up recently - I tried OpenBSD on it once when I got it ages ago, and that was an absolute dog - I can only assume that their focus on security and encryption probably overtaxed the poor thing :| But it's quite usable on NetBSD, even over a 9600 bps console port (Telnetting into it is a vast improvement, though!)

Love to try it on an original 780, but I'm afraid I don't have one of those to hand :lol:

(The 3100/90 weighs in at about 24 VUPS, though, so I guess we can make a rough estimate from that...)

Edit: oh hahahaha I just remembered the Living Computer Museum has a pile of interesting mainframes and minis available to the general public...
Chromatix
Posts: 1462
Joined: 21 May 2018

Re: slightly OT: a simple Benchmark

Post by Chromatix »

Still debugging my 65c02 assembly version of the benchmark. Cases A-E actually run correctly already (within my own emulator), but case F produces the wrong result - of course, that's the only one that hits the 24-bit path, which is longer and more complex. Luckily my emulator is already set up to produce detailed traces, which I can pore through to figure this stuff out.
John West
Posts: 383
Joined: 03 Sep 2002

Re: slightly OT: a simple Benchmark

Post by John West »

GaBuZoMeu wrote:
Well, for all of you who are going to squeeze out every cycle and every bit of their machines: try to find the first gap that spans across 222. The numbers end with xxx969 and xxx747.
I cheated, and used a completely different algorithm. 3 seconds for my work PC. It finds the first gap of 320 in 31 seconds (...2549 to ...2869)

Can you find the first gap of 382 (...4659 to ...5041)?

This will be a good test for my 65020. I'll have a go at implementing both algorithms tonight. The simulator gives an approximate cycle count, so we can do hand-wavy comparisons.
BillO
Posts: 1038
Joined: 12 Dec 2008
Location: Canada

Re: slightly OT: a simple Benchmark

Post by BillO »

A few more data points for you GaBuZoMeu. All use the BasBenc1 code:

Tandy Coco 3 - 6809 @ 0.89MHz - Disk Extended Basic V1.1 - A=60.6, B=97.9, C=1090.1
  "    "   " -   "  " 1.78MHz -  "       "      "     "  - A=30.4, B=49.0, C=545.1

OSI 600 (Superboard) - 6502 @ 0.9825MHz - OSI Basic V1.0 (MS) - A=25.3, B=52.1, C=642.4

Jaguar (homebrew) - W65C02 @ 16MHz - EhBasic V2.22 - A=1.46, B=2.42, C=31.40, D=82.29, E=156.06
Looks like my Jaguar project is the fastest FP BASIC machine - 8) Faster than some compiled C and Pascal implementations too!
Bill
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

THX for all your numbers. I will soon copy them into the table.

@BillO: nice little beast your Jaguar :)

@Chromatix: I never had enough drive to write an assembler version. In the beginning I simply didn't have one, and later I seldom used a cross assembler for the 6502, as there were so many more powerful machines and I only had my old SYM :) but no fancy 4+ MHz machines.
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: slightly OT: a simple Benchmark

Post by barrym95838 »

GaBuZoMeu wrote:
I only remember that it was a traditional Apple II (with that case with a lid that you can easily open to access the slots). I turned it on, hacked the program in, took my times, and left. (In those times computer shop owners didn't like computer "kids" playing on their machines :wink: )

If you use the "BasBenc1" version can you verify my (old) times?
I just confirmed A=44,8 B=71,8 (+/- 0,2) on a(n emulated) 1 MHz Apple ][+ running ROM Applesoft using a direct copy/paste of "BasBenc1". If you want, you can add C=657 to that row.

Mike B.
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

Thank you Mike :D

657 s = 11 minutes - too much for staying unnoticed in a computer shop :D
BillO
Posts: 1038
Joined: 12 Dec 2008
Location: Canada

Re: slightly OT: a simple Benchmark

Post by BillO »

barrym95838 wrote:
GaBuZoMeu wrote:
I only remember that it was a traditional Apple II (with that case with a lid that you can easily open to access the slots). I turned it on, hacked the program in, took my times, and left. (In those times computer shop owners didn't like computer "kids" playing on their machines :wink: )

If you use the "BasBenc1" version can you verify my (old) times?
I just confirmed A=44,8 B=71,8 (+/- 0,2) on a(n emulated) 1 MHz Apple ][+ running ROM Applesoft using a direct copy/paste of "BasBenc1". If you want, you can add C=657 to that row.

Mike B.
Just tried it on a genuine enhanced Apple //E - R65C02 @ 1.023MHz - A=44.6, B=71.6 - It has the same clock speed as the Apple II and II+.

That pretty well duplicates your and Mike's results - I'll put down the 0.2 s improvement as being due to my being a musician and a dirt bike rider. :mrgreen:
Bill
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

BillO wrote:
I'll put down the 0.2 s improvement as being due to my being a musician and a dirt bike rider. :mrgreen:
Hmm, what might count more? The musician or the rider ?? :P
BillO
Posts: 1038
Joined: 12 Dec 2008
Location: Canada

Re: slightly OT: a simple Benchmark

Post by BillO »

GaBuZoMeu wrote:
BillO wrote:
I'll put down the 0.2 s improvement as being due to my being a musician and a dirt bike rider. :mrgreen:
Hmm, what might count more? The musician or the rider ?? :P
I think the rider - faster reflexes mean survival :shock:
Bill
John West
Posts: 383
Joined: 03 Sep 2002

Re: slightly OT: a simple Benchmark

Post by John West »

John West wrote:
This will be a good test for my 65020. I'll have a go at implementing both algorithms tonight. The simulator gives an approximate cycle count, so we can do hand-wavy comparisons.
That was fun. I've got a 65020 translation of the second Pascal version running. The source is below, although it isn't pretty.

The times are only estimates, and it's very possible it's counting them wrong. I've assumed that mul, div, and mod take the usual cycles to fetch opcode and operands, plus one cycle per bit (that doesn't sound unreasonable for mid-1980s technology). The C640 will probably end up running at 5MHz (which also doesn't sound unreasonable for an improved Commodore 64), and I've translated the cycle counts with that assumption.

A: 194368 cycles = 39ms
B: 331858 cycles = 66ms
C: 4603325 cycles = 0.92s
D: 12183830 cycles = 2.44s
E: 23183225 cycles = 4.64s
F: 700264537 cycles = 140.05s
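
The conversion behind the figures above is just cycles divided by the assumed clock rate. A small Python sketch that reproduces it follows; the cycle counts are copied from the list, the 5 MHz clock is the assumption stated in the post, and the names `CLOCK_HZ`, `CYCLES` and `seconds` are made up for illustration.

```python
CLOCK_HZ = 5_000_000  # assumed 5 MHz clock, as stated in the post

# Simulator cycle counts for problems A-F, copied from the list above.
CYCLES = {
    "A": 194368,
    "B": 331858,
    "C": 4603325,
    "D": 12183830,
    "E": 23183225,
    "F": 700264537,
}


def seconds(cycles, clock_hz=CLOCK_HZ):
    """Run time in seconds for a given cycle count at clock_hz."""
    return cycles / clock_hz


if __name__ == "__main__":
    for name, count in CYCLES.items():
        print(f"{name}: {seconds(count):.3f} s")
```

Any other assumed clock can be tried by passing `clock_hz`, for example `seconds(194368, 1_000_000)` for 1 MHz.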

Input file primes.asm:
          = 000003e8               1 range = 1000
          = 00000014               2 mindiff = 20
                                   3 ;range = 2000
                                   4 ;mindiff = 30
                                   5 ;range = 9999
                                   6 ;mindiff = 35
                                   7 ;range = 32000
                                   8 ;mindiff = 50
                                   9 ;range = 32000
                                  10 ;mindiff = 70
                                  11 ;range = 500000
                                  12 ;mindiff = 100
                                  13 
                                  14 	* = $c000
                                  15 
0000c000: 00a9 0015               16 	lda #$15
0000c002: 0085 d018               17 	sta $d018
0000c004: 00a9 0006               18 	lda #6
0000c006: 0085 d021               19 	sta $d021
                                  20 
0000c008: 00a9 0020               21 	lda #32
0000c00a: 20a9 000e               22 	lda a1, #14
0000c00c: 01a2 03e7               23 	ldx.w #999
0000c00e:                         24 clear
0000c00e: 0095 0400               25 	sta $0400,x
0000c010: 2095 d800               26 	sta a1, $d800, x
0000c012: 01ca                    27 	dex.w
0000c013: 0010 00f9               28 	bpl clear
                                  29 
                                  30 
0000c015: 02a5 df00               31 	lda.l $df00
0000c017: 0285 c04f               32 	sta.l startTime
                                  33 
          = 00000001              34 prim0 = 1
          = 00000002              35 incr = 2
                                  36 
0000c019: 02a2 0001 0000          37 	ldx.l x0, #prim0	; x0 = cnt
0000c01c: 22a2 0001 0000          38 	ldx.l x1, #prim0	; x1 = loprim
0000c01f: 42a2 0001 0000          39 	ldx.l x2, #prim0	; x2 = hiprim
                                  40 
0000c022:                         41 loop
0000c022: 22e0 00e8 0003          42 	cpx.l x1, #range
0000c025: 8010 002a               43 	bge loopEndFail
0000c027: 428a                    44 	mov.l a0, x2
0000c028: 96fc                    45 	sub.l a0, x1		; a0 = hiprim - loprim
0000c029: 02c9 0014 0000          46 	cmp.l a0, #mindiff
0000c02c: 8010 0009               47 	bge loopEndPass
                                  48 
0000c02e: 22e8                    49 	inx.l #incr, x0
0000c02f: 90f0 003b               50 	bra.l prim
0000c031: 00f0 00ef               51 	beq loop
0000c033: 5e9a                    52 	mov.l x1, x2
0000c034: 129a                    53 	mov.l x2, x0
0000c035: 80f0 00eb               54 	bra loop
                                  55 
0000c037:                         56 loopEndPass
0000c037: 01a0 040a               57 	ldy.w y0, #$0400 + 10
0000c039: 3a9a                    58 	mov.l x0, x1
0000c03a: 0020 0083 00c0          59 	jsr printNum
0000c03d: 01a0 0432               60 	ldy.w y0, #$0400 + 40 + 10
0000c03f: 5a9a                    61 	mov.l x0, x2
0000c040: 0020 0083 00c0          62 	jsr printNum
0000c043: 02a6 df00               63 	ldx.l x0, $df00
0000c045: 92eb 004f 00c0          64 	sbx.l x0, startTime
0000c048: 01a0 045a               65 	ldy.w y0, #$0400 + 80 + 10
0000c04a: 0020 0083 00c0          66 	jsr printNum
0000c04d: 80f0 001b               67 	bra done
                                  68 
0000c04f:                         69 startTime
0000c04f: 0000 0000               70 	.long 0
                                  71 
0000c051:                         72 loopEndFail
0000c051: 02a9 0058 00c0          73 	lda.l a0, #failMessage
0000c054: 90f0 0007               74 	bra.l print
0000c056: 80f0 0012               75 	bra done
                                  76 
0000c058:                         77 failMessage
0000c058: 0006 0001 0009 000c     78 	.byte 6, 1, 9, 12, 0	; "FAIL"
0000c05c: 0000                
                                  79 
0000c05d:                         80 print
0000c05d: 00a2 0000               81 	ldx #0
0000c05f:                         82 printLoop
0000c05f: 30b5 0000               83 	lda a1, 0, a0
0000c061: 00f0 0006               84 	beq printDone
0000c063: 2095 0400               85 	sta a1, $0400, x
0000c065: 00e8                    86 	inx
0000c066: 12e8                    87 	inx.l a0
0000c067: 80f0 00f6               88 	bra printLoop
0000c069:                         89 printDone
0000c069: 0060                    90 	rts
                                  91 
0000c06a:                         92 done
0000c06a: 80f0 00fe               93 	bra done
                                  94 
                                  95 	; input X0
                                  96 	; output A0
0000c06c:                         97 prim
0000c06c: 2048                    98 	pha a1
0000c06d: 20a9 0003               99 	lda a1, #3		; a1 = i
0000c06f:                        100 primLoop
0000c06f: 32a8                   101 	mov.l a0, a1
0000c070: 0283                   102 	mul.l a0, a0
0000c071: 12dc                   103 	cmp.l a0, x0
0000c072: 8010 000b              104 	bge primEndTrue
0000c074: 028a                   105 	mov.l a0, x0
0000c075: 06a3                   106 	mod.l a0, a1
0000c076: 00f0 0003              107 	beq primEndFalse
0000c078: 36e8                   108 	inx.l #2, a1
0000c079: 80f0 00f4              109 	bra primLoop
0000c07b:                        110 primEndFalse
0000c07b: 2068                   111 	pla a1
0000c07c: 00a9 0000              112 	lda #0
0000c07e: 0060                   113 	rts
0000c07f:                        114 primEndTrue
0000c07f: 2068                   115 	pla a1
0000c080: 00a9 0001              116 	lda #1
0000c082: 0060                   117 	rts
                                 118 
0000c083:                        119 printNum
0000c083: 068a                   120 	mov.l a1, x0
0000c084:                        121 printNumLoop
0000c084: 32a8                   122 	mov.l a0, a1
0000c085: 02a7 000a 0000         123 	mod.l a0, #10
0000c088: 8069 0030              124 	add a0, #48
0000c08a: 1085 0000              125 	sta 0, y0
0000c08c: 0188                   126 	dey.w y0
0000c08d: 2297 000a 0000         127 	div.l a1, #10
0000c090: 00d0 00f2              128 	bne printNumLoop
0000c092: 0060                   129 	rts
                                 130 
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: slightly OT: a simple Benchmark

Post by barrym95838 »

GaBuZoMeu wrote:
Thank you Mike :D

657 s = 11 minutes - too much for staying unnoticed in a computer shop :D
...and also too long for me to pay attention to the clock. The number didn't look like it fit in with the rest, so I re-ran it and discovered that I lost two minutes somewhere. The "real" result is C=777 for Applesoft. Sorry about the error ...

Mike B.