slightly OT: a simple Benchmark

GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

slightly OT: a simple Benchmark

Post by GaBuZoMeu »

A question about execution speed comparisons (viewtopic.php?f=5&p=60986#p60969) prompted me to present a table in which I have recorded the execution times of various computers running a simple program. So as not to derail the original topic, I have moved it here.

Code: Select all

                                    B A S I C - B E N C H M A R K  (from BASBENCH.TAB, 28.08.93)
                                    =============================

┌───────────────────┬──────────────────────────────╥─────────────────────┬────────────────────────────┐
│       INPUT       │           RESULTS            ║        INPUT        │          RESULTS           │
│A:    1000 ,  20   │    907  ,      887  ,    20  ║  D:   32000 ,  50   │  19661  ,    19609  ,    52│
│B:    2000 ,  30   │   1361  ,     1327  ,    34  ║  E:   32000 ,  70   │  31469  ,    31397  ,    72│
│C:    9999 ,  35   │   9587  ,     9551  ,    36  ║  F:  500000 , 100   │ 370373  ,   370261  ,   112│
└───────────────────┴──────────────────────────────╨─────────────────────┴────────────────────────────┘
Execution times of various computers/programs (cf. listings where applicable; all times in seconds):
──────────────────────────────────────────────────────────────────────────────────────────────────────
Computer/CPU type  Language          Listing               A       B       C       D       E        F
-----------------  ----------------  -----------------  -----  ------  ------  ------  ------   ------
SYM (6502,1MHz)    Basic1.1          BasBenc1            46,7    75,1   812,7  2014,0  3696,1   ____,_
CBM 3032 (-"-)     Rom-Basic         BasBenc1            48,0    80,0  ____,_  ____,_  ____,_   ____,_
Badge (6502, 2MHz) VTL02C            ~Basic,UINT,MOD      9,6    16,3   222,8   590,0  1126,6
       "           EhBasic V2.22     BasBenc1            11,6    19,3   252,6   662,4  1257,3
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
APPLE II           Rom-Basic         BasBenc1            44,8    71,8   777,0  ____,_  ____,_   ____,_
       "           Woz Basic         ~VTL02              22,5    38,5   526,0
Apple //e, 128KB   Plasma2.0 (JIT)   ~C(V2) (int)         5,72    9,3   118,36 
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Atari 800XL        FastBasic_3.5     ~C(V2)(modulo)       5,3     9,0   123,3   326,1   620,4           THX 2 dsmc (post#63001)
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
C64 (6502, 1MHz)   Basic             BasBenc1            48,0   118,0  1412,0  3516,0 10447,0  45819,0  THX 2 Jeff_Birt
       "           DurexForth        very close to Basic 18,0    30,0   374,0  1530,0  2352,0           see post #62871
       "           DurexForth        -"- + sqrt2         18,0    26,0   334,0   861,0  1615,0           see post #63055
       "           DurexForth        -"- + sqrt2 in asm  12,0    19,0   276,0   740,0  1416,0           
       "           DurexForth        ~C(V2)(modulo..)    11,0    18,0   242,0   637,0  1209,0           
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
65C02 homebrew     dflat  Basic       int, modulo         3,95    6,51   81,61                          - 65C02 @ 5,36MHz; THX 2 dolomiah
65C02 pocket       EhBasic 2.22p4C    BasBenc1            2,78    4,68   61,92  162,51  308,33          - 65C02 @ 8Mhz; THX 2 floobydust
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
OSI 600(Superbrd.) OSI Basic V1.0 (MS)   "               25,3    52,1   642,4                           - 6502 @ 0.9825Mhz
Jaguar (homebrew)  EhBasic V2.22         "                1,46    2,42   31,40   82,29  156,06          - W65C02 @ 16MHz
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
BBC (65C02, 4MHz)  HiBASIC (IV)      ~BasBenc1            5,87    9,96  137,27  363,96  694,67
       "           running on 2.µP   +% cached SQR()      3,56    5,96   78,7   207,93  397,85
B-Em (6502, 4MHz)  Assembler         ~Pascal V2           0,09    0,15    2,54    7,70   17,53   856,79 ~ var.len.div., THX 2 Chromatix
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
65020 (5MHz)       Assembler         ~Pascal V2           0,039   0,066   0,92    2,44    4,64   140,05
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
MC68000 (4MHz)     Assembler         Assembler V1.0 int   0,x     0,x     3,9     9,9    18,5   ____,_
       "                  "          Assembler V1.1 int   0,x     0,x     3,0     7,6    14,1   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
TRS-80 M 1 (Z80)   Level2-Basic      BasBenc1            68,7   112,1  ____,_  ____,_  ____,_   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
ExorSet 163        Basic09           ≈ Pascal V1.0       10,5    16,5   183,8  ____,_  ____,_   ____,_
CoCo3 (0,89MHz)    Disk Ext.Basic1.1 BasBenc1            60,6    97,9  1090,1
  "   (1.78Mhz)         "                "               30,4    49,0   545,1
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
EurocomII (6809)   Basic V3.1        BasBenc1            18,3    30,6   403,2  ____,_  ____,_   ____,_
E II V7 (1.34MHz)  Extended Basic    BasBenc1            55,6    93,4  ____,_  ____,_  ____,_   ____,_
       "           ( TSC GBAS )      Integer variables   42,1    69,8  ____,_  ____,_  ____,_   ____,_
       "           OmegaSoft PASCAL  see Listing V1.0     5,2     8,2    81,2   196,8   352,0   ____,_
       "           Compiler V 1.10   see Listing V2.0     3,4     5,5    67,5   173,2   323,1   ____,_
       "           Windrush C Comp.  V2.0 (int)           2,4     4,1    51,0   131,8   245,8   ____,_
       "           noStkchk+optimize V2.0 (long)         10,8    18,3   249,5   654,6  1236,1   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Amiga 2000 (68K)   AmigaBasic V      BasBenc1, ShortFP    9,3    15,3   190,5  ____,_  ____,_   ____,_
       "                  "          BasBenc1, Integer    8,0    13,0   158,1  ____,_  ____,_   ____,_
       "           Lattice-C V3.10   C-Listing V1 ieee    6,4    11,0   152,7   401,8   784,0   ____,_
       "                  "                       ffp     1,8     2,8    35,7    94,4   179,1   5740,0
       "                  "          C-Listing V2 long    0,x     0,x     6,6    16,8    31,7   1094,2
       "           AM-Modula-2 V3.1  Modula2-Vers. 1.0    1,2     1,8    12,5    27,0  ____,_   ____,_
       "                  "          Modula2-Vers. 1.1    1,3     1,9    13,7    30,2    52,3   1013,1
       "                  "          Modula2-Vers. 2.0    0,x     0,x     8,2    20,8    39,3   1129,2
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
PC-XT              TurboPascal 3.02  Pascal V1.0 (16Bit)  2,4     3,5    25,6    51,9  ____,_   ____,_
(8088,10 MHz)             "          Pascal V2.0 (16Bit)  0,x     0,x     4,2    10,5  ____,_   ____,_
       "           TurboPascal 4.0   Pascal V1.0 (16Bit)  2,0     3,0    20,6    41,7  ____,_   ____,_
(V20 , 10 MHz)            "                   "           1,9     2,7    19,1    38,2    65,2   ____,_
(8088, 10 MHz)            "          Pascal V2.0 (16Bit)  0,x     0,x     3,4     8,3  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x     2,1     5,0     9,1   ____,_
(8088, 10 MHz)            "          Pascal V2.0 (32Bit)  0,x     0,x  ____,_    68,7  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x    23,9    63,0   119,9   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
(8088, 10 MHz)     Turbo-C V1.5      C-Listing V1.0       5,8     9,9   137,2  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           5,0     8,6   118,7  ____,_  ____,_   ____,_
(8088, 10 MHz)            "          C-Listing V2.0 long  0,x     0,x     8,5    21,9  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x     7,0    18,0    33,8   3602,0
       "                  "          C-Listing V2.0 int   0,x     0,x     1,2     2,8     5,1   ____,_
(Inboard-PC,16MHz)        "          C-Listing V2.0 long                                  9,7    976,2
(      "         )        "          ---"--- + 286-Option                                 9,4   1019,4 !!
(      "         )        "          C V3.0 long, without 286                             9,2    982,8
(      "         )        "          ---"---, with 286,wordaligned                        9,3    986,8
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
(8088, 10 MHz)     GW-Basic V3.22    BasBenc1, ShortFP    9,3    15,3   195,7  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           8,7    14,4   183,8  ____,_  ____,_   ____,_
(8088, 10 MHz)     GW-Basic V3.22    BasBenc1, Integer    7,7    12,6   156,5  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           7,3    12,0   148,4  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "          40 C% mod D% = 0 ?   6,1     9,7   115,7  ____,_  ____,_   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
80552 (11.06MHz)   MI-C8051 V-5x218  C-Listing V2.0 int   2,1     3,5    46,9   125,3   239,6   ____,_
       "                  "          C-Listing V2.0 long  4,8     8,1   113,8   304,4   583,1  18210,0
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
T800 (20MHz)       Occam V.0.88      Occam V.1.0 (32Bit)  0,x     0,x     0,3     0,8     1,5     45,5
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
386DX40 128K Cache Quick C 1.0       BasBenc2.C (long)  all optimizations for speed       3,1     84,9
  + CoPro                 "          BasBenc1.C (float)                                   8,2    240,6
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
RasPi 2B (0,9GHz)  Python 3.4.2      prime2.py            0,01    0,02    0,21    0,53    0,93    28,48
       "           pypy3-2.4.0            "                                               0,14     1,45
       "           C-Cortex-A7       (Chromatix)                          0,002   0,004   0,007    0,183
       "           gcc                    "                               0,005   0,013   0,025    0,804
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Arduino Nano V3    Sketch V1.8.5     ~C(V2) int           0,032   0,055   0,80    2,15    4,11
(16MHz ATmega328)                    ~C(V2) long          0,094   0,163   2,33    6,38   11,91   366,4
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
PC,i7-8700,3.2GHz  LuaJIT            ~PASCAL(V2)          0,005   0,010   0,2     0,5     0,9     26,0 <= ms !!!
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
MicroVAX 3100/90   GCC for VAX       C(V2) (int32)        2,48    4,17   53,61  139,26  262,22  7653,1 <= ms !!!
       "                             C(V2) (int64)       13,24   22,61  312,63  826,16 1570,51 46712,3 <= ms !!!
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙



The table dates back to around 1980/81, when I was playing with a SYM-1 with 4K RAM and Synertek-BASIC (=Microsoft). The benchmark program simply searches for prime number gaps: "BasBenc1"

Code: Select all

  1 REM
  2 REM         B A S I C - B E N C H
  3 REM
  4 REM The program tests whether, in the range [3..A] of the natural
  5 REM numbers, there are two consecutive primes P1 < P2 whose
  6 REM difference is greater than or equal to B ( P2 - P1 >= B ).
  7 REM
  8 REM
 10 ZS = 3 : INPUT A,B
 20 FOR C = 3 TO A STEP 2
 30 FOR D = 3 TO SQR(C) STEP 2
 40 IF INT(C/D)*D = C THEN 80
 50 NEXT D
 60 IF C-ZS >= B THEN PRINT C,ZS,C-ZS : GOTO 10
 70 ZS = C
 80 NEXT C
 90 PRINT " KEINE LOESUNG " : GOTO 10
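For anyone who wants to check the reference results without a BASIC interpreter, here is a rough Python transliteration of BasBenc1. It is only a sketch: the name `basbench` is made up for illustration, and the `while True` loop assumes Microsoft-style FOR/NEXT semantics, where the loop body always runs at least once.

```python
import math


def basbench(a, b):
    """Return (p2, p1, p2 - p1) for the first pair of consecutive primes
    in [3..a] whose difference is at least b, or None if there is none."""
    zs = 3  # last prime found so far (ZS in the BASIC listing)
    for c in range(3, a + 1, 2):
        limit = math.sqrt(c)  # line 30: upper bound of the divisor loop
        d = 3
        composite = False
        while True:  # assumed: the FOR body executes at least once
            if (c // d) * d == c:  # line 40: INT(C/D)*D = C
                composite = True
                break
            d += 2
            if d > limit:
                break
        if composite:
            continue  # line 80: NEXT C
        if c - zs >= b:  # line 60
            return c, zs, c - zs
        zs = c  # line 70
    return None


if __name__ == "__main__":
    # Inputs A, B and C from the box at the top of the table.
    for a, b in ((1000, 20), (2000, 30), (9999, 35)):
        print(a, b, basbench(a, b))
```

With inputs A to C this should reproduce the triples in the results box at the top of the table, for example 907, 887, 20 for input A.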
I tried to gain more speed using the ability of MS-Basic to deal with integers instead of floats:

Code: Select all

1  REM Basic-Bench à la SYM etc.
2  REM **** Integer version ****
10 ZS% = 3: INPUT A%,B%
20 FOR C% = 3 TO A% STEP 2
30 FOR D% = 3 TO SQR(C%) STEP 2
40 IF (C%\D%)*D% = C% THEN 80
50 NEXT D%
60 IF C%- ZS% >= B% THEN PRINT C%,ZS%,C%-ZS%: GOTO 10
70 ZS% = C%
80 NEXT C%
90 PRINT "keine Lösung gefunden !": GOTO 10
Using modulo instead of (C%\D%)*D% was the last step:

Code: Select all

1  REM Basic-Bench à la SYM etc.
2  REM *** Integer version + modulo function ***
10 ZS% = 3: INPUT A%,B%
20 FOR C% = 3 TO A% STEP 2
30 FOR D% = 3 TO SQR(C%) STEP 2
40 IF C% MOD D% = 0 THEN 80
50 NEXT D%
60 IF C%- ZS% >= B% THEN PRINT C%,ZS%,C%-ZS%: GOTO 10
70 ZS% = C%
80 NEXT C%
90 PRINT "keine Lösung gefunden !": GOTO 10
On a 6809 system running FLEX-9 I could try more languages: Basic, Pascal, and C. The first Pascal attempt was a straight translation:

Code: Select all

program BENCH1( INPUT,OUTPUT );

{ This program is the Pascal version of Basic-Bench.
  It tests whether the range 3 to A contains two consecutive
  primes (p1,p2) such that: p2-p1 >= B }

const prim0 = 3; incr = 2;
var range,mindiff,prim1,prim2,cnt1,cnt2,limit1 : integer;
    flag1 : boolean;

begin { main }

range := 0; mindiff := 0; prim1 := prim0; prim2 := prim1; cnt1 := prim0;

while range <= prim0 do begin
  writeln; write(' Gebe obere Grenze und minimale Differenz ein : ');
  readln( range,mindiff ); writeln
  end;

while (cnt1 <= range) and (prim2-prim1 < mindiff) do begin
  limit1 := round(sqrt(cnt1));
  flag1 := true; cnt2 := prim0;
  while (cnt2 <= limit1) and flag1 do begin
    flag1 := not( cnt1 mod cnt2 = 0 );
    cnt2 := cnt2 + incr
    end;
  if flag1 then begin
    prim1 := prim2 ; prim2 := cnt1
    end;
  cnt1 := cnt1 +incr
  end;

if prim2-prim1 >= mindiff
  then writeln(' Ergebnis : ',prim2:8,prim1:8,prim2-prim1:8)
  else writeln(' keine Loesung gefunden ')

end.
Then I discovered a method to get around using SQRT():

Code: Select all

program BENCH2( INPUT,OUTPUT );

{ This program is the Pascal version of Basic-Bench.
  It tests whether the range 3 to A contains two consecutive
  primes (p1,p2) such that: p2-p1 >= B .
  Unlike version BENCH1, this one is written in idiomatic Pascal. }

const prim0 = 1; incr = 2;
var range,mindiff,cnt,hiprim,loprim : integer;

function prim(x : integer): boolean;
const step =2; init = 3;
var i : integer;
begin
  i := init;
  while (( i*i < x ) and ( x mod i <> 0 )) do
    i := i + step;
  prim :=  x < i*i ;
end; { of prim }

begin { main }

  writeln; write(' Gebe obere Grenze und minimale Differenz ein : ');
  readln( range,mindiff ); writeln;

  loprim := prim0;
  hiprim := prim0;
  cnt := prim0;

  while (cnt < range) and (hiprim-loprim < mindiff) do begin
    cnt := cnt + incr;
    if prim(cnt) then begin
      loprim := hiprim;
      hiprim := cnt
    end;
  end;

  writeln;
  if hiprim - loprim >= mindiff
  then writeln(' Ergebnis : ',hiprim:8,loprim:8,hiprim-loprim:8)
  else writeln(' keine Loesung gefunden ')

end.
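The trick in `prim` is to compare `i*i` with `x` instead of comparing `i` with `sqrt(x)`. A minimal Python sketch of BENCH2 under that reading follows. `first_gap` is a hypothetical name, and `prim` is only meaningful for odd arguments of 3 or more, which is all the main loop ever passes in.

```python
def prim(x):
    """Trial division by odd numbers 3, 5, 7, ... without a square root."""
    i = 3
    while i * i < x and x % i != 0:
        i += 2
    # The loop stops when i divides x (then i*i < x still holds) or when
    # i*i >= x. Only i*i > x means that no divisor was found.
    return x < i * i


def first_gap(limit, min_gap):
    """Main loop of BENCH2: returns (hiprim, loprim) or None."""
    loprim = hiprim = cnt = 1
    while cnt < limit and hiprim - loprim < min_gap:
        cnt += 2
        if prim(cnt):
            loprim, hiprim = hiprim, cnt
    if hiprim - loprim >= min_gap:
        return hiprim, loprim
    return None


if __name__ == "__main__":
    print(first_gap(1000, 20))
```

Note that a perfect square such as 9 or 25 leaves the loop with `i*i == x`, which is why the final test is a strict `<`.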
The C code using floats is lost, but it should correspond to the first Pascal version. The V2 variant using integers:

Code: Select all

#include <stdio.h>
/* #include <math.h> */

int getnum()
{ char s[80];
  gets(s);
  return(atoi(s));
}

int prim(x) /* if x is prime return 1 else 0 */
int x;
{ int i = 3;
  while ((i*i < x) && (x % i) != 0)
    i+=2;
  if (x < i*i) return(1);
  else         return(0);
}

void main()
{ int a,b,i,loprim,hiprim;
  printf("\nGeben Sie den oberen Grenzwert ein :");
  a = getnum();
  printf("\nGeben Sie die minimale Differenz ein :");
  b = getnum();
  loprim = hiprim = 1;
  for (i=3; i <= a && (hiprim-loprim < b); i+=2)
  { if (prim(i))
    { loprim = hiprim; hiprim = i;
    }
  };
  if (hiprim-loprim < b)
    printf("keine Loesung gefunden !\n");
  else
    printf("%12d %12d \n",hiprim,loprim);
}
Using longs:

Code: Select all

#include <stdio.h>
#include <ctype.h>
/* #include <math.h> */

long getnum()
{ char c;
  long res=0;
  while(isdigit(c=getchar()))
    res = res*10+(c & 15);
  return(res);
}

int prim(x) /* if x is prime return 1 else 0 */
long x;
{ long i = 3;
  while ((i*i < x) && (x % i) != 0)
    i+=2;
  if (x < i*i) return(1);
  else         return(0);
}

void main()
{ long a,b,i,loprim,hiprim;
  printf("\nGeben Sie den oberen Grenzwert ein :");
  a = getnum();
  printf("\nGeben Sie die minimale Differenz ein :");
  b = getnum();
  loprim = hiprim = 1;
  for (i=3; i <= a && (hiprim-loprim < b); i+=2)
  { if (prim(i))
    { loprim = hiprim; hiprim = i;
    }
  };
  if (hiprim-loprim < b)
    printf("keine Loesung gefunden !\n");
  else
    printf("%12ld %12ld \n",hiprim,loprim);
}
The most recently written version is in VTL02C (thanks to Mike B. and Klaus):

Code: Select all

100 ?="Range = ";
105 A=?
110 ?="min.Diff = ";
115 B=?
120 Z=3
130 C=1
140 C=C+2
150 D=1
160 D=D+2
165 X=C/D
170 #=%=0*210
180 #=C>(D*D)*160
190 #=C>(B+Z)*230
200 Z=C
210 #=A>C*140
220 ?="KEINE LOESUNG"
225 #=260
230 ?=C
240 ?=", ";
250 ?=Z
260 ?=""
270 #=100
If any of you run one or more of these programs, or perhaps a Forth version (!), with whatever adaptations are needed to get them running, please post your results and I will add entries to the table from time to time.


Cheers
Arne
Last edited by GaBuZoMeu on Sun Nov 18, 2018 4:46 pm, edited 12 times in total.
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

(reserved)
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: slightly OT: a simple Benchmark

Post by barrym95838 »

Thanks for the great chart, sir! I needed a little help with the German, but Google has been patiently helpful.

That RasPi is a potent little unit, eh?

Was the ExorSet 163 a 4MHz 6809 system?

Mike B.
commodorejohn
Posts: 299
Joined: 21 Jan 2016
Location: Placerville, CA

Re: slightly OT: a simple Benchmark

Post by commodorejohn »

Hmm, now I just need to figure out how to time this quasi-accurately on NetBSD/vax...no benchmark is complete without a VAX result, after all :D
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: slightly OT: a simple Benchmark

Post by barrym95838 »

Quote:
...no benchmark is complete without a VAX result, after all :D
Agreed. It would be nice to have a Cray-1 result as well.

I want to translate the VTL02 version to optimized 65c802 and 68hc12 (and maybe even 65m32a) assembly, but I have a dozen other things on my plate at the moment ... I'll get them eventually, if someone more capable doesn't beat me to them.

Mike B.
commodorejohn
Posts: 299
Joined: 21 Jan 2016
Location: Placerville, CA

Re: slightly OT: a simple Benchmark

Post by commodorejohn »

(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)
GaBuZoMeu
Posts: 660
Joined: 01 Mar 2017
Location: North-Germany

Re: slightly OT: a simple Benchmark

Post by GaBuZoMeu »

commodorejohn wrote:
(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)
:lol: :lol: :lol:
You may modify the input: skip it and work with constants, then run the program a thousand times or so. That should add up to something measurable. As for the Cray, perhaps a million runs, but not in parallel! :lol:
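A sketch of that approach in Python, assuming the square-free variant of the benchmark: the inputs are constants, the whole search is repeated `runs` times, and the total is divided by the number of runs. The names `first_gap` and `time_case` are made up for illustration.

```python
import time


def first_gap(limit, min_gap):
    """Search for the first prime gap >= min_gap, trial division only."""
    loprim = hiprim = cnt = 1
    while cnt < limit and hiprim - loprim < min_gap:
        cnt += 2
        i = 3
        while i * i < cnt and cnt % i != 0:
            i += 2
        if cnt < i * i:  # no divisor found: cnt is prime
            loprim, hiprim = hiprim, cnt
    return (hiprim, loprim) if hiprim - loprim >= min_gap else None


def time_case(limit, min_gap, runs=1000):
    """Repeat one benchmark case and return (result, seconds per run)."""
    start = time.perf_counter()
    for _ in range(runs):
        result = first_gap(limit, min_gap)
    elapsed = time.perf_counter() - start
    return result, elapsed / runs


if __name__ == "__main__":
    result, per_run = time_case(1000, 20, runs=1000)
    print(result, f"{per_run * 1e3:.3f} ms per run")
```

On a system whose clock only ticks once per second, the same idea works with a coarser timer as long as `runs` is large enough for the total to span many ticks.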

EDIT(1): The Exorset ran at 2MHz, both for the main CPU and for the second CPU that controls the hard disk and the floppy drive. I am not sure, though, whether that second one was a 68B09 as well or a 68B00.
Chromatix
Posts: 1462
Joined: 21 May 2018

Re: slightly OT: a simple Benchmark

Post by Chromatix »

Copying this over from the other thread:

Noticing that there's no BBC BASIC represented - which is significant, as BBC BASIC is reputed to be very fast for an interpreted BASIC - I tried it on an emulated BBC Micro. More precisely, on the 4MHz 65C02 Second Processor that BeebEm can emulate, using HiBASIC (that is, BBC BASIC IV relocated to better use the Second Processor's extra RAM); a plain old BBC Master would be half this speed. BBC BASIC definitely does support recognising integers stored in floating-point variables, and switches to a more efficient implementation of some operators - but not all.

First I ran it with only the necessary reformatting to meet BBC BASIC's syntax rules and the addition of a built-in timer:

Code: Select all

A: 5.87
B: 9.96
C: 137.27
D: 363.96
E: 694.67
This is already comparable with some of the 16-bit CPUs, with higher clock speeds, running compiled languages, in the above table! But it was clear that the F parameters would take too long, so I skipped them.

Then I fettled it to use explicit integer variables (with % suffix) for a fairer comparison with the various compiled languages in the table. Along the way, I also cached SQR(C%) so that it didn't potentially get recalculated for every inner-loop iteration, and changed the division test to use the integer-division operator (DIV) instead of the floating-point one and a truncation. This produced a significant speedup, shaving about 40% off most of the runtimes:

Code: Select all

A: 3.56
B: 5.96
C: 78.7
D: 207.93
E: 397.85
This performance not only outstrips most of the true 16-bit machines, but it approaches the performance of a slower-clocked 6809 running what must be the output of a reasonably efficient compiler.

After a slightly more radical refactoring, I got a slight further improvement. This involved switching the outer loop from a FOR-NEXT to a REPEAT-UNTIL, swapping the DIV and multiply for a MOD, and putting the whole inner loop on one line of code.

Code: Select all

A: 3.56
B: 5.87
C: 72.92
D: 189.06
E: 357.88
Unfortunately I no longer have my original RiscPC with its 30MHz ARM610 and built-in BBC BASIC V, but I could potentially try this benchmark on a Raspberry Pi running RiscOS...
John West
Posts: 383
Joined: 03 Sep 2002

Re: slightly OT: a simple Benchmark

Post by John West »

Just for fun, I did a straight transliteration of the second Pascal version (without the sqrt) into Lua, and gave it to LuaJIT running on my work PC, which has a 3.2GHz i7-8700. It finished all versions pretty much instantly.

Averaged over a million (for A) down to a thousand (for F) iterations, the times were A: 5us, B: 10us, C: 0.2ms, D: 0.5ms, E: 0.9ms, and F: 26ms. Computers have got fast over the last 40 years.

Here's the code:

Code: Select all

parameters = {
	{ "A", 1000, 20, 1000000 },
	{ "B", 2000, 30, 100000 },
	{ "C", 9999, 35, 10000 },
	{ "D", 32000, 50, 10000 },
	{ "E", 32000, 70, 10000 },
	{ "F", 500000, 100, 1000 },
}

function prim( x )
	local step = 2
	local init = 3
	local i = init
	while (i*i < x) and (x%i ~= 0) do
		i = i + step
	end
	return x < i*i
end

for _, params in pairs(parameters) do
	local startTime = os.time()
	local range = params[2]
	local mindiff = params[3]
	local numIterations = params[4]
	local loprim, hiprim
	local prim0 = 1
	local incr = 2
	for i = 1, numIterations do
		loprim = prim0
		hiprim = prim0
		local cnt = prim0
		while (cnt < range) and (hiprim-loprim < mindiff) do
			cnt = cnt + incr
			if prim(cnt) then
				loprim = hiprim
				hiprim = cnt
			end
		end
	end
	local endTime = os.time()
	if hiprim - loprim >= mindiff then
		print( params[1], hiprim, loprim, hiprim-loprim, 1000*(endTime - startTime)/numIterations )
	else
		print( "error" )
	end
end
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: slightly OT: a simple Benchmark

Post by BigEd »

It's a pity that no languages I can think of offer a DIVMOD which can perform the single computation but return both the division and the remainder. So we end up spending an extra calculation - unavoidably, I think, although we can choose whether to spend a division-like or a multiplication-like calculation.
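For what it is worth, Python is one language that does expose this as a built-in: `divmod(a, b)` returns quotient and remainder as a pair. Whether the interpreter gets both out of a single hardware division is an implementation detail that this sketch does not assume; `is_multiple` is just an illustrative helper.

```python
def is_multiple(c, d):
    """True if d divides c, using one divmod call for quotient and remainder."""
    quotient, remainder = divmod(c, d)
    return remainder == 0


if __name__ == "__main__":
    print(divmod(907, 3))      # quotient and remainder together
    print(is_multiple(9, 3))
```

For the benchmark only the remainder matters, so the quotient is simply thrown away here.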

(I think it's fair to see the effect of integer as compared to floating point calculations, and arguably also fair to compare different ways of doing the SQRT vs squaring, or the DIV vs MOD methods, but there comes a point where you're comparing different algorithms, not different computer-and-language combinations. For example, instead of adding 2 every time, you can do the wheel thing. But this really is then a different algorithm.)
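To make "the wheel thing" concrete, here is a minimal Python sketch of a 2-3 wheel: after ruling out multiples of 2 and 3, the candidate divisors advance alternately by 2 and 4, so multiples of 3 are never tried. As said above, this is a different algorithm from the benchmark, and `is_prime_wheel` is an illustrative name only.

```python
def is_prime_wheel(x):
    """Trial division that skips all multiples of 2 and 3."""
    if x < 2:
        return False
    if x < 4:
        return True  # 2 and 3
    if x % 2 == 0 or x % 3 == 0:
        return False
    d, step = 5, 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += step        # divisors tried: 5, 7, 11, 13, 17, 19, ...
        step = 6 - step  # alternate between +2 and +4
    return True


if __name__ == "__main__":
    print([n for n in range(50) if is_prime_wheel(n)])
```

Compared with stepping by 2 every time, this tries two divisors out of every six numbers instead of three.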

I'm guessing the secret weapon of the 6809 is the multiplication instruction. Perhaps a 6502 assembly language version which uses a fast multiplication would be interesting.
Chromatix
Posts: 1462
Joined: 21 May 2018

Re: slightly OT: a simple Benchmark

Post by Chromatix »

I'm actually working on a 65c02 assembly version that avoids multiplication completely.

Code: Select all

IncRootS:
	; increment the square-root and adjust its square
	; (X+1)^2 = (X^2) + 2*X + 1
	lda rootSqLo
	clc
	adc rootLo
	bcc @NoCarry
	inc rootSqHi
@NoCarry:
	inc rootLo
	clc
	adc rootLo
	bcc RootDoneS
	inc rootSqHi

RootDoneS:
	; loop over possible divisors, odd numbers between 3 and square-root
	ldy #3
This is an example of a strength-reduction optimisation, applied twice: first in reducing a square-root to a multiply, and then in reducing a multiply to a pair of additions.
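The identity in the comment can be tried out in a few lines of Python. This is a sketch of the idea only, not a translation of the routine above; `integer_roots` and its variable names are assumptions for illustration.

```python
def integer_roots(limit):
    """Yield (c, floor(sqrt(c))) for c = 1..limit using additions only."""
    root = 1
    root_sq = 1  # invariant: root_sq == root * root
    for c in range(1, limit + 1):
        # (root + 1)^2 = root^2 + root + (root + 1)
        if root_sq + root + root + 1 <= c:
            root_sq += root  # add the old root ...
            root += 1
            root_sq += root  # ... and then the incremented one
        yield c, root


if __name__ == "__main__":
    for c, root in integer_roots(30):
        print(c, root)
```

Because the candidates only ever grow, the root needs at most one increment per step, and neither a multiplication nor a square root is required.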
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: slightly OT: a simple Benchmark

Post by BigEd »

That's a good idea! (But again, I'd suggest it's a different program. Indeed, it's also noteworthy that the original program is in part benchmarking the square root function, so an implementation like the ZXSpectrum's would be shown up - IIRC it has the merit of being coded in three bytes, but it uses log and exp, so is very slow indeed. Edit: oops, no, 7 bytes.)
Chromatix
Posts: 1462
Joined: 21 May 2018

Re: slightly OT: a simple Benchmark

Post by Chromatix »

Well, we're already running comparisons with compiled languages and incorporating the first step of the optimisation. Many compilers will automatically introduce the second step, so it's valid for hand-compilation as well.

The original code does incorporate the performance of SQR, but in an ambiguous manner - the number of times it is executed depends on whether the interpreter recalculates the limit of a FOR-NEXT loop on every iteration, or only on entry. Much more central to the algorithm is the performance of division (or rather, producing the remainder - the quotient can be discarded). C does have a remainder operator, as do several other languages, and that's all that's needed here.

My assembly implementation calculates the remainder in a way that inherently discards the quotient, takes advantage of the limited range of the operands (16 bit dividend by 8 bit divisor for cases A-E, and 24 bit dividend by 16 bit divisor for case F), and unrolls the loop over bit positions. It should turn out to be significantly faster than most reasonable language implementations.
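A generic way to get a remainder while discarding the quotient is bitwise shift-and-subtract (restoring) division. The Python sketch below illustrates that general technique for unsigned operands; it is not the assembly routine described above, and `remainder` with its `bits` parameter is an illustrative assumption.

```python
def remainder(dividend, divisor, bits=16):
    """Remainder of dividend / divisor; the quotient bits are never stored.

    Assumes 0 <= dividend < 2**bits and divisor > 0.
    """
    rem = 0
    for bit in range(bits - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        rem = (rem << 1) | ((dividend >> bit) & 1)
        # Subtract the divisor whenever it fits.
        if rem >= divisor:
            rem -= divisor
    return rem


if __name__ == "__main__":
    print(remainder(9551, 7), 9551 % 7)
    print(remainder(370373, 607, bits=24), 370373 % 607)
```

Unrolling the loop means writing out one copy of the body per bit position, which removes the loop counter and branch overhead at the cost of code size.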
Last edited by Chromatix on Tue Jul 03, 2018 12:31 pm, edited 1 time in total.
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England

Re: slightly OT: a simple Benchmark

Post by BigEd »

I suppose this is where benchmarking of compiled languages gets very sticky - fine for comparing compilers, but it becomes difficult to compare CPUs if forced to use different compilers.

(I don't mean to be pedantic or super-strict about threads, but it feels like we've got two investigations going on, one which is to compare computers running a given calculation and another which is to compare ways of computing the prime number stats. It may be that a thread can happily support two interleaved ideas. I can think of no happier combination than 6502 and prime numbers! Unless it's 6502 and pi.)
Chromatix
Posts: 1462
Joined: 21 May 2018

Re: slightly OT: a simple Benchmark

Post by Chromatix »

If we were to consider algorithmic optimisations to reduce the number of divisions required, an obvious one is to keep a list of primes discovered and iterate over those when testing for primality. There are fewer than 200 primes below 1024, so a 512-byte array would be sufficient for testing 20-bit primes and could result in a 4x speedup near the upper end of that range. With slightly less indexing convenience, 24-bit primes can also easily be accommodated in a 16-bit address space.
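A minimal Python sketch of that idea, assuming the list of odd primes is built up on the fly and only primes up to the square root of the candidate are tried. `odd_primes` is a hypothetical name, and the sketch makes no claim about the actual speedup.

```python
def odd_primes(limit):
    """Return all odd primes <= limit, dividing only by primes found so far."""
    primes = []
    for c in range(3, limit + 1, 2):
        is_prime = True
        for p in primes:
            if p * p > c:
                break  # no prime divisor up to sqrt(c): c is prime
            if c % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(c)
    return primes


if __name__ == "__main__":
    table = odd_primes(1023)
    print(len(table), "odd primes below 1024")
```

Stored as 16-bit values, a table of the primes below 1024 fits comfortably in the 512 bytes mentioned above.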