All times are UTC

Posted: Mon Jul 02, 2018 11:39 pm
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
A question about execution speed comparisons (http://forum.6502.org/viewtopic.php?f=5&p=60986#p60969) prompted me to present a table in which I have recorded the execution times of various computers running a simple program. To avoid cluttering the original topic, I have moved it here.
Code:
                                    B A S I C - B E N C H M A R K  (from BASBENCH.TAB, 28.08.93)
                                    =============================

┌───────────────────┬──────────────────────────────╥─────────────────────┬────────────────────────────┐
│       INPUT       │           RESULTS            ║           INPUT     │           RESULTS          │
│A:    1000 ,  20   │    907  ,      887  ,    20  ║  D:   32000 ,  50   │  19661  ,    19609  ,    52│
│B:    2000 ,  30   │   1361  ,     1327  ,    34  ║  E:   32000 ,  70   │  31469  ,    31397  ,    72│
│C:    9999 ,  35   │   9587  ,     9551  ,    36  ║  F:  500000 , 100   │ 370373  ,   370261  ,   112│
└───────────────────┴──────────────────────────────╨─────────────────────┴────────────────────────────┘
Execution times of various computers/programs (see the listings where applicable; all times in seconds, with a decimal comma):
──────────────────────────────────────────────────────────────────────────────────────────────────────
Computer/CPU type  Language          Listing               A       B       C       D       E        F
-----------------  ----------------  -----------------  -----  ------  ------  ------  ------   ------
SYM (6502,1MHz)    Basic1.1          BasBenc1            46,7    75,1   812,7  2014,0  3696,1   ____,_
CBM 3032 (-"-)     Rom-Basic         BasBenc1            48,0    80,0  ____,_  ____,_  ____,_   ____,_
Badge (6502, 2MHz) VTL02C            ~Basic,UINT,MOD      9,6    16,3   222,8   590,0  1126,6
       "           EhBasic V2.22     BasBenc1            11,6    19,3   252,6   662,4  1257,3
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
APPLE II           Rom-Basic         BasBenc1            44,8    71,8   777,0  ____,_  ____,_   ____,_
       "           Woz Basic         ~VTL02              22,5    38,5   526,0
Apple //e, 128KB   Plasma2.0 (JIT)   ~C(V2) (int)         5,72    9,3   118,36
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Atari 800XL        FastBasic_3.5     ~C(V2)(modulo)       5,3     9,0   123,3   326,1   620,4           THX 2 dsmc (post#63001)
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
C64 (6502, 1MHz)   Basic             BasBenc1            48,0   118,0  1412,0  3516,0 10447,0  45819,0  THX 2 Jeff_Birt
       "           DurexForth        very close to Basic 18,0    30,0   374,0  1530,0  2352,0           see post #62871
       "           DurexForth        -"- + sqrt2         18,0    26,0   334,0   861,0  1615,0           see post #63055
       "           DurexForth        -"- + sqrt2 in asm  12,0    19,0   276,0   740,0  1416,0           
       "           DurexForth        ~C(V2)(modulo..)    11,0    18,0   242,0   637,0  1209,0           
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
65C02 homebrew     dflat  Basic       int, modulo         3,95    6,51   81,61                          - 65C02 @ 5,36MHz; THX 2 dolomiah
65C02 pocket       EhBasic 2.22p4C    BasBenc1            2,78    4,68   61,92  162,51  308,33          - 65C02 @ 8Mhz; THX 2 floobydust
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
OSI 600(Superbrd.) OSI Basic V1.0 (MS)   "               25,3    52,1   642,4                           - 6502 @ 0.9825Mhz
Jaguar (homebrew)  EhBasic V2.22         "                1,46    2,42   31,40   82,29  156,06          - W65C02 @ 16MHz
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
BBC (65C02, 4MHz)  HiBASIC (IV)      ~BasBenc1            5,87    9,96  137,27  363,96  694,67
       "           running on 2.µP   +% cached SQR()      3,56    5,96   78,7   207,93  397,85
B-Em (6502, 4MHz)  Assembler         ~Pascal V2           0,09    0,15    2,54    7,70   17,53   856,79 ~ var.len.div., THX 2 Chromatix
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
65020 (5MHz)       Assembler         ~Pascal V2           0,039   0,066   0,92    2,44    4,64   140,05
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
MC68000 (4MHz)     Assembler         Assembler V1.0 int   0,x     0,x     3,9     9,9    18,5   ____,_
       "                  "          Assembler V1.1 int   0,x     0,x     3,0     7,6    14,1   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
TRS-80 M 1 (Z80)   Level2-Basic      BasBenc1            68,7   112,1  ____,_  ____,_  ____,_   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
ExorSet 163        Basic09           ≈ Pascal V1.0       10,5    16,5   183,8  ____,_  ____,_   ____,_
CoCo3 (0,89MHz)    Disk Ext.Basic1.1 BasBenc1            60,6    97,9  1090,1
  "   (1.78Mhz)         "                "               30,4    49,0   545,1
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
EurocomII (6809)   Basic V3.1        BasBenc1            18,3    30,6   403,2  ____,_  ____,_   ____,_
E II V7 (1.34MHz)  Extended Basic    BasBenc1            55,6    93,4  ____,_  ____,_  ____,_   ____,_
       "           ( TSC GBAS )      integer variables   42,1    69,8  ____,_  ____,_  ____,_   ____,_
       "           OmegaSoft PASCAL  see Listing V1.0     5,2     8,2    81,2   196,8   352,0   ____,_
       "           Compiler V 1.10   see Listing V2.0     3,4     5,5    67,5   173,2   323,1   ____,_
       "           Windrush C Comp.  V2.0 (int)           2,4     4,1    51,0   131,8   245,8   ____,_
       "           noStkchk+optimize V2.0 (long)         10,8    18,3   249,5   654,6  1236,1   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Amiga 2000 (68K)   AmigaBasic V      BasBenc1, ShortFP    9,3    15,3   190,5  ____,_  ____,_   ____,_
       "                  "          BasBenc1, Integer    8,0    13,0   158,1  ____,_  ____,_   ____,_
       "           Lattice-C V3.10   C-Listing V1 ieee    6,4    11,0   152,7   401,8   784,0   ____,_
       "                  "                       ffp     1,8     2,8    35,7    94,4   179,1   5740,0
       "                  "          C-Listing V2 long    0,x     0,x     6,6    16,8    31,7   1094,2
       "           AM-Modula-2 V3.1  Modula2-Vers. 1.0    1,2     1,8    12,5    27,0  ____,_   ____,_
       "                  "          Modula2-Vers. 1.1    1,3     1,9    13,7    30,2    52,3   1013,1
       "                  "          Modula2-Vers. 2.0    0,x     0,x     8,2    20,8    39,3   1129,2
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
PC-XT              TurboPascal 3.02  Pascal V1.0 (16Bit)  2,4     3,5    25,6    51,9  ____,_   ____,_
(8088,10 MHz)             "          Pascal V2.0 (16Bit)  0,x     0,x     4,2    10,5  ____,_   ____,_
       "           TurboPascal 4.0   Pascal V1.0 (16Bit)  2,0     3,0    20,6    41,7  ____,_   ____,_
(V20 , 10 MHz)            "                   "           1,9     2,7    19,1    38,2    65,2   ____,_
(8088, 10 MHz)            "          Pascal V2.0 (16Bit)  0,x     0,x     3,4     8,3  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x     2,1     5,0     9,1   ____,_
(8088, 10 MHz)            "          Pascal V2.0 (32Bit)  0,x     0,x  ____,_    68,7  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x    23,9    63,0   119,9   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
(8088, 10 MHz)     Turbo-C V1.5      C-Listing V1.0       5,8     9,9   137,2  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           5,0     8,6   118,7  ____,_  ____,_   ____,_
(8088, 10 MHz)            "          C-Listing V2.0 long  0,x     0,x     8,5    21,9  ____,_   ____,_
(V20 , 10 MHz)            "                   "           0,x     0,x     7,0    18,0    33,8   3602,0
       "                  "          C-Listing V2.0 int   0,x     0,x     1,2     2,8     5,1   ____,_
(Inboard-PC,16MHz)        "          C-Listing V2.0 long                                  9,7    976,2
(      "         )        "          ---"--- + 286-Option                                 9,4   1019,4 !!
(      "         )        "          C V3.0 long, w/o 286                                 9,2    982,8
(      "         )        "          ---"---, w/ 286, wordaligned                         9,3    986,8
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
(8088, 10 MHz)     GW-Basic V3.22    BasBenc1, ShortFP    9,3    15,3   195,7  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           8,7    14,4   183,8  ____,_  ____,_   ____,_
(8088, 10 MHz)     GW-Basic V3.22    BasBenc1, Integer    7,7    12,6   156,5  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "                   "           7,3    12,0   148,4  ____,_  ____,_   ____,_
(V20 , 10 MHz)            "          40 C% mod D% = 0 ?   6,1     9,7   115,7  ____,_  ____,_   ____,_
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
80552 (11.06MHz)   MI-C8051 V-5x218  C-Listing V2.0 int   2,1     3,5    46,9   125,3   239,6   ____,_
       "                  "          C-Listing V2.0 long  4,8     8,1   113,8   304,4   583,1  18210,0
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
T800 (20MHz)       Occam V.0.88      Occam V.1.0 (32Bit)  0,x     0,x     0,3     0,8     1,5     45,5
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
386DX40 128K Cache Quick C 1.0       BasBenc2.C (long)    all optimizations for speed     3,1     84,9
  + CoPro               "            BasBenc1.C (float)                                   8,2    240,6
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
RasPi 2B (0,9GHz)  Python 3.4.2      prime2.py            0,01    0,02    0,21    0,53    0,93    28,48
       "           pypy3-2.4.0            "                                               0,14     1,45
       "           C-Cortex-A7       (Chromatix)                          0,002   0,004   0,007    0,183
       "           gcc                    "                               0,005   0,013   0,025    0,804
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
Arduino Nano V3    Sketch V1.8.5     ~C(V2) int           0,032   0,055   0,80    2,15    4,11
(16MHz ATmega328)                    ~C(V2) long          0,094   0,163   2,33    6,38   11,91   366,4
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
PC,i7-8700,3.2GHz  LuaJIT            ~PASCAL(V2)          0,005   0,010   0,2     0,5     0,9     26,0 <= ms !!!
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙
MicroVAX 3100/90   GCC for VAX       C(V2) (int32)        2,48    4,17   53,61  139,26  262,22  7653,1 <= ms !!!
       "                             C(V2) (int64)       13,24   22,61  312,63  826,16 1570,51 46712,3 <= ms !!!
∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙∙





The table dates back to around 1980/81, when I was playing with a SYM-1 with 4K RAM and Synertek BASIC (= Microsoft). The benchmark program, "BasBenc1", simply searches for prime number gaps:
Code:
  1 REM
  2 REM         B A S I C - B E N C H
  3 REM
  4 REM The program tests whether the range [3..A] of the natural
  5 REM numbers contains two consecutive primes P1 < P2 whose
  6 REM difference is greater than or equal to B ( P2 - P1 >= B ).
  7 REM
  8 REM
 10 ZS = 3 : INPUT A,B
 20 FOR C = 3 TO A STEP 2
 30 FOR D = 3 TO SQR(C) STEP 2
 40 IF INT(C/D)*D = C THEN 80
 50 NEXT D
 60 IF C-ZS >= B THEN PRINT C,ZS,C-ZS : GOTO 10
 70 ZS = C
 80 NEXT C
 90 PRINT " NO SOLUTION " : GOTO 10
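
As a cross-check on the expected results in the box at the top of the table, here is a minimal Python sketch of the same search. It is not one of the original listings: the names `is_odd_prime` and `first_gap` are made up for illustration, and it tests `d*d <= x` instead of calling a square root.

```python
def is_odd_prime(x):
    """Trial division of an odd x >= 3 by odd divisors d with d*d <= x."""
    d = 3
    while d * d <= x:
        if x % d == 0:
            return False
        d += 2
    return True


def first_gap(limit, min_gap):
    """Return (p2, p1, p2 - p1) for the first consecutive odd primes in
    [3..limit] with p2 - p1 >= min_gap, or None if there is no such pair."""
    prev = 3                          # plays the role of ZS in the BASIC listing
    for c in range(5, limit + 1, 2):  # odd candidates only, like STEP 2
        if is_odd_prime(c):
            if c - prev >= min_gap:
                return c, prev, c - prev
            prev = c
    return None


if __name__ == "__main__":
    # Inputs A, B and C from the table header
    for limit, min_gap in ((1000, 20), (2000, 30), (9999, 35)):
        print(limit, min_gap, first_gap(limit, min_gap))
```

The three calls reproduce rows A, B and C of the results box (907/887/20, 1361/1327/34 and 9587/9551/36).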

I tried to gain more speed using the ability of MS-Basic to deal with integers instead of floats:
Code:
1  REM Basic-Bench à la SYM etc.
2  REM **** Integer version ****
10 ZS% = 3: INPUT A%,B%
20 FOR C% = 3 TO A% STEP 2
30 FOR D% = 3 TO SQR(C%) STEP 2
40 IF (C%\D%)*D% = C% THEN 80
50 NEXT D%
60 IF C%- ZS% >= B% THEN PRINT C%,ZS%,C%-ZS%: GOTO 10
70 ZS% = C%
80 NEXT C%
90 PRINT "no solution found!": GOTO 10

Using modulo instead of (C%\D%)*D% was the last step:
Code:
1  REM Basic-Bench à la SYM etc.
2  REM *** Integer version + modulo function ***
10 ZS% = 3: INPUT A%,B%
20 FOR C% = 3 TO A% STEP 2
30 FOR D% = 3 TO SQR(C%) STEP 2
40 IF C% MOD D% = 0 THEN 80
50 NEXT D%
60 IF C%- ZS% >= B% THEN PRINT C%,ZS%,C%-ZS%: GOTO 10
70 ZS% = C%
80 NEXT C%
90 PRINT "no solution found!": GOTO 10

On a 6809 system running FLEX-9 I could try more languages: Basic, Pascal, and C. The first Pascal attempt was a straight translation:
Code:
program BENCH1( INPUT,OUTPUT );

{ This program is the Pascal version of the Basic-Bench.
  It tests whether the range 3 to A contains two consecutive
  primes (p1,p2) such that p2-p1 >= B }

const prim0 = 3; incr = 2;
var range,mindiff,prim1,prim2,cnt1,cnt2,limit1 : integer;
    flag1 : boolean;

begin { main }

range := 0; mindiff := 0; prim1 := prim0; prim2 := prim1; cnt1 := prim0;

while range <= prim0 do begin
  writeln; write(' Enter upper limit and minimum difference : ');
  readln( range,mindiff ); writeln
  end;

while (cnt1 <= range) and (prim2-prim1 < mindiff) do begin
  limit1 := round(sqrt(cnt1));
  flag1 := true; cnt2 := prim0;
  while (cnt2 <= limit1) and flag1 do begin
    flag1 := not( cnt1 mod cnt2 = 0 );
    cnt2 := cnt2 + incr
    end;
  if flag1 then begin
    prim1 := prim2 ; prim2 := cnt1
    end;
  cnt1 := cnt1 +incr
  end;

if prim2-prim1 >= mindiff
  then writeln(' Result : ',prim2:8,prim1:8,prim2-prim1:8)
  else writeln(' no solution found ')

end.

Then I discovered a method to get around using SQRT():
Code:
program BENCH2( INPUT,OUTPUT );

{ This program is the Pascal version of the Basic-Bench.
  It tests whether the range 3 to A contains two consecutive
  primes (p1,p2) such that p2-p1 >= B .
  Unlike version BENCH1, this one is written in idiomatic Pascal. }

const prim0 = 1; incr = 2;
var range,mindiff,cnt,hiprim,loprim : integer;

function prim(x : integer): boolean;
const step =2; init = 3;
var i : integer;
begin
  i := init;
  while (( i*i < x ) and ( x mod i <> 0 )) do
    i := i + step;
  prim :=  x < i*i ;
end; { of prim }

begin { main }

  writeln; write(' Enter upper limit and minimum difference : ');
  readln( range,mindiff ); writeln;

  loprim := prim0;
  hiprim := prim0;
  cnt := prim0;

  while (cnt < range) and (hiprim-loprim < mindiff) do begin
    cnt := cnt + incr;
    if prim(cnt) then begin
      loprim := hiprim;
      hiprim := cnt
    end;
  end;

  writeln;
  if hiprim - loprim >= mindiff
  then writeln(' Result : ',hiprim:8,loprim:8,hiprim-loprim:8)
  else writeln(' no solution found ')

end.

The C code using floats is lost, but it should correspond to the first Pascal version. The V2 variant using integers:
Code:
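#include <stdio.h>
#include <stdlib.h>   /* atoi() */
/* #include <math.h> */

int getnum()
{ char s[80];
  if (fgets(s, sizeof s, stdin) == NULL)   /* fgets() instead of the unsafe gets() */
    return(0);
  return(atoi(s));
}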

int prim(x) /* if x is prime return 1 else 0 */
int x;
{ int i = 3;
  while ((i*i < x) && (x % i) != 0)
    i+=2;
  if (x < i*i) return(1);
  else         return(0);
}

void main()
{ int a,b,i,loprim,hiprim;
  printf("\nEnter the upper limit :");
  a = getnum();
  printf("\nEnter the minimum difference :");
  b = getnum();
  loprim = hiprim = 1;
  for (i=3; i <= a && (hiprim-loprim < b); i+=2)
  { if (prim(i))
    { loprim = hiprim; hiprim = i;
    }
  };
  if (hiprim-loprim < b)
    printf("no solution found!\n");
  else
    printf("%12d %12d \n",hiprim,loprim);
}

Using longs:
Code:
#include <stdio.h>
#include <ctype.h>
/* #include <math.h> */

long getnum()
{ char c;
  long res=0;
  while(isdigit(c=getchar()))
    res = res*10+(c & 15);
  return(res);
}

int prim(x) /* if x is prime return 1 else 0 */
long x;
{ long i = 3;
  while ((i*i < x) && (x % i) != 0)
    i+=2;
  if (x < i*i) return(1);
  else         return(0);
}

void main()
{ long a,b,i,loprim,hiprim;
  printf("\nEnter the upper limit :");
  a = getnum();
  printf("\nEnter the minimum difference :");
  b = getnum();
  loprim = hiprim = 1;
  for (i=3; i <= a && (hiprim-loprim < b); i+=2)
  { if (prim(i))
    { loprim = hiprim; hiprim = i;
    }
  };
  if (hiprim-loprim < b)
    printf("no solution found!\n");
  else
    printf("%12ld %12ld \n",hiprim,loprim);
}

The most recent version was written in VTL02C (thanks to Mike B. and Klaus):
Code:
100 ?="Range = ";
105 A=?
110 ?="min.Diff = ";
115 B=?
120 Z=3
130 C=1
140 C=C+2
150 D=1
160 D=D+2
165 X=C/D
170 #=%=0*210
180 #=C>(D*D)*160
190 #=C>(B+Z)*230
200 Z=C
210 #=A>C*140
220 ?="NO SOLUTION"
225 #=260
230 ?=C
240 ?=", ";
250 ?=Z
260 ?=""
270 #=100


If any of you run one or more of the given programs, or perhaps a Forth version (!), with the adaptations needed to get it running, please post your results and I will add entries to the table from time to time.


Cheers
Arne


Last edited by GaBuZoMeu on Sun Nov 18, 2018 4:46 pm, edited 12 times in total.

Posted: Tue Jul 03, 2018 12:09 am
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
(reserved)


Posted: Tue Jul 03, 2018 2:02 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Thanks for the great chart, sir! I needed a little help with the German, but Google has been patiently helpful.

That RasPi is a potent little unit, eh?

Was the ExorSet 163 a 4MHz 6809 system?

Mike B.


Posted: Tue Jul 03, 2018 2:11 am
Joined: Thu Jan 21, 2016 7:33 pm
Posts: 282
Location: Placerville, CA
Hmm, now I just need to figure out how to time this quasi-accurately on NetBSD/vax...no benchmark is complete without a VAX result, after all :D


Posted: Tue Jul 03, 2018 2:39 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Quote:
...no benchmark is complete without a VAX result, after all :D

Agreed. It would be nice to have a Cray-1 result as well.

I want to translate the VTL02 version to optimized 65c802 and 68hc12 (and maybe even 65m32a) assembly, but I have a dozen other things on my plate at the moment ... I'll get to them eventually, if someone more capable doesn't beat me to them.

Mike B.


Posted: Tue Jul 03, 2018 3:33 am
Joined: Thu Jan 21, 2016 7:33 pm
Posts: 282
Location: Placerville, CA
(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)


Posted: Tue Jul 03, 2018 6:45 am
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
commodorejohn wrote:
(Drat...even the POSIX time functions don't track to sub-second accuracy on NetBSD/vax 6.1.5, which means that all but the last one or two example values are effectively unmeasurable...)

:lol: :lol: :lol:
You can modify the input: skip it and work with constants, then run the program a thousand times or so. That should add up to something measurable. As for the Cray, perhaps a million runs, but not in parallel! :lol:

EDIT(1): The Exorset ran at 2MHz, both the main CPU and the one that controls the hard disk and the floppy drive. I am not sure, however, whether the second one was also a 68B09 or a 68B00.
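
The repeat-and-average idea above could look roughly like this in Python. This is only a sketch: `time_per_run` and `workload` are hypothetical names, and `time.perf_counter()` stands in for whatever clock the target system offers.

```python
import time


def time_per_run(func, repeats):
    """Run func() `repeats` times and return the average seconds per run."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    elapsed = time.perf_counter() - start
    return elapsed / repeats


def workload():
    """Stand-in for one benchmark run with constant inputs (no INPUT statement)."""
    total = 0
    for c in range(3, 1000, 2):
        total += c % 7
    return total


if __name__ == "__main__":
    print("seconds per run:", time_per_run(workload, 1000))
```

With a clock that only ticks once per second, the repeat count has to be large enough that the total runs for many seconds; otherwise the rounding error dominates the average.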


Posted: Tue Jul 03, 2018 9:42 am
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Copying this over from the other thread:

Noticing that there's no BBC BASIC represented - which is significant, as BBC BASIC is reputed to be very fast for an interpreted BASIC - I tried it on an emulated BBC Micro. More precisely, on the 4MHz 65C02 Second Processor that BeebEm can emulate, using HiBASIC (that is, BBC BASIC IV relocated to better use the Second Processor's extra RAM); a plain old BBC Master would be half this speed. BBC BASIC definitely does support recognising integers stored in floating-point variables, and switches to a more efficient implementation of some operators - but not all.

First I ran it with only the necessary reformatting to meet BBC BASIC's syntax rules and the addition of a built-in timer:
Code:
A: 5.87
B: 9.96
C: 137.27
D: 363.96
E: 694.67

This is already comparable with some of the 16-bit CPUs, with higher clock speeds, running compiled languages, in the above table! But it was clear that the F parameters would take too long, so I skipped them.

Then I fettled it to use explicit integer variables (with % suffix) for a fairer comparison with the various compiled languages in the table. Along the way, I also cached SQR(C%) so that it didn't potentially get recalculated for every inner-loop iteration, and changed the division test to use the integer-division operator (DIV) instead of the floating-point one and a truncation. This produced a significant speedup, shaving about 40% off most of the runtimes:
Code:
A: 3.56
B: 5.96
C: 78.7
D: 207.93
E: 397.85
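
To make the two changes concrete, here is a rough Python analogue of the inner loop (not the BBC BASIC source, which isn't shown here). The square root is computed once per candidate, and the divisibility test uses integer division and a multiply. `has_odd_divisor` is a made-up name, and `math.isqrt` (Python 3.8+) plays the role of the cached SQR(C%).

```python
import math


def has_odd_divisor(c):
    """True if some odd d in 3..floor(sqrt(c)) divides c."""
    limit = math.isqrt(c)        # computed once, outside the inner loop
    d = 3
    while d <= limit:
        if (c // d) * d == c:    # integer DIV and multiply, no float truncation
            return True
        d += 2
    return False


if __name__ == "__main__":
    for c in (887, 899, 907):
        print(c, "composite" if has_odd_divisor(c) else "prime")
```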

This performance not only outstrips most of the true 16-bit machines, but it approaches the performance of a slower-clocked 6809 running what must be the output of a reasonably efficient compiler.

After a slightly more radical refactoring, I got a slight further improvement. This involved switching the outer loop from a FOR-NEXT to a REPEAT-UNTIL, swapping the DIV and multiply for a MOD, and putting the whole inner loop on one line of code.
Code:
A: 3.56
B: 5.87
C: 72.92
D: 189.06
E: 357.88


Unfortunately I no longer have my original RiscPC with its 30MHz ARM610 and built-in BBC BASIC V, but I could potentially try this benchmark on a Raspberry Pi running RiscOS...


Posted: Tue Jul 03, 2018 10:00 am
Joined: Tue Sep 03, 2002 12:58 pm
Posts: 336
Just for fun, I did a straight transliteration of the second Pascal version (without the sqrt) into Lua, and gave it to LuaJIT running on my work PC, which has a 3.2GHz i7-8700. It finished all versions pretty much instantly.

Averaged over a million (for A) down to a thousand (for F) iterations, the times were A: 5us, B: 10us, C: 0.2ms, D: 0.5ms, E: 0.9ms, and F: 26ms. Computers have got fast over the last 40 years.

Here's the code:
Code:
-- Each entry: case name, range (A), minimum gap (B), number of timing iterations
parameters = {
   { "A", 1000, 20, 1000000 },
   { "B", 2000, 30, 100000 },
   { "C", 9999, 35, 10000 },
   { "D", 32000, 50, 10000 },
   { "E", 32000, 70, 10000 },
   { "F", 500000, 100, 1000 },
}

function prim( x )
   local step = 2
   local init = 3
   local i = init
   while (i*i < x) and (x%i ~= 0) do
      i = i + step
   end
   return x < i*i
end

for _, params in ipairs(parameters) do   -- ipairs walks the cases in order A..F
   local startTime = os.time()           -- whole-second resolution, hence the many iterations
   local range = params[2]
   local mindiff = params[3]
   local numIterations = params[4]
   local loprim, hiprim
   local prim0 = 1
   local incr = 2
   for i = 1, numIterations do
      loprim = prim0
      hiprim = prim0
      local cnt = prim0
      while (cnt < range) and (hiprim-loprim < mindiff) do
         cnt = cnt + incr
         if prim(cnt) then
            loprim = hiprim
            hiprim = cnt
         end
      end
   end
   local endTime = os.time()
   if hiprim - loprim >= mindiff then
      print( params[1], hiprim, loprim, hiprim-loprim, 1000*(endTime - startTime)/numIterations )
   else
      print( "error" )
   end
end


Posted: Tue Jul 03, 2018 11:40 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
It's a pity that no languages I can think of offer a DIVMOD which can perform the single computation but return both the division and the remainder. So we end up spending an extra calculation - unavoidably, I think, although we can choose whether to spend a division-like or a multiplication-like calculation.
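
For what it's worth, a few languages do expose a combined operation at the language level; Python's built-in `divmod` is one. Whether the underlying implementation really performs a single division is an implementation detail that this sketch does not depend on. `is_divisible` is a made-up helper name.

```python
def is_divisible(c, d):
    """Return True if d divides c, using one combined divide/remainder call."""
    quotient, remainder = divmod(c, d)   # both results from a single call
    return remainder == 0


if __name__ == "__main__":
    print(divmod(907, 29))        # quotient and remainder together
    print(is_divisible(899, 29))  # 899 = 29 * 31
```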

(I think it's fair to see the effect of integer as compared to floating point calculations, and arguably also fair to compare different ways of doing the SQRT vs squaring, or the DIV vs MOD methods, but there comes a point where you're comparing different algorithms, not different computer-and-language combinations. For example, instead of adding 2 every time, you can do the wheel thing. But this really is then a different algorithm.)

I'm guessing the secret weapon of the 6809 is the multiplication instruction. Perhaps a 6502 assembly language version which uses a fast multiplication would be interesting.


Posted: Tue Jul 03, 2018 11:48 am
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
I'm actually working on a 65c02 assembly version that avoids multiplication completely.
Code:
IncRootS:
   ; increment the square-root and adjust its square
   ; (X+1)^2 = (X^2) + 2*X + 1
   lda rootSqLo
   clc
   adc rootLo
   bcc @NoCarry
   inc rootSqHi
@NoCarry:
   inc rootLo
   clc
   adc rootLo
   bcc RootDoneS
   inc rootSqHi

RootDoneS:
   ; loop over possible divisors, odd numbers between 3 and square-root
   ldy #3

This is an example of a strength-reduction optimisation, applied twice: first in reducing a square-root to a multiply, and then in reducing a multiply to a pair of additions.
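
The same two-step reduction can be sketched in Python, using only additions and comparisons. This is an illustration of the identity (X+1)^2 = X^2 + X + (X+1) from the comment in the fragment above, not a translation of the full assembly routine; `increment_root` and `isqrt_by_addition` are made-up names.

```python
def increment_root(root, root_sq):
    """Advance root by one and update its square with two additions:
    (X+1)^2 = X^2 + X + (X+1)."""
    root_sq += root      # add the old root
    root += 1
    root_sq += root      # add the new root
    return root, root_sq


def isqrt_by_addition(n):
    """floor(sqrt(n)) for n >= 0 without multiplying or taking a root."""
    root, root_sq = 0, 0
    # (root+1)^2 equals root_sq + root + root + 1
    while root_sq + root + root + 1 <= n:
        root, root_sq = increment_root(root, root_sq)
    return root


if __name__ == "__main__":
    for n in (8, 9, 10, 32000):
        print(n, isqrt_by_addition(n))
```

In the benchmark the candidate only ever grows, so the root and its square can be carried along from one candidate to the next rather than rebuilt from zero each time.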


Posted: Tue Jul 03, 2018 12:02 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
That's a good idea! (But again, I'd suggest it's a different program. Indeed, it's also noteworthy that the original program is in part benchmarking the square root function, so an implementation like the ZX Spectrum's would be shown up - IIRC it has the merit of being coded in three bytes, but it uses log and exp, so is very slow indeed. Edit: oops, no, 7 bytes.)


Posted: Tue Jul 03, 2018 12:04 pm
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Well, we're already running comparisons with compiled languages and incorporating the first step of the optimisation. Many compilers will automatically introduce the second step, so it's valid for hand-compilation as well.

The original code does incorporate the performance of SQR, but in an ambiguous manner - the number of times it is executed depends on whether the interpreter recalculates the limit of a FOR-NEXT loop on every iteration, or only on entry. Much more central to the algorithm is the performance of division (or rather, producing the remainder - the quotient can be discarded). C does have a remainder operator, as do several other languages, and that's all that's needed here.

My assembly implementation calculates the remainder in a way that inherently discards the quotient, takes advantage of the limited range of the operands (16 bit dividend by 8 bit divisor for cases A-E, and 24 bit dividend by 16 bit divisor for case F), and unrolls the loop over bit positions. It should turn out to be significantly faster than most reasonable language implementations.
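
A remainder that "inherently discards the quotient" can be illustrated with a plain shift-and-subtract division in which the quotient bits are simply never recorded. The Python below is only a sketch of that idea under my own assumptions (the name `remainder_only` and the `bits` parameter are made up); it keeps the loop over bit positions as a loop, whereas the assembly version described above unrolls it.

```python
def remainder_only(dividend, divisor, bits=16):
    """Return dividend % divisor for 0 <= dividend < 2**bits and divisor > 0,
    using shifts, compares and subtractions only."""
    rem = 0
    for bit in range(bits - 1, -1, -1):
        # shift the next dividend bit into the running remainder
        rem = (rem << 1) | ((dividend >> bit) & 1)
        if rem >= divisor:
            rem -= divisor   # a quotient bit would be set here; it is dropped
    return rem


if __name__ == "__main__":
    print(remainder_only(907, 29))               # 16-bit dividend, 8-bit divisor
    print(remainder_only(370261, 607, bits=24))  # 24-bit dividend, 16-bit divisor
```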


Last edited by Chromatix on Tue Jul 03, 2018 12:31 pm, edited 1 time in total.

Posted: Tue Jul 03, 2018 12:26 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I suppose this is where benchmarking of compiled languages gets very sticky - fine for comparing compilers, but it becomes difficult to compare CPUs if forced to use different compilers.

(I don't mean to be pedantic or super-strict about threads, but it feels like we've got two investigations going on, one which is to compare computers running a given calculation and another which is to compare ways of computing the prime number stats. It may be that a thread can happily support two interleaved ideas. I can think of no happier combination than 6502 and prime numbers! Unless it's 6502 and pi.)


Posted: Tue Jul 03, 2018 1:10 pm
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
If we were to consider algorithmic optimisations to reduce the number of divisions required, an obvious one is to keep a list of primes discovered and iterate over those when testing for primality. There are fewer than 200 primes below 1024, so a 512-byte array would be sufficient for testing 20-bit primes and could result in a 4x speedup near the upper end of that range. With slightly less indexing convenience, 24-bit primes can also easily be accommodated in a 16-bit address space.
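
A minimal Python sketch of the stored-primes idea, assuming only odd candidates are ever tested (as in the benchmark). The names `is_prime_with_table`, `odd_primes_below` and `SMALL_PRIMES` are made up for illustration.

```python
def is_prime_with_table(c, primes):
    """Trial-divide an odd c >= 3 by the stored odd primes p with p*p <= c."""
    for p in primes:
        if p * p > c:
            break
        if c % p == 0:
            return False
    return True


def odd_primes_below(limit):
    """Build the table incrementally, reusing it to test each new candidate."""
    primes = []
    for c in range(3, limit, 2):
        if is_prime_with_table(c, primes):
            primes.append(c)
    return primes


# Enough to test any odd candidate below 1024 * 1024 (20 bits)
SMALL_PRIMES = odd_primes_below(1024)

if __name__ == "__main__":
    print(len(SMALL_PRIMES), "odd primes below 1024")
    print(is_prime_with_table(370261, SMALL_PRIMES))
```

The table holds 171 odd primes, so at two bytes each it fits comfortably in the 512-byte array mentioned above.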

