6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Thu Nov 21, 2024 4:52 pm

All times are UTC




Post new topic Reply to topic  [ 210 posts ]  Go to page Previous  1 ... 9, 10, 11, 12, 13, 14  Next
Author Message
PostPosted: Sun Jul 15, 2018 7:39 pm 
Offline
User avatar

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Hmm, aren't there some initializations necessary? Else C% would be set to 0 and D%=DIVS%(1)=0 would cause a DIV_BY_0 error. :?:
Once a prime is found it should went into DIVS%(J) I assume?


Cheers,
Arne


Top
 Profile  
Reply with quote  
PostPosted: Sun Jul 15, 2018 7:50 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Yes, some details were omitted for brevity. See my latest C code for how to handle the DIVS% array.

To initialise the wheel array:
Code:
J%=1
FOR I%=1 TO 2*3*5*7 STEP 2
IF (I% MOD 3)*(I% MOD 5)*(I% MOD 7) THEN WHEEL%(J%)=I% : J%=J%+1
NEXT


Top
 Profile  
Reply with quote  
PostPosted: Sun Jul 15, 2018 11:33 pm 
Offline
User avatar

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
OK, here are the numbers (relativ to the "classic" approach):

MS-Basic
A: 33,1 = 70,9%
B: 50,3 = 67,0%
C: 546,6 = 67,3%
D: 1377,1 = 68,4%

EhBasic:
A: 14,4 = 124%
B: 22,2 = 115%
C: 240,2 = 95,1%
D: 604,4 = 91,2%
E: 1107,8 = 88,1%

At least this elaborate approach yields an effect even with EhBasic. The results A and B are victims of the tables setup which was done every run.


Top
 Profile  
Reply with quote  
PostPosted: Mon Jul 16, 2018 6:29 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Having implemented the wheel optimisation in C, it's making very little practical difference on my Raspberry Pi. There *is* a speedup, but it's only significant at the small end of the range, which the Pi can get through in less than a second anyway. Once we get into the hundreds of millions, we're saving a few percent at best - though the code never seems to be a liability. A larger wheel would consume more memory and provide only a slight further improvement on large candidates.

The reason for this is that, while we're down to 48 from 105 candidates per 210-interval, and the remainder have three trial divisions eliminated for free, most of the time is spent on candidates which either turn out to *be* prime, or which have only relatively large prime factors; in both cases, a large number of trial divisions is still required to determine this. The Pi 2B has Cortex-A7 cores which have hardware division, so eliminating the many candidates which have small prime factors is relatively quick.

But this is on a machine which, given several hours, can actually get through all the 32-bit integers. On 8-bit and 16-bit machines - and especially in an interpreted language - optimisations which make a concrete difference in the sub-million range are worthwhile. When it comes to interpreters, however, increasing the complexity of the code may increase the overhead of interpretation, even if the underlying arithmetic operations are simplified.


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 9:10 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Klaus2m5 wrote:
I think the NEXT WITHOUT FOR ERROR happens due to the way EhBASIC uses the stack for both FOR NEXT and IF THEN ELSE constructs.

Curiously, the EhBASIC manual claims that IF-THEN-ELSE is treated as a single statement, so a colon acts as an ENDIF. This would appear to make it impossible to put multiple statements under a single condition without spaghetti logic. This in turn implies that there's no "stack use" by IF.

On the other hand, a single NEXT statement can apparently take multiple loop-variable arguments. Presumably these behave equivalently to consecutive NEXT statements.


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 10:50 am 
Offline

Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
Chromatix wrote:
Curiously, the EhBASIC manual claims that IF-THEN-ELSE is treated as a single statement, so a colon acts as an ENDIF. This would appear to make it impossible to put multiple statements under a single condition without spaghetti logic. This in turn implies that there's no "stack use" by IF.
The IF THEN statement1 [: statement#] behaves the same way in EhBASIC as in all other BASIC dialects: All statements are executed if the condition was TRUE, none is executed if the condition was FALSE. Only IF THEN ELSE allows for just 1 statement for each part and a colon acts as ENDIF. I'll see if I can fix the conditional NEXT issue.

Chromatix wrote:
On the other hand, a single NEXT statement can apparently take multiple loop-variable arguments. Presumably these behave equivalently to consecutive NEXT statements.
Yes and I recall that EhBASIC is not the only BASIC allowing multiple FOR loops to be closed with just 1 NEXT.

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


Last edited by Klaus2m5 on Wed Jul 18, 2018 6:58 am, edited 1 time in total.

Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 12:57 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Klaus2m5 wrote:
I'll see if I can fix the conditional NEXT issue.


That would be great!


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 1:16 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
What's curiously missing from most BASIC dialects (including structured dialects like BBC BASIC) is a way to cleanly break out of or move to the next iteration of a loop, without using a GOTO.


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 1:27 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Yes, it's normal to bump the control variable past the limit, then do NEXT, but you still have to do something at that point. In BBC Basic you could probably use an UNTIL, having set up a REPEAT previously. Or an ENDPROC, using procedures for structure. I'm not sure how the nesting for FOR, REPEAT, PROC would unwind, if they weren't strictly nested. Nor am I sure how efficient they are: I've a feeling at least one takes a copy of the 6502 stack so it can put it back when it's done.


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 7:53 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8504
Location: Midwestern USA
Chromatix wrote:
What's curiously missing from most BASIC dialects (including structured dialects like BBC BASIC) is a way to cleanly break out of or move to the next iteration of a loop, without using a GOTO.

Business BASIC interpreters such as BBx and Thoroughbred have EXITTO, which breaks from a loop or subroutine and cleans up the stack in the process.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Tue Jul 17, 2018 8:08 pm 
Offline

Joined: Thu Jan 21, 2016 7:33 pm
Posts: 282
Location: Placerville, CA
MS QuickBasic included an EXIT (structure) statement, I know, but yeah, it is an odd omission. I mean, even most Pascal variants have a BREAK statement, if I recall correctly.


Top
 Profile  
Reply with quote  
PostPosted: Sun Sep 09, 2018 6:51 pm 
Offline

Joined: Wed Jul 18, 2018 12:12 pm
Posts: 96
I ported the BASIC benchmark to the C64, well really just copied as no changes were needed. I did add in the use of the Jiffy timer to automatically measure the execution time. The times are listed in the source code below. Also note that I ran this on WinVICE and a real C64 and the execution time was the same which is nice as in VICE I can run it in 'Warp' mode which is about 1,500% of the C64 speed :) In the table showing times VBas is BASIC in VICE, CBas is BASIC on real C64, etc.

I then ported to DurexForth also on the C64. This is an ugly bit of code but it works and is still much faster than basic even though I wrote the integer square root function in Forth. I'm just dumping the Jiffy timer to the stack before the code starts and when it is done and calculating the difference by hand as I was too lazy to write a d- function (though I will likely do that and a better sqrt soon in assembly as once I figured out how to work with the Forth start from assembly it was quite easy.)

C64 BASIC
Code:
100 REM
110 REM B A S I C - B E N C H
120 REM
130 REM The program tests whether it is in the area [3.. a] of the natural
140 REM Numbers two consecutive prime numbers P1 < P2 whose
150 REM difference is greater than or equal to B (P2-P1 > = b).
160 REM
170 REM  Test values -       Results        - VBas - CBas - VForth - CForth
180 REM A    1000,20 -    907,     887,20 -    48s    48s    18s      18s
190 REM B    2000,30 -   1361,    1327,34 -   118s   118s    30s      30s
200 REM C    9999,35 -   9587,    9551,36 -  1412s  1412s   374s     374s
210 REM D   32000,50 -  19661,   19609,52 -  3516s  3516s  1530s    1530s
220 REM E   32000,70 -  31469,   31397,72 - 10447s 10477s  2352s    2352s
230 REM F 500000,100 - 370373,3790261,112 - 45819s
240 REM
250 ZS = 3 : INPUT A,B
260 TIME$ = "000000":rem reset clock
270 FOR C = 3 TO A STEP 2
280 FOR D = 3 TO SQR(C) STEP 2
290 IF INT(C/D)*D = C THEN goto 330
300 NEXT D

310 IF C-ZS >= B THEN PRINT C,ZS,C-ZS : GOTO 350
320 ZS = C
330 NEXT C
340 PRINT " no solution " : GOTO 250
350 print "time", time$: goto 250


DurexForth on C64
Code:
variable ZS 3 ZS !
variable A
variable B
variable match 0 match !

code jiffy
dex, sei,
$a1 lda, msb sta,x
$a2 lda, lsb sta,x
dex,
$00 lda,# msb sta,x
$a0 lda, cli, lsb sta,x ;code

( isqrt(n):
    x = n           \ x == old guess
    y = (x + 1) / 2 \ y == new guess
    while y < x:
        x = y
        y = (x + n / x) / 2
)
( Integer square root based on above )
( Returns the largest integer x      )
( for which x * x does not exceed n  )
: sqrt ( n -- root )
   dup 1 + 1 rshift
   begin
    2dup /include
    over + 1 rshift
    swap over >
   while repeat
   nip
;

: dloop ( cloopI -- )
   -1 nomatch !
   dup \ cloop value
     sqrt 4 max 3 do
     dup dup I / I *
     = if 0 nomatch ! leave then
   2 +loop
   drop
;

( pops all three bytes of jiffy clock
onto the stack as two unsigned doubles )
( -- ud ud)
code jiffy
dex, sei,
$a1 lda, msb sta,x
$a2 lda, lsb sta,x
dex,
$00 lda,# msb sta,x
$a0 lda, cli, lsb sta,x ;code

( A B -- )
: benchmark
   decimal
   B ! A ! \ save A&B off stack
   jiffy
   A @ 3 do
    I dloop nomatch @ if \ no match
    I ZS @ - B @ -
     0< invert if
      I . ZS @ . I ZS @ - . leave then
    I ZS !
    then 2 +loop
   jiffy
;


Top
 Profile  
Reply with quote  
PostPosted: Tue Sep 11, 2018 9:06 pm 
Offline
User avatar

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Thank you very much Jeff, for the first Forth incarnation of this silly little benchy!

What puzzles me a bit is the huge deviation between these two Basic results:
Code:
SYM (6502,1MHz)    Basic1.1          BasBenc1            46,7    75,1   812,7  2014,0  3696,1   ____,_
C64 (6502, 1MHz)   Basic             BasBenc1            48,0   118,0  1412,0  3516,0 10447,0  45819,0  THX 2 Jeff_Birt

I knew the C64 runs not exactly @ 1MHz - I think it depends on PAL/NTSC. But what is the reason for the strong slow down for larger numbers? And why is the relation not constant? :?: :shock: :?:
Code:
C64/SYM                                                  1,02    1,57    1,74    1,75    2,83


Top
 Profile  
Reply with quote  
PostPosted: Mon Sep 17, 2018 3:05 am 
Offline

Joined: Mon Sep 17, 2018 2:39 am
Posts: 138
My results for FastBasic ( https://github.com/dmsc/fastbasic ) on NTSC Atari 800XL:

Code:
Atari 800XL (1.79MHz NTSC)   FastBasic_3.5      5.3     9.0      123.3    326.1    620.38


This is running the B=35 test:
Attachment:
File comment: Benchmark run with B=35
bench-run-s.png
bench-run-s.png [ 14.36 KiB | Viewed 3459 times ]


And this is the source code in the Atari IDE:
Attachment:
File comment: Benchmark source
bench-source.png
bench-source.png [ 27.1 KiB | Viewed 3459 times ]


Last edited by dmsc on Mon Sep 17, 2018 5:05 am, edited 1 time in total.

Top
 Profile  
Reply with quote  
PostPosted: Mon Sep 17, 2018 4:40 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Nice ... but weren't most of the NTSC 8-bit Ataris clocked at 1.79 MHz?

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 210 posts ]  Go to page Previous  1 ... 9, 10, 11, 12, 13, 14  Next

All times are UTC


Who is online

Users browsing this forum: Google [Bot] and 7 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: