Page 11 of 14
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 2:30 am
by Chromatix
I managed to make MAME boot the correct type of machine to "Disk Extended Color BASIC", using the roms from VCC, but not into NitrOS-9 yet. The latter requires inserting a floppy disk or three, but there's no obvious UI to do that in MAME - it really is geared towards turnkey game emulation, not using old micros for general-purpose computing.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 2:40 am
by BigEd
Hope you can figure that out. The 6309 always looked attractive.
Ah, wait, it should be as simple as
22 X=1-X : C = C + X*2
which is to say, every other iteration you increment C by an extra 2, which skips over the multiples of three.
That might be faster as
22 X=1-X : IF X THEN C = C + 2
And of course the same for the inner loop.
Silly me, this is surely the best way:
22 X=2-X : C = C + X
It's not so easy (in Basic) to see how to extend this idea to cover multiples of 5 in an efficient way, if GOTO is expensive.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 4:37 am
by Chromatix
It looks like XRoar has a complete 6309 emulation (though the docs say it's "less well tested" than 6809 mode, several instructions missing from VCC are definitely implemented in the source code). It also has a more conventional "micro emulator" UI which makes loading floppy disks easy. However, it emulates only up to a CoCo2, and I have so far been unable to boot NitrOS-9 on it, even a version intended for the CoCo2.
(XRoar's name comes from the Dragon 32 and 64, which were clones of the CoCo with compatible hardware but re-implemented ROMs.)
I'm reasonably sure that most 6502 assembly can be mechanically translated to 6809 or 6309 assembly with little loss of efficiency. While the latter chips don't have post-indexed indirect, it's easy to emulate that by loading the indirect reference into the spare U register and indexing from there. The principal gotchas would be that the Motorola chips are big-endian, and use a different (and much more complex) interrupt stack frame. Any manipulation of the status register via the stack would also need to be examined, and any use of Decimal mode would need to be converted to use decimal-adjust instructions.
From there, one can start looking for ways to use the extra capabilities of the 6809 and 6309 - which are considerable. For a start, they can operate directly on 16-bit values and indexes...
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 8:16 am
by BigEd
(Ahem, my untested scribblings might be misguided: I haven't had the energy and opportunity to test them.)
If S stands for shortcuts, and M for moduli, we could have
M=3*5*7
DIM S(M)
and then
22 IF(S(C MOD M)) NEXT C: GOTO 90
with some manual or automatic pre-filling of the S array. I would expect even
M=3*5
to be quite a help.
In general I think
22 something : IF something NEXT C: GOTO 90
will probably be a lot faster than the GOTO 80. And similarly for the inner loop.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 8:30 am
by GaBuZoMeu
Silly me, this is surely the best way:
22 X=2-X : C = C + X
It's not so easy (in Basic) to see how to extend this idea to cover multiples of 5 in an efficient way, if GOTO is expensive.
Yesterday evening I didn't find my mistake implementing this. (I forgot to change FOR C=5... into FOR C=3....

)
So now skipping 3 in both loops now looks like:
Code: Select all
10 ZS=5: INPUT A,B
16 X=0
20 FORC=3TOASTEP2
22 X=2-X:C=C+X
25 Y=0
30 FOR D = 3 TO SQR(C) STEP 2
35 Y=2-Y:D=D+Y
40 IF INT(C/D)*D = C THEN 80
50 NEXT D
60 IF C-ZS >= B THEN PRINT C,ZS,C-ZS, : GOTO 100
70 ZS = C
80 NEXT C
90 PRINT " KEINE LOESUNG ";
100 PRINT
110 GOTO 10
Yes, this should result in a small speed improvement. I gave it a try, here is what has happened:
Code: Select all
MS-Basic V1.1 on SYM (1MHz)
3 sorted out (IF..GOTO).....: A: 41,4 (100%) B: 67,4 (100%) C: 785,4 (100%) D: 1991,1 (100%) E: 3663,7 (100%)
BigEds \3 outer loop only...: A: 40,7 ( 98%) B: 66,5 ( 99%) C: 778,5 ( 99%) D: 1967,9 ( 99%) E: ____,_ ( %)
BigEds \3 outer & inner loop: A: 42,1 (102%) B: 68,7 (102%) C: 810,6 (103%) D: 2061,7 (103%) E: ____,_ ( %)
EhBasic V2.22 on a 2 MHz 6502
3 sorted out (IF..GOTO).....: A: 11,3 (100%) B: 19,1 (100%) C: 252,8 (100%) D: 664,1 (100%) E: 1261,1 (100%)
BigEds \3 outer loop only...: A: 11,0 ( 97%) B: 18,4 ( 96%) C: 251,0 ( 99%) D: 662,4 (100%) E: 1260,0 (100%)
BigEds \3 outer & inner loop: A: 12,6 (112%) B: 20,4 (107%) C: 273,4 (108%) D: 720,0 (108%) E: 1373,0 (109%)

surprising
Cheers,
Arne
EDITs: some late results pasted
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 8:50 am
by BigEd
Thanks for running the numbers! It is a bit surprising. Of course, adding statements early in the program will (probably) increase the cost of later GOTO actions. It might be worth trying a change from
40 IF INT(C/D)*D = C THEN 80
to
40 IF INT(C/D)*D = C THEN NEXT C: GOTO 90
on the grounds that NEXT is quick. Then we've got rid of a frequent GOTO.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 8:53 am
by BigEd
Oh, and BTW, I found that in BBC Basic declaring Z%=2 and using Z% instead of 2 didn't help at all. But possibly the digestion of decimal constants will be less efficient in other Basics. (Whether Z or Z% is another choice.)
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 8:54 am
by GaBuZoMeu
It might be worth trying a change from
40 IF INT(C/D)*D = C THEN 80
to
40 IF INT(C/D)*D = C THEN NEXT C: GOTO 90
on the grounds that NEXT is quick. Then we've got rid of a frequent GOTO.
I will give it a try. Sounds reasonable. Currently my SYMilar is still groaning

Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 9:05 am
by Chromatix
If S stands for shortcuts, and M for moduli, we could have
M=3*5*7
DIM S(M)
and then
22 IF(S(C MOD M)) NEXT C: GOTO 90
with some manual or automatic pre-filling of the S array.
That looks like it should work. An extra MOD operation in the outer loop isn't too expensive, since it will usually displace at least one from the inner loop. The array is also reasonably sized at 105 entries. However, a strength-reduction optimisation here is to have the array index maintained in parallel and incremented independently.
Even better would be to read the increment for C from the array at the *end* of the loop, and skip the test entirely. If there happens to be a power-of-two number of entries in this array, you can increment the array index modulo the array length with a bitwise mask operation, otherwise you can use a second array or one conditional.
Let me try it out in C first, then see if I can translate it to a 6502 environment. I think the 6309 will have to wait a little longer.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 9:31 am
by GaBuZoMeu
Well, again no real win:
40 IF INT(C/D)*D = C THEN 80
changed to
40 IF INT(C/D)*D = C THEN NEXT C : GOTO 90
yields:
Code: Select all
SYM-MS-Basic V1.1, 1MHz 6502
A: 40,8 ( 99%) B: 66,6 ( 99%) C: 783,0 (100%)
(compared to the "old" reference using IF..GOTO and THEN 80)
I will enter the corresponding numbers for EhBasic soon.
EDIT:
The "result" for EhBasic using: 40 IF INT(C/D)*D = C THEN NEXT C : GOTO 90
yields: "NEXT without FOR Error in line 40"

Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 9:46 am
by BigEd
Hmm, this is not going as well as I'd hoped...
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 9:59 am
by GaBuZoMeu

seems to be a rare form of acceleration resistance
Nevertheless an interesting insight into EhBasic. There must be some serious differences between Eh and MS.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 12:23 pm
by Klaus2m5
I think the NEXT WITHOUT FOR ERROR happens due to the way EhBASIC uses the stack for both FOR NEXT and IF THEN ELSE constructs. A NEXT tries to find the FOR token at a fixed offset on the stack. After IF THEN there is an additional return address on the stack.
On the other hand a forward GOTO to a line nearby is not as bad as you think it is. It is the backward GOTO that needs to start searching for the target line from the beginning of the program as every line only has a forward pointer to the next line in EhBASIC. But that wouldn't be different in MS-BASIC.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 3:29 pm
by GaBuZoMeu
I think the NEXT WITHOUT FOR ERROR happens due to the way EhBASIC uses the stack for both FOR NEXT and IF THEN ELSE constructs. A NEXT tries to find the FOR token at a fixed offset on the stack. After IF THEN there is an additional return address on the stack.
Thank you for your explanation. It is perhaps as well one (of many) reason for EhBasics speed. If it has no provisions - other than reporting an error - for conditional NEXTs within a loop, there is less housekeeping necessary.
On the other hand a forward GOTO to a line nearby is not as bad as you think it is. It is the backward GOTO that needs to start searching for the target line from the beginning of the program as every line only has a forward pointer to the next line in EhBASIC. But that wouldn't be different in MS-BASIC.
Yes, now that you have said that I remember "each line starts with two 16 bit words, the first containing the line number, the next containing a pointer to the next (higher) line number". This was said elsewhere for explaining how a renumber works.
Re: slightly OT: a simple Benchmark
Posted: Sun Jul 15, 2018 5:55 pm
by Chromatix
Okay, small investigations reveal that if we take prior knowledge of all the single-digit primes, we need to wrap around the wheel at 2*3*5*7 (210), and that there are 48 candidates in the wheel. The wheel thus eliminates more than half the candidates, potentially without incurring even a single division/modulus for any of them.
For implementation, I suggest a triple nested loop:
Code: Select all
DIM WHEEL%(48), DIVS%(6550)
FOR WHEELBASE%=0 TO 1000000000 STEP 2*3*5*7
FOR I%=1 TO 48
C% = WHEELBASE% + WHEEL%(I%)
FOR J%=1 TO NUMDIVS%
D% = DIVS%(J%)
IF (C% MOD D%) = 0 THEN NEXT I% : NEXT WHEELBASE% : STOP
NEXT J%
REM FOUND A PRIME...
NEXT I% : NEXT WHEELBASE% : STOP