6502.org Forum

All times are UTC




 Post subject: Re: Benchmarking
PostPosted: Sat Nov 07, 2020 9:03 pm 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
I have just added data for the Atari 800XL. Its Basic was really slow. It is interesting that Atari enthusiasts have completely rewritten the ROM (Altirra ROM Basic, 2014), and this Basic is about 160% faster for the ASCII Mandelbrot!
EDIT. It is also worth mentioning that there is no way to write a 100% portable Basic program. Atari Basic handles strings differently, and it doesn't support the timer, so I had to add or change the following lines.
Code:
164 DIM C$(19)
166 POKE 82,0 : POKE 83,39 : GRAPHICS0 : REM PROPER MARGINS
170 C$ = ".,'-=+:;[/<&?oxOX# " : REM 'PALETTE' LIGHTEST TO DARKEST... - no tilde
290 Q0 = PEEK(20): Q1 = PEEK(19): Q2 = PEEK(18)
460 PRINT C$(CC - 1, CC - 1);
470 NEXT X
490 NEXT Y
505 T0 = PEEK(20): T1 = PEEK(19): T2 = PEEK(18)
510 PRINT ((T2 - Q2)*65536 + (T1 - Q1)*256 + (T0 - Q0)) / 50 : REM 60 FOR NTSC
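For readers who want to check the arithmetic in line 510 outside the emulator, here is a minimal Python sketch of the same calculation. It assumes that the three bytes read from locations 18, 19 and 20 form a single 24-bit frame counter with location 20 as the low byte, as line 510 implies. The function names and the `hz` parameter are invented for this illustration.

```python
def jiffies(high, mid, low):
    """Combine three counter bytes (locations 18, 19, 20) into one frame count."""
    return high * 65536 + mid * 256 + low


def elapsed_seconds(start, end, hz=50):
    """Seconds between two (high, mid, low) readings.

    hz is the frame rate: 50 for PAL, 60 for NTSC (see the REM in line 510).
    """
    return (jiffies(*end) - jiffies(*start)) / hz


# Example: the low byte wrapped around between the two readings.
print(elapsed_seconds((0, 0, 200), (0, 1, 100)))
```

Subtracting the combined counts gives the same value as the byte-by-byte expression in line 510, including when a lower byte has wrapped. Pass `hz=60` for an NTSC machine.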

_________________
my blog about processors


Last edited by litwr on Sun Nov 08, 2020 9:48 am, edited 1 time in total.

 Post subject: Re: Benchmarking
PostPosted: Sun Nov 08, 2020 8:07 am 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
cjs wrote:
litwr wrote:
I have also added data for the MSX2 which shows a surprisingly slow result.

It's not really all that surprising, no. Unlike other Microsoft BASICs, MSX BASIC uses decimal rather than binary significands, which is probably a performance hit right there, and, more importantly, in your case is probably using double precision floating point (14 digit significands and a 7-bit binary exponent).

I have tried to replicate your benchmark (more on this below) and came out with 521.9 seconds (versus your 554.98) based on the TIME variable and confirmed roughly against a wall clock on my Japanese Sanyo MPC-2 (Wavy2), which is an NTSC MSX1 machine running an earlier version of BASIC (1.0 instead of 2.0).

In the benchmark above I added a line 165 DEFDBL A-Z to ensure that I was using double precision (though this is the default, and the result is the same without it). Changing that to DEFSNG uses 6-digit significands, which I think is a fairer comparison to other MS-BASICs and seems to produce the same visual output. This takes 413.37 seconds, a 21% improvement, which puts the MSX machine between the Commodore Plus/4 (MSX1 being 15% faster) and the Commodore 64 (MSX1 being 7% slower).
(I've also run the benchmarks above on a Japanese Sony HB-55, also MSX1, and came out with the same numbers within .02%: 521.98 s and 413.45 s.)

Regarding your post, especially in its original version, you could have made it a lot more clear what changes you were making to the original benchmark code. It took me some time to replicate your output, and it didn't help that for the MSX one you've both lost the top line of the output and terminated the program differently (or not terminated at all, I think; I replicated with 515 GOTO 515 to prevent the program from exiting). You should also check the result from TIME against a wall clock; since TIME doesn't increment when interrupts are disabled it may undercount the time used. (I confirmed that it's very close on my machine, but this may vary across MSX models.) You should also mention that dividing TIME by 60 to get seconds applies only to NTSC MSX models; since it's based on the video refresh interrupt this would need to be 50 on PAL machines.

I'm not clear on why your MSX2 is noticeably slower (6%) than my MSX1, despite being a newer machine running a newer version of BASIC. Perhaps it's actually faster and you're using a PAL model? Without knowing what hardware you tested on, it's hard even to guess. (I'll have another go at this on one of my MSX2 machines if I can get one up and running; I'm currently waiting on a power supply for them.)

I just use the openMSX emulator with the Sanyo MPC-25FD. I can only suppose that MSX1 Basic may be a bit faster than the more complex MSX2 Basic. The CPU frequency is the same in the MSX1 and MSX2. Maybe the observed 5% difference is also caused by a small frequency difference between those computer models.
EDIT. A small timing difference may also be caused by different interrupt handlers.

_________________
my blog about processors


 Post subject: Re: Benchmarking
PostPosted: Thu Nov 12, 2020 12:26 pm 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
litwr, would you mind either trimming your quotes to what you're replying to or just not quoting at all? Repeating large chunks of text that are already easily accessible in the conversation history wastes much more of your readers' time than you save, and it's even worse for those responding, who have to do yet more editing or compound your problem.

litwr wrote:
I just use the openMSX emulator with the Sanyo MPC-25FD.

Again, you're leaving out important information about how you created your benchmark results. If you're using an emulator, please say so; not all emulators are cycle-accurate, even when they claim to be.

I've now taken the program over to openMSX and replicated your result for an MSX2 Sony HB-F1XD as well as my results for MSX1. (I as yet have not been able to run the code on a real MSX2 platform, though the materials are in that linked repo for anybody who does have one.) And the results list in that repo includes most or all of the results posted here; anybody interested in adding more should feel free to put in a merge request or ask me for commit access.

(OpenMSX doesn't emulate my Japanese Sony HB-55, which is why I used the European Sony HB-55P there instead. Also, openMSX does seem very consistent across versions and platforms; most of the Windows 10 openMSX v0.15 results in that list above were replicated exactly on Debian 9 openMSX v0.13.)

Quote:
I can only suppose that MSX1 Basic may be a bit faster than the more complex MSX2 Basic.

Well, the core of MSX BASIC (which is all this program uses) is pretty much the same between both (at least as far as the user sees it), so it seems a bit weird that the newer version would be noticeably slower, but that does appear to be the most likely cause, assuming that it's not an emulator issue. I wonder if it could be related to the video system, though even that's not so different (just a lot more capable in MSX2).

Quote:
The CPU frequency is the same in the MSX1 and MSX2. Maybe the observed 5% difference is also caused by a small frequency difference between those computer models.

That seems less likely to me, since a difference that large could produce noticeable timing problems in MSX1 games run on an MSX2 computer.

_________________
Curt J. Sampson - github.com/0cjs


 Post subject: Re: Benchmarking
PostPosted: Thu Nov 12, 2020 1:51 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10949
Location: England
(There's a nearly 8% difference between PAL and NTSC versions of the NES and nearly 4% difference between the PAL and NTSC models of the C64.)


 Post subject: Re: Benchmarking
PostPosted: Thu Nov 12, 2020 6:15 pm 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
Chromatix wrote:
A quick test program suggests the NC200's BBC BASIC uses a 32-bit mantissa in floating-point, which matches the 5-byte format used in standard BBC BASIC:


cjs wrote:
Again, you're leaving out important information about how you created your benchmark results. If you're using an emulator, please say so; not all emulators are cycle-accurate, even when they claim to be.

I've now taken the program over to openMSX and replicated your result for an MSX2 Sony HB-F1XD as well as my results for MSX1. (I as yet have not been able to run the code on a real MSX2 platform, though the materials are in that linked repo for anybody who does have one.) And the results list in that repo includes most or all of the results posted here; anybody interested in adding more should feel free to put in a merge request or ask me for commit access.


So you have got almost the same results with the Sony HB-F1XD. ;) I have prepared more data for your project.

Code:
system               bytes (sig,exp)  bits,+max,-min  emulator
commodore 64/+4/128  4,1              32,+125,-128    (vice emu for the 64/128, plus4emu emu for the +4)
bbc micro            4,1              32,+126,-128    (b-em emu)
amstrad cpc          4,1              32,+126,-128    (ep128emu emu)
atari 800XL          d5,1             26,+325,-325    (atari800 emu)
MSX2                 d7,1             44,+206,-212    (openmsx emu)


I have added data I got from Chromatix's program. It seems to show rather wrong information for the Atari 800, where the mantissa is 10 decimal digits. It is also not easy to guess whether the information is correct for the MSX2.

For the Sanyo MPC-25FD I used default settings so they are the same as for the Sony HB-F1XD.

IMHO the b-em emulator is not cycle-exact, but it is very close.

It is interesting that the BBC Micro, Commodore, and Amstrad CPC use almost the same formats for FP numbers, but they are not identical. Maybe the CPC and BBC Micro use the same number representation.

_________________
my blog about processors


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 3:17 am 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
BigEd wrote:
(There's a nearly 8% difference between PAL and NTSC versions of the NES and nearly 4% difference between the PAL and NTSC models of the C64.)

Yeah, this is going to be the case for systems where the video output clock is based on the system clock. The TMS9918-series video chips in MSX systems use a separate 10.738635 MHz clock and their own separate DRAM (which the CPU reads and writes via commands to the video chip) allowing NTSC and PAL systems to run at the same system bus frequency, which they do. But the MSX1 vs. MSX2 timing differences above were all on NTSC machines, anyway.

litwr wrote:
I have prepared more data for your project. I have added data I got from Chromatix's program.

Thanks! I've updated the benchmarks file with the simulator/emulator information from your post.

I've not included the information from Chromatix's program because it doesn't properly measure decimal formats (though it should give a pretty close approximation of the binary equivalents of decimal formats); I think it's better for this just to go with documentation of the actual formats (or reverse engineering via examining memory), rather than guessing. But it's probably possible to write a more sophisticated guessing program that can produce more accurate results; that's also probably best sent to another thread if anybody wants to pursue that seriously.

Also, again, since you surely translated Chromatix's program rather than running it directly, you should post the actual code you ran. It's easy to make mistakes in code like that that would invalidate the result.

Quote:
It seems to show rather wrong information for the Atari 800, where the mantissa is 10 decimal digits. It is also not easy to guess whether the information is correct for the MSX2.

The 44 bits it guesses for MSX2 is around 13.25 decimal digits (44 log₁₀(2)), so that seems about right. I'm not sure what's going on with the exponent, which seems more than 8 bits. I also just realized that I don't actually know what MSX is doing for its internal interpreter variable and FP register formats (these are sometimes different); I know only the program text storage format which is actually 8 or 9 bytes: the significand is always unsigned and an ASCII `-` prefixes negative numbers. (The exponent is an unsigned 8-bit integer biased by 128.)
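The conversion used here is simply bits × log₁₀(2). A small Python sketch of it, handy for checking the other bit counts reported in this thread (the function name is only for illustration):

```python
import math


def decimal_digits(bits):
    """Approximate number of decimal digits carried by a binary significand."""
    return bits * math.log10(2)


# Bit counts reported in this thread: 32, 44 and 56.
for bits in (32, 44, 56):
    print(bits, "bits is about", round(decimal_digits(bits), 2), "decimal digits")
```

This only describes how precise a binary significand of that width would be; it does not tell you whether the format under test is really binary or decimal.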

For the Atari, there seem to be several versions of BASIC from at least two different authors/vendors. I've not dug too deeply into that mess, and I'm not likely to get around to it until I get real hardware. (I should have an Atari 800XL or something similar arriving at some point.)

_________________
Curt J. Sampson - github.com/0cjs


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 11:14 am 

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
For decimal formats, I would assume that the exponent indicates decimal digit shifts, rather than bit shifts. This would give about a 3x increase in representable dynamic range, which is consistent with what we see for the 800XL; it's only the MSX2 that does something weird. I actually didn't consider the possibility that a home computer's BASIC could use a decimal format.
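As a rough check of the "about 3x" figure: one decimal exponent step is worth log₂(10), roughly 3.32, doublings. The Python sketch below works that out. It assumes, which is not stated anywhere in this thread, that the Atari format overflows near 1E98; the function name is hypothetical.

```python
import math


def doublings_for_decimal_exponent(decimal_exponent):
    """How many times a value can be doubled before reaching 10**decimal_exponent."""
    return decimal_exponent * math.log2(10)


# Assumption: Atari floating point tops out near 1E98.
print(doublings_for_decimal_exponent(1))   # doublings per decimal digit shift
print(doublings_for_decimal_exponent(98))  # doublings to reach the assumed limit
```

Under that assumption the second figure comes out a little over 325, which is in line with the +325 and -325 reported for the 800XL.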


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 1:19 pm 

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
Chromatix wrote:
For decimal formats, I would assume that the exponent indicates decimal digit shifts, rather than bit shifts. This would give about a 3x increase in representable dynamic range, which is consistent with what we see for the 800XL; it's only the MSX2 that does something weird. I actually didn't consider the possibility that a home computer's BASIC could use a decimal format.

I suspect it will suffer from the same precision issues as IBM hex floating point numbers. When you shift the mantissa in addition/subtraction you lose a whole nybble rather than a single bit.
https://en.wikipedia.org/wiki/IBM_hexadecimal_floating_point#Precision_issues

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 5:01 pm 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
Chromatix wrote:
For decimal formats, I would assume that the exponent indicates decimal digit shifts, rather than bit shifts.

In all common floating point formats that I'm aware of, the exponent is not a shift, it's simply the base-10 exponent. (I.e., increasing it by 1 multiplies the significand by 10, regardless of the base of the significand.) So typically, it seems, "decimal" vs. "binary" floating point refers only to the base of the significand.

Quote:
...it's only the MSX2 that does something weird. I actually didn't consider the possibility that a home computer's BASIC could use a decimal format.

I wouldn't call this "weird"; it's not unusual for floating point formats to use a decimal significand. (IEEE 754 has had decimal significand formats from the start, for example.) And as far as microcomputers go, decimal significands are not only in MSX-BASIC (MSX 1 as well as 2); they seem to be common in early-80s MS BASICs. The Kyocera Kyotronic 85/Radio Shack Model 100 (like MSX, dating from 1983) used decimal significands. It would be interesting to look at NEC PC-8801 and IBM-PC BASIC to see if they had also switched.

_________________
Curt J. Sampson - github.com/0cjs


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 7:10 pm 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
cjs wrote:
Also, again, since you surely translated Chromatix's program rather than running it directly, you should post the actual code you ran. It's easy to make mistakes in code like that that would invalidate the result.

For the Atari, there seem to be several versions of BASIC from at least two different authors/vendors. I've not dug too deeply into that mess, and I'm not likely to get around to it until I get real hardware. (I should have an Atari 800XL or something similar arriving at some point.)


Nothing has been translated but lines 40 and 80. For the Commodore I used the following line 40:
Code:
40 TRAP 70

For other systems I used
Code:
40 ON ERROR GOTO 70

I hope it will be easy to deduce line 80. ;)

The 8-bit Atari has FP routines in its OS ROM, not in Basic ROM. So they are quite standard. ROM Basics are also quite standard. Indeed, there were several good cartridge Basics, but I can't call them standard. Above I mentioned Altirra Basic (2014), which can be used instead of the standard Basic, but it uses the same FP format.

_________________
my blog about processors


 Post subject: Re: Benchmarking
PostPosted: Fri Nov 13, 2020 7:13 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10949
Location: England
So, do we think MSX2 Basic runs slower than MSX Basic, but we don't know why?


 Post subject: Re: Benchmarking
PostPosted: Sat Nov 14, 2020 1:09 am 

Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
BigEd wrote:
So, do we think MSX2 Basic runs slower than MSX Basic, but we don't know why?

Well, openMSX is a pretty darn good emulator, so I'm generally inclined to give it the benefit of the doubt, but for something like this, with no handy disassemblies of MSX-BASIC 1.0 vs. 2.0, I'd really like to see it run on a real machine to confirm that it really is slower.

litwr wrote:
cjs wrote:
Also, again, since you surely translated Chromatix's program rather than running it directly, you should post the actual code you ran.
Nothing has been translated but lines 40 and 80. For the Commodore I used....
Yeah, I'm having serious difficulty with this. I am pretty sure that CBM BASIC has no REPEAT...UNTIL syntax, and VICE's C64 simulation appears to agree with me:
Attachment: c64-repeat-until.png

Quote:
I hope it will be easy to deduce line 80. ;)
Well, why "hope" when you can simply post the code you used? And ideally, provide a script that runs the emulator with that code. Even better yet, put it in a repo so that others can clone it and quickly reproduce your results. Unless, of course, you want to make a lot more work for people trying to do the reproduction.

Quote:
The 8-bit Atari has FP routines in its OS ROM, not in Basic ROM. So they are quite standard.
Actually, those routines are apparently not really "system ROM" but just moved into the OS ROM because they ran out of space in BASIC ROM:
Wikipedia wrote:
What became Atari BASIC is a pared-down version of Cromemco BASIC ported to the 6502. That needed 10K of code. To make it fit in Atari's 8K cartridge, some of common routines were moved to the operating system ROMs. This included 1780 bytes for floating point support that were placed in a separate 2K ROM on the motherboard.
I don't think this is uncommon. You see the same thing in the Panasonic JR-200 ROM, where they even moved all the BASIC error messages into the BIOS ROM.

_________________
Curt J. Sampson - github.com/0cjs


 Post subject: Re: Benchmarking
PostPosted: Sat Nov 14, 2020 6:07 am 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
cjs wrote:
Yeah, I'm having serious difficulty with this. I am pretty sure that CBM BASIC has no REPEAT...UNTIL syntax, and VICE's C64 simulation appears to agree with me:
Well, why "hope" when you can simply post the code you used? And ideally, provide a script that runs the emulator with that code. Even better yet, put it in a repo so that others can clone it and quickly reproduce your results. Unless, of course, you want to make a lot more work for people trying to do the reproduction.

I am really sorry. I made some simple changes mechanically, like an automaton, and forgot about them. :(
Code:
10 c=0 : t=1 : q=0.5
20 c=c+1 : q=q/2 : s=t+q : if s<>t then 20
30 print "mantissa bits: ";c
40 trap 70
50 c=0 : t=2
60 c=c+1 : q=t : t=t*2 : if q<>t then 60
70 print "max exponent: +";c
80 trap 110
90 c=0 : t=0.5
100 c=c+1 : t=t/2 : if t<>0 then 100
110 print "min exponent: -";c
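For comparison with a modern machine, the same probing idea can be sketched in Python. This is not the program that produced the table above: Python floats do not raise an overflow error on multiplication, so the two TRAP lines are replaced by an explicit test for infinity, and the function name is invented for this example.

```python
import math


def probe_float():
    """Probe the host float format the way lines 10-110 do.

    Returns (mantissa_bits, max_exponent, min_exponent) as loop counts.
    """
    # Lines 10-30: halve q until adding it to 1 no longer changes the sum.
    c, t, q = 0, 1.0, 0.5
    while True:
        c += 1
        q /= 2
        if t + q == t:
            break
    mantissa_bits = c

    # Lines 50-70: double t until it overflows (here: becomes infinity).
    c, t = 0, 2.0
    while True:
        c += 1
        t *= 2
        if math.isinf(t):
            break
    max_exponent = c

    # Lines 90-110: halve t until it underflows to zero.
    c, t = 0, 0.5
    while True:
        c += 1
        t /= 2
        if t == 0:
            break
    min_exponent = c

    return mantissa_bits, max_exponent, min_exponent


print(probe_float())
```

Where Python's float is an IEEE 754 double with the default rounding mode, this should return (52, 1023, 1074). The last figure is so large because the loop keeps halving all the way through the subnormal range before reaching zero.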

I think that publishing a lot of code is not a good idea, especially if the code is almost the same as that published above.
BTW, the C+4 and C128 support the DO ... LOOP syntax with WHILE and UNTIL.

_________________
my blog about processors


 Post subject: Re: Benchmarking
PostPosted: Sat Nov 14, 2020 3:22 pm 

Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
I have just added data for the most popular Soviet home computer, the BK0010-01. It has a PDP-11 compatible processor. This computer is generally slower than the ZX Spectrum, but its Basic (released in 1986) was one of the world's fastest; it uses a kind of precompilation.
The default FP format is 8 bytes, but it is possible to use single precision values, which require only 4 bytes. Chromatix's program shows 56 bits for the mantissa and +126, -128 for the exponent. So we have 7,1 for cjs's table.

_________________
my blog about processors


Last edited by litwr on Sun Nov 15, 2020 1:01 pm, edited 1 time in total.

 Post subject: Re: Benchmarking
PostPosted: Sat Nov 14, 2020 4:26 pm 

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 702
Location: North Tejas
litwr wrote:
I think that publishing a lot of code is not a good idea.


I have to disagree.

When it comes to benchmarking, it is vital that someone else can reproduce your result.

