
Re: Benchmarking

Posted: Sat Nov 14, 2020 6:38 pm
by cjs
BillG wrote:
litwr wrote:
I think that publishing a lot of code is not a good idea.
I have to disagree.
When it comes to benchmarking, it is vital that someone else can reproduce your result.
Yup. I feel the same. I do, however, think it's perfectly reasonable to put your code in a Git or similar repo and post that somewhere (GitHub and GitLab will host it for free, but you can even host a Git repo on static HTTP) to avoid cluttering the thread. (Not to mention that this makes it easier for someone to get the latest version of the full set of programs and to track any changes to the code, especially if you commit the unmodified version first and then your changes to it as a separate commit.)

I'm happy to give out commit access to the bascode repo on GitLab to anybody who wants to put their stuff there, or integrate merge requests.

Re: Benchmarking

Posted: Sun Nov 15, 2020 7:54 am
by litwr
cjs wrote:
I'm happy to give out commit access to the bascode repo on GitLab to anybody who wants to put their stuff there, or integrate merge requests.
Thank you. I have cloned your repo. My login is vollitwr.

Re: Benchmarking

Posted: Tue Nov 17, 2020 6:58 am
by leepivonka
Here is a FORTH version.
Running 65816S FORTH on a 65816 in a simulator.

cr mandelfb2
.......,,,,,,,,,''''''''''''''',,,,,,,,
......,,,,,'''''''''''''~~~===~~''',,,,
.....,,,'''''''''''''~~~~==+&;/=~~~''',
....,,''''''''''''~~~~~~==:[  ;+==~~~''
...,,'''''''''''~~~~~==++:[    [:===~~'
..,'''''''''''~~~~==+/ </       o/;;&=~
..''''''''''~=====++:[              <+=
.,'''''~~~=?+++++::;&               <;=
.''~~~~~==+;?#& o/[<                 [+
.~~~~~===:;O                         ;=
.==+:++:[&                          [+=
.==+:++:[&                          [+=
.~~~~~===:;O                         ;=
.''~~~~~==+;?#& o/[<                 [+
.,'''''~~~=?+++++::;&               <;=
..''''''''''~=====++:[              <+=
..,'''''''''''~~~~==+/ </       o/;;&=~
...,,'''''''''''~~~~~==++:[    [:===~~'
....,,''''''''''''~~~~~~==:[  ;+==~~~''
.....,,,'''''''''''''~~~~==+&;/=~~~''',
......,,,,,'''''''''''''~~~===~~''',,,,
.......,,,,,,,,,''''''''''''''',,,,,,,,

54122384 cycles         \ 27.06sec at 2MHz
 ok
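
For anyone who wants to check the cycles-to-seconds conversion in that last line, here is a minimal sketch. The function name is made up for illustration; the cycle count and the 2 MHz clock are the figures from the run above.

```python
def cycles_to_seconds(cycles: int, clock_hz: float) -> float:
    """Convert a simulator cycle count to wall-clock time at a given clock rate."""
    return cycles / clock_hz

# Figures reported for the FORTH run: 54122384 cycles at 2 MHz.
forth_cycles = 54122384
elapsed = cycles_to_seconds(forth_cycles, 2_000_000)
print(f"{elapsed:.2f} s at 2 MHz")
```

The same helper can be reused to rescale the result to another clock rate, as long as the simulator's cycle count is assumed to be clock-independent.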

Re: Benchmarking

Posted: Thu Dec 31, 2020 5:06 pm
by faicuai
On actual Atari 800 / XL hardware, running Altirra Basic 1.57 (8KB) and the Altirra 3.28 FP ROM (2KB, part of the OS ROM), which exactly fit the original address space and are thus byte-for-byte replacements (no need to burn them, as the Atari XL's MMU fully supports soft-loading these packages on stock HW from any SIO-attached disk):
8801CB97-7C3A-4FED-AC3C-E82C47F22BBC.jpeg
EC7E93E3-730E-4068-B746-DEDB38891260.jpeg

Slightly rearranged the code to shorten the line listing (this chops 4 secs. off the original 110.33 secs. run).

Output running on the XEP80 80-column display ("Console mode"), enabling the 6502 to run at its full 1.79 MHz tilt, minus OS overhead. This is exactly the same as on a stock machine when you simply turn Antic's DMA operation OFF and then back ON via the OS vector (559=0, or 559=34, decimal).

We have not even tried TurboBasic XL v1.5 (interpreted) or FastBasic v4.0, which will blow these times out of the water.

Re: Benchmarking

Posted: Thu Dec 31, 2020 5:09 pm
by BigEd
(Welcome!)

Re: Benchmarking

Posted: Thu Dec 31, 2020 5:15 pm
by faicuai
BigEd wrote:
(Welcome!)
Thanks !!!

Happy new 2021 and best wishes for everyone!

Re: Benchmarking

Posted: Mon Jan 04, 2021 2:43 pm
by mikeblas
BillG wrote:
I have to disagree.

When it comes to benchmarking, it is vital that someone else can reproduce your result.
Seconded. While the benchmarks and results presented here are certainly fun, I wonder what's really being measured. Machine-to-machine speed is lost, since the performance of a BASIC program is also dependent on the performance of the BASIC interpreter.

And the programs need to be modified just to work, since the hardware (and BASIC) isn't identical. But it also probably leaves performance on the table -- adjusting the code to take advantage of features of each platform isn't done (or is it?) and means that each examined platform might end up being faster than the benchmark allows itself to demonstrate.

Re: Benchmarking

Posted: Fri Feb 26, 2021 10:32 am
by litwr
I have just got data for a very unusual computer, the TI-99/4A. It was the first 16-bit home computer, though some people think that the first true 16-bit home computer was https://en.wikipedia.org/wiki/Electronika_BK
The TI-99 has only 256 bytes of fast 16-bit RAM. The other 32KB of RAM is an optional expansion that is only available via the 8-bit bus.
The TMS9900 is the processor for the TI-99. Its architecture is memory mapped and therefore resembles the 6502. The TMS9900 can easily do register context switching and this resembles the Z80. The TMS9900 has 16 GPRs and this resembles the ARM. The TMS9900 ISA is almost orthogonal and this resembles the DEC PDP-11. This processor uses +5, -5, +12V and this resembles the 8080. The TMS9900 also has several unique traits, like unusual arithmetic flags.
The TI-99 Basic screen has 24 rows and 28 columns. Actually there are 32 columns on the Basic screen, but only 28 of them can be used for PRINT. It seems that some ancient TV sets couldn't show characters on the screen edges properly. The TI-99 has a 40x24 video mode but Basic doesn't support it.
The software system timer on the TI-99 is only 8 bits wide. This is very odd for a 16-bit system. This oddity can be explained by the fact that the timer has poor accuracy, because the system disables the timer every time the video memory is accessed. So I had to implement a wider timer and correct for the lost ticks.
The sources are available on https://gitlab.com/retroabandon/bascode
mandel-ti99.png

Re: Benchmarking

Posted: Sat Feb 27, 2021 7:57 pm
by BillG
litwr wrote:
I have just got data for a very unusual computer, the TI-99/4A. It was the first 16-bit home computer, though some people think that the first true 16-bit home computer was https://en.wikipedia.org/wiki/Electronika_BK
The TI-99 has only 256 bytes of fast 16-bit RAM. The other 32KB of RAM is an optional expansion that is only available via the 8-bit bus.
The TMS9900 is the processor for the TI-99. Its architecture is memory mapped and therefore resembles the 6502. The TMS9900 can easily do register context switching and this resembles the Z80. The TMS9900 has 16 GPRs and this resembles the ARM. The TMS9900 ISA is almost orthogonal and this resembles the DEC PDP-11. This processor uses +5, -5, +12V and this resembles the 8080. The TMS9900 also has several unique traits, like unusual arithmetic flags.
The TI 99/4A has a couple of serious strikes against it.

1. While the processor is clocked at 3 MHz, most instructions require an unexpectedly large number of clock cycles. It is somewhat like the 8080/Z80 in that a memory access takes three ticks.

2. Unfortunately, the 99/4A mostly uses 8-bit memory, so accesses take twice as long because each 16-bit access is broken down into two 8-bit accesses. More unfortunately, many instructions are read/modify/write, so that penalty is very heavy.

I try to wrap my head around it here:

https://atariage.com/forums/topic/31123 ... nt=4633569

Re: Benchmarking

Posted: Tue Mar 02, 2021 4:06 pm
by litwr
BillG wrote:
The TI 99/4A has a couple of serious strikes against it.

1. While the processor is clocked at 3 MHz, most instructions require an unexpectedly large number of clock cycles. It is somewhat like the 8080/Z80 in that a memory access takes three ticks.

2. Unfortunately, the 99/4A mostly uses 8-bit memory, so accesses take twice as long because each 16-bit access is broken down into two 8-bit accesses. More unfortunately, many instructions are read/modify/write, so that penalty is very heavy.

I try to wrap my head around it here:
https://atariage.com/forums/topic/31123 ... nt=4633569
My results for the π spigot show that the TMS9900@3MHz beats the Z80@6MHz, the 6502@4MHz, and even the VAX-11/730! So it is a rather fast processor which matches the 8088. Of course, this is mainly a consequence of the presence of hardware division and multiplication on the TMS9900. It is ironic that the company which invented the IC, the first processor, and the first electronic calculator has had to use the Z80 in its calculators since the 80s. My blog entry about the TMS9900 is here.

Re: Benchmarking

Posted: Fri Sep 17, 2021 10:33 am
by BillG
mikeblas wrote:
BillG wrote:
I have to disagree.

When it comes to benchmarking, it is vital that someone else can reproduce your result.
Seconded. While the benchmarks and results presented here are certainly fun, I wonder what's really being measured. Machine-to-machine speed is lost, since the performance of a BASIC program is also dependent on the performance of the BASIC interpreter.

And the programs need to be modified just to work, since the hardware (and BASIC) isn't identical. But it also probably leaves performance on the table -- adjusting the code to take advantage of features of each platform isn't done (or is it?) and means that each examined platform might end up being faster than the benchmark allows itself to demonstrate.
That plus the fact that the Mandelbrot benchmark, as currently written, includes the time needed to display the result. Display speed varies wildly from platform to platform.

A more accurate test can be had by creating the output in a string or array, then displaying it. The time to build the output may be more interesting for some people while others may care about the display time or the total time.
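
To make that suggestion concrete, here is a rough sketch in Python (not the BASIC from this thread) that builds the whole picture into a list of strings first and only then prints it, timing the two phases separately. The palette, the coordinate ranges and the iteration limit are assumptions chosen for illustration; they are not taken from the benchmark program.

```python
import time

# Hypothetical palette; the last entry (a space) marks points that never escape.
PALETTE = ".,'~=+:;[/<&?oxOX# "

def build_rows(width=39, height=22, max_iter=len(PALETTE) - 1):
    """Compute the picture into a list of strings without printing anything."""
    rows = []
    for j in range(height):
        ci = -1.2 + 2.4 * j / (height - 1)      # assumed imaginary range
        row = []
        for i in range(width):
            cr = -2.0 + 2.5 * i / (width - 1)   # assumed real range
            zr = zi = 0.0
            n = 0
            while n < max_iter and zr * zr + zi * zi <= 4.0:
                zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
                n += 1
            row.append(PALETTE[n])
        rows.append("".join(row))
    return rows

t0 = time.perf_counter()
rows = build_rows()              # phase 1: calculation only
t1 = time.perf_counter()
print("\n".join(rows))           # phase 2: display only
t2 = time.perf_counter()
print(f"compute: {t1 - t0:.4f} s, display: {t2 - t1:.4f} s, total: {t2 - t0:.4f} s")
```

Reporting all three numbers lets each reader pick the one they care about, at the cost of the extra memory needed to hold the picture.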

Re: Benchmarking

Posted: Fri Sep 17, 2021 11:24 am
by drogon
BillG wrote:
mikeblas wrote:
BillG wrote:
I have to disagree.

When it comes to benchmarking, it is vital that someone else can reproduce your result.
Seconded. While the benchmarks and results presented here are certainly fun, I wonder what's really being measured. Machine-to-machine speed is lost, since the performance of a BASIC program is also dependent on the performance of the BASIC interpreter.

And the programs need to be modified just to work, since the hardware (and BASIC) isn't identical. But it also probably leaves performance on the table -- adjusting the code to take advantage of features of each platform isn't done (or is it?) and means that each examined platform might end up being faster than the benchmark allows itself to demonstrate.
That plus the fact that the Mandelbrot benchmark, as currently written, includes the time needed to display the result. Display speed varies wildly from platform to platform.

A more accurate test can be had by creating the output in a string or array, then displaying it. The time to build the output may be more interesting for some people while others may care about the display time or the total time.
As the author of this thread and the Mandelbrot code in it, I would have to suggest using caution here....

It is true that output time can affect things; however, in this case the millions of floating point calculations by far outweigh the output speed (or lack of it) of any terminal (OK, let's not think about using a TTY33 here!)

Test runs with my code under BBC BASIC 4 on my 16 MHz Ruby: With printing: 48.24 seconds. Without printing: 47.26. A difference of 0.98 seconds or maybe 2% faster... So while measurable, that's not significant in the times we're talking about here and my suspicion is that doing something like: O$(L) = O$(L) + MID$(...) would take longer than simple PRINT MID$(...

OK - I'll test it.... 49.28 seconds. Which has surprised me as I thought storing with string concatenation might be slower but in brief, it's about a second either way. Storing and printing later added a second to a 48 seconds run, and not printing saved a second.
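
For reference, the arithmetic behind those three timings can be checked with a few lines of Python. The variable names are made up; the numbers are the ones quoted in this post.

```python
# Timings from the Ruby test runs, in seconds.
with_print = 48.24      # normal run, printing as it goes
without_print = 47.26   # same run with printing removed
buffered = 49.28        # storing into strings, printing afterwards

saved = with_print - without_print   # time attributable to printing
added = buffered - with_print        # extra cost of buffering in strings

print(f"not printing saves {saved:.2f} s ({100 * saved / with_print:.1f}% of the run)")
print(f"buffering adds {added:.2f} s ({100 * added / with_print:.1f}% of the run)")
```

Either way the effect is around 2% of the total, which is consistent with the point that the floating point work dominates.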

Someone else can try this under EhBASIC, etc. ...

But again, in all cases do be careful what you're benchmarking - this is all about floating point performance. I deliberately wrote it to try to make it source-code (at the textual level) compatible with generic BASICs, so MS style BASICs, BBC Basic and my own RTB Basic. There is no point in, say, re-writing this in a system with scaled integers and then saying "look how much faster this is", because you're really not comparing like for like.

With regard to printing, my RubyOS tries to line-buffer output as the transaction between the 6502/816 and the host MCU has a rather high latency, so there is a 128 byte output buffer that's flushed when a newline is printed or input is requested. One thing that came up on another forum (stardot) was the time taken to check for Control-C, (relatively high on Ruby under EhBASIC and CBM2) so it's sometimes worthwhile turning it off, but again, you're falling into the trap of optimising a benchmark for your own platform.
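
A toy model of that kind of line buffering, in Python, might look like the sketch below. The class and method names are hypothetical, and flushing when the buffer fills up is an assumption on my part; the post only mentions the 128 byte size and flushing on newline or input request.

```python
class LineBufferedOutput:
    """Collect output bytes and hand them to the host in as few transfers as possible."""

    def __init__(self, send, capacity=128):
        self._send = send            # callable performing one (slow) host transfer
        self._capacity = capacity    # 128 bytes, as described in the post
        self._buf = bytearray()

    def write(self, data: bytes) -> None:
        for byte in data:
            self._buf.append(byte)
            # Flush on newline; flushing on a full buffer is an assumption.
            if byte == 0x0A or len(self._buf) >= self._capacity:
                self.flush()

    def flush(self) -> None:
        if self._buf:
            self._send(bytes(self._buf))
            self._buf.clear()

    def before_input(self) -> None:
        """Call before waiting for input so that prompts become visible."""
        self.flush()

transfers = []
out = LineBufferedOutput(transfers.append)
out.write(b"HELLO")
out.write(b" WORLD\n")   # newline triggers one transfer for the whole line
out.write(b"> ")         # prompt without newline stays buffered...
out.before_input()       # ...until input is requested
print(transfers)
```

With a high-latency link, the number of entries in `transfers` matters more than the number of bytes, which is why a line of output costs roughly one transaction rather than one per character.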

(RubyOS has the same concept of "Escape" as the Acorn MOS which handled it asynchronously via interrupt and the test for escape is to simply check bit 7 of $FF which is very fast and efficient)
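
As an illustration only, the flag test described here amounts to a single bit mask on one byte. In this Python sketch the 256-byte `memory` array and the function names are stand-ins invented for the example; only the address $FF and bit 7 come from the post.

```python
ESCAPE_FLAG_ADDR = 0xFF       # zero-page location named in the post
memory = bytearray(0x100)     # stand-in for zero page, all flags clear

def escape_pending(mem: bytearray) -> bool:
    """Cheap poll: is bit 7 of the flag byte set?"""
    return bool(mem[ESCAPE_FLAG_ADDR] & 0x80)

def keyboard_interrupt_sets_escape(mem: bytearray) -> None:
    """Hypothetical stand-in for the interrupt handler noticing Escape."""
    mem[ESCAPE_FLAG_ADDR] |= 0x80

before = escape_pending(memory)
keyboard_interrupt_sets_escape(memory)
after = escape_pending(memory)
print(before, after)
```

Because the interpreter only polls one byte, the check costs almost nothing compared with scanning the keyboard or asking the host on every statement.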

Maybe it's time to re-write or just re-publish those old PCW benchmarks, but then again, that's been done to death many times over the past few decades and we're not going to learn anything new - old MS style Basics are slower than newer (e.g.) BBC Basic because the newer Basics were written by some very clever university boffins who had the benefit of using those older systems, reeling back in horror and saying "Hold my Beer" ...

-Gordon

Re: Benchmarking

Posted: Mon Sep 20, 2021 2:12 pm
by litwr
BillG wrote:
That plus the fact that the Mandelbrot benchmark, as currently written, includes the time needed to display the result. Display speed varies wildly from platform to platform.

A more accurate test can be had by creating the output in a string or array, then displaying it. The time to build the output may be more interesting for some people while others may care about the display time or the total time.
IMHO some people forget that perfection is the enemy of real and good things. Perfection is impossible and good things are imperfect. What difference does it make if results may vary by 1-2% because of the screen output? You can get the same difference just by running a program on hardware in different temperature environments.

IMHO some people just don't want their systems to be compared with others...

Re: Benchmarking

Posted: Mon Sep 20, 2021 2:24 pm
by drogon
litwr wrote:
IMHO some people just don't want their systems to be compared with others...
Well, indeed. Unless you're a salesman for a supercomputer company, then you persuade the engineers to fiddle everything to get that sale...

(BTDT - as one of the engineers)

-Gordon

Re: Benchmarking

Posted: Fri Sep 24, 2021 2:12 am
by BillG
drogon wrote:
It is true that output time can affect things; however, in this case the millions of floating point calculations by far outweigh the output speed (or lack of it) of any terminal (OK, let's not think about using a TTY33 here!)

Test runs with my code under BBC BASIC 4 on my 16 MHz Ruby: With printing: 48.24 seconds. Without printing: 47.26. A difference of 0.98 seconds or maybe 2% faster... So while measurable, that's not significant in the times we're talking about here and my suspicion is that doing something like: O$(L) = O$(L) + MID$(...) would take longer than simple PRINT MID$(...

OK - I'll test it.... 49.28 seconds. Which has surprised me as I thought storing with string concatenation might be slower but in brief, it's about a second either way. Storing and printing later added a second to a 48 seconds run, and not printing saved a second.

Someone else can try this under EhBASIC, etc. ...

But again, in all cases do be careful what you're benchmarking - this is all about floating point performance. I deliberately wrote it to try to make it source-code (at the textual level) compatible with generic BASICs, so MS style BASICs, BBC Basic and my own RTB Basic.
To be honest, I have not personally run your benchmark so have no experience with it.

I have little interest in benchmarking BASIC interpreters and probably will not until floating point is implemented in my compiler.