6502.org Forum

All times are UTC

Posted: Wed Sep 16, 2020 3:17 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
Sorry, Ed. I accept partial responsibility for squirting the lighter fluid. I had a clear opportunity to move my post to my "efficient sign extension" thread and missed it.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Wed Sep 16, 2020 3:39 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Not to worry: my comment wasn't pointed, it just seemed that BillG's explorations were out of place, and I'd got in mind there was some other recently active thread which he might have intended to continue.

I hadn't seen litwr's article as being primarily a technical critique, more of a personal take on history. It's one of a series about various architectures.


Posted: Wed Sep 16, 2020 3:57 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
(I should perhaps add, I'm all in favour of exploring the relative excellence of different microprocessors, preferably by applying equally expert programming knowledge to each of them. (In fact, litwr has himself produced and published a fair number of implementations of a pi program, which I'm sure we've discussed here. (We have - see for example here.)))


Posted: Fri Sep 18, 2020 5:17 pm
Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
Some people have accused me of using wrong information about the 6502:6800 performance ratio, so I have to try to justify myself. There are several links which refer to official MOS claims:
http://www.cpu-collection.de/?l0=co&l1=MOS
https://archive.archaeology.org/1107/fe ... p_cpu.html
I have also found a very interesting comparison of the 6800 and 6502:
https://archive.org/details/byte-magazi ... 7/mode/2up
where several code samples clearly show that the 6502 is 2 times faster than the 6800. And I claimed only that the 6502 is up to 2 times faster.
Indeed, it would be really interesting to get the EDN magazines from 1975, but that seems very difficult. I can only hope that one day those precious magazines will become available on the net. For now we only know that there were AH Systems benchmarks for the 8080, 6800 and 6502...
I have thought about several different kinds of data processing and always concluded that the 6502 should be about 2 times faster. Maybe we need to open a special thread "6502 vs 6800"? IMHO we should use real algorithms from applications. Reentrant code and non-self-modifying code were needed rather rarely, only for interrupt handlers and ROM respectively...
Sorry, this post is a long one, and another will follow.
brain wrote:
More concerning, calling the 6800 a mediocre processor is useless without some actual data to back it up.

Thank you very much again for your highly interesting comments.
I used the words "rather mediocre", which give me some room for maneuver. :) And BTW I have checked a dictionary and found the following definition:
Quote:
The roots of the adjective mediocre are from the Latin medial, "middle," and ocris, "mountain." If you think about it, the middle of a mountain is neither up nor down and neither here nor there — just somewhere in between. The definition of mediocre is "of ordinary quality," "merely adequate," and "average.

There is nothing offensive in this word; it means just satisfactory, fairly good, but neither outstanding nor bad.

brain wrote:
Case in point, the TMS9900 16 bit processor in the TI 99/4A is hardly a "mediocre" processor. But, the performance of the TI 99/4A suffered greatly due to system design constraints that strangled the poor processor's capabilities.

Sorry, I haven't understood your point. Could you clarify it? You can check my material about the TMS9900 - https://litwr.livejournal.com/1575.html - where I never used the word "mediocre", although I did mention several weak points of this processor.

brain wrote:
The 6502 crowd can write some sample apps (sorting app or sieve or something else) and the 6800 crowd on the SWTPC or the MC10 could be asked to write the same thing. Results could then be compared. If you take out IO and such, the comparison would be pretty valid.

I can repeat: it sounds interesting to me. However, the MC10 uses the 6801, which is much more powerful than the 6800. There were no popular computers based on the 6800.

brain wrote:
The 6800 will be better at handling tabular data greater than 1 page, given the 16 bit index register, while the 6502 will do better on loops and smaller tables due to the 2 index registers.

I can't agree. IMHO the 6502 is much better at processing any tabular data. Let us write a simple routine which computes a simple checksum by XOR-ing 1024 bytes.
Code:
     ldy #0
     ldx #4
     lda #hi(tab)
     sta loop+2
     tya
loop eor loop,y
     iny
     bne loop

     inc loop+2
     dex
     bne loop

     align 256
tab .byte <data> ;1000 bytes are here

I am sure that the analogous 6800 routine will be 2-4 times slower.

brain wrote:
we know he got seed money from MOS/CSG to stand up WDC.

That sounds unusual to me, because CSG never used the CMOS 6502.

brain wrote:
It was not made because CSG was cheap, and would have had to retool to add those instructions into their layouts, which they had no reason to do so.

IMHO this also means that the new 65C02 instructions had little importance for CSG. However, you are largely right: Commodore spent almost no money on technology improvement.

brain wrote:
Saying Bill did not improve the NMOS 6502 seems highly editorial to me. He improved it by moving it to CMOS.

This is exactly what I wrote about...

brain wrote:
While the information on the C128 is not wrong, it suggests it's the CPU's fault. The problem with the C128 was not the CPU, but the fact that there was little demand for such a 65XX unit, and most developers could drive more margins by targeting the 64, which the C128 could also emulate.

Could you please explain why you think that I wrote about the CPU's fault? I wrote about an unfriendly environment around 2 MHz 6502 systems in the USA until the second half of the 80s.

brain wrote:
I also don't see how the code for the 65C02 would turn out to be "more cumbersome..."

We discussed one of my examples earlier on this forum, but let me repeat it. I use a 256-byte table with 128 jump vectors. Those vectors must have odd addresses. On the NMOS 6502 I can pack such a table into a single page, but on the CMOS 6502 I get an ugly one-byte displacement.

brain wrote:
The only reference I have on the arkanoid bug is around the NES version of Arkanoid

Sorry, I meant the Asteroids game for an Atari computer. Bill talked about it - https://www.youtube.com/watch?v=7YoolSA ... be&t=28230
It is strange that I could not google this information, as I am sure that it was written somewhere on the net. The Internet really has viruses which eat valuable information. :(

brain wrote:
Many people know that Bill has a 32 bit version of the '02 designed, which was called Terbium. The market never asked for it, so he never produced it. The author seems to be under the delusion that companies will just create optimized versions of their designs when no one has asked for them, using personal cash to fund an effort which may never get used. Bill puts designs into production when people come with cash to buy them, not before. Same with Zilog and Intel. If we want to blame someone, blame the market, not Bill. I am sure he wanted to do the 32 bit '02, because he wrote the design up. But, a suitable customer never materialized.

It is very interesting, but it proves my point that Bill couldn't replace a company. He is a genuine engineer, but he also needed marketing specialists who could drive sales and win more customers. Some assistance from other engineers could have helped too.

brain wrote:
I'll admit it's tough to slog through the article with the heavy editorial bias. If the author wants people to read the article, I think it'd be best to put the facts first in the main body of the article, and then put the editorialization at the end.

Maybe the word "bias" is not right for this case? I really don't have a special personal opinion about the 6502 or other processors. In a sense I like them all, but I also like to examine them carefully and find their drawbacks. Maybe the main problem is my English, because in Russian my material is about 7 times more popular.
I still do not understand your point about my phrase "the 6502 was only microscopically improved and made artificially partially incompatible with itself" in relation to the 4510. This phrase relates to the 65C02 only. Information about the 4510 follows much later.

brain wrote:
Um, bugs can indeed be documented and still be bugs. All the CPUs I use have a long list of errata on them and people do indeed consider them bugs. Documenting something does not absolve it of guilt.

This is very slippery ground. IMHO if we have a documented specification that is later changed, then we have a change of concept rather than a bug...

brain wrote:
Currently, the article still says "...he never tried to improve this processor himself." It's just wrong. Faggin improved the 8080 by designing the Z80. The same is true of Bill.

My phrase states "It has turned out that Bill worked on the 6502, with only specifications received, and he never tried to improve this processor himself". Maybe something is wrong with my English, but I want to say exactly what you have claimed: that Bill Mensch just followed the market and made things only in response to market demands. Maybe I should use the phrase "he never took the initiative to improve this processor"?
IMHO Faggin created what was really a completely new processor which is compatible with the 8080. He designed its ISA, implemented a technological process to make the Z80, and also took part in the marketing of the Z80. Bill Mensch got the 6502's ISA ready-made from Chuck. He had a great role in the NMOS 6502 implementation and he was the only designer of the CMOS 6502. He did great work in the field of electronics, but ISA design and marketing were not his fields.
Bill put it well: the NMOS 6502 changed the world, but the CMOS 6502 was produced in larger volumes. IMHO the existence of this forum and our presence here are due to the NMOS 6502; the CMOS 6502 just exploited the success of its very successful NMOS variant.

brain wrote:
CSG had faster 6502s, but making the interleaved dual processor design forced all parts to be twice as fast as normal, and CBM was not about to pay for sub 100nS DRAM to move the C128 to 3 or 4MHz (which demanded 6 or 8MHz capable DRAM), and they definitely could not switch from an interleaved memory design, as the 64 mode forced constraints. Commodore was cheap. It had nothing do with keeping the CPU speed down. It had to do with cost. And, I'm not sure what you mean about the poor C128 design. Bil Herd would challenge that notion a lot :-). I'm not a huge fan of the 128, but I don't see any glaring design issues. The Z80 looks shoehorned in because it was. Marketing demanded the unit be CP/M compatible, thinking the 64 mode could run the old C64 CP/M cart, but that cart was never robust, didn't work on newer 64s, and Bil got so frustrated trying to make it work he ended up just pushing the design into the C128 to check off the requirement. If there's a design issue, it'd be there.

Bil Herd and Commodore achieved great commercial success with the C128, so I hope our theoretical discussion can't harm anyone. I don't understand why you write about 3-4 MHz. They didn't even actually provide 2 MHz. Indeed, they provided 2 MHz for the 6502, but rather theoretically; I wrote about this. IMHO Commodore just heavily and rudely exploited people's expectations of an upgraded C64 instead of making a real upgrade. They just gave us the C64 with a better but slower Basic and a bunch of incompatible modes which require special hardware to benefit from them. The Z80 at an actual 1.6 MHz looked very poor in 1985.

brain wrote:
He's a pretty smart fellow, you should go chat with him.

If he reads this thread I must express my great and sincere respect to him here. I also hope that he will tell us more about the 6502 history details someday.

brain wrote:
Yeah, I saw you made that comment before. But, I looked at the ad, and I don't see any hint of 16 bits for a CPU. Can you point out where you see it? "The first of a low cost high performance microprocessor family" cannot be it, as that could mean so many other things.

It contains the phrase
Quote:
Thus, MOS Technology has left holes in the 650X instruction bit pattern to accommodate a "quasi-16-bit machine."
;)

brain wrote:
Now, if you want to argue the CPU should have required a 48 pin DIP carrier and brought out the 16 bits of the data bus directly, and had an internal 16 bit ALU, there's probably more weight to that argument.

The 16-bit data bus seems quite the right idea to me, but I don't understand your words about a 16-bit ALU. Doesn't the 65816 have a 16-bit ALU?! That is shocking to me.

1024MAK wrote:
one thing that is more important than features alone, especially in the 1970s and 1980s, was getting the price right (as in low) with just enough features so that it sells.

Thank you very much for your remarks and an interesting link. IMHO prices are a very difficult matter from which to draw correct conclusions. The Apple II was always a rather expensive computer. The Apple Mac was also not among the cheap computers... There is always a market for very expensive designs.

1024MAK wrote:
look up the dates for the 68000 and the 8086 and then look at when the first 16/32 bit computers were first sold

Sorry, I missed the idea. IMHO computers using the mentioned processors appeared almost immediately, like the KIM-1 for the 6502.

1024MAK wrote:
At the same time, most of the computers that these chips (6502, Z80) are being used in, are undergoing cost reduction measures by the manufacturers so that the older model of computers can stay price competitive. Not a good market for a new more expensive eight bit microprocessor...

IMHO history has shown that people are always happy to get better hardware, which implies better software. However, this somehow bypassed 8-bit computers. 16/32-bit designs like the IBM PC or Apple Macintosh were always evolving. 8-bit designs could have evolved more too. It is strange to me that Apple stopped the Apple III, Commodore made a strange C128, the BBC Micro was too expensive, etc.
If the 6502 and Z80 had been used only as low-power controllers, we would not be discussing them now. There are a lot of controllers; they work, but they have no history.

1024MAK wrote:
Acorn did design and manufacture an export version for sale in the U.S.A. As far as I know, they were not blocked or locked out as such.

Maybe I missed something, but how could they design and manufacture an export version for sale in the U.S.A. without some market research? So they had some information that they could sell their computers there. IMHO it could be education. The BBC Micro was more advanced than the Apple II in many ways, and the price of the Apple II could be even higher than that of the BBC Micro. So there was rather a block or lock around computers for educational purposes in the USA. I know that some people tried to start selling Amigas there but it was impossible. Compare the capabilities of the Amiga and the Apple II...

_________________
my blog about processors


Posted: Fri Sep 18, 2020 5:25 pm
Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
I have just updated my blog material about the 6502 - https://litwr.livejournal.com/2773.html
I have added the following text. Thanks for the help!
Quote:
Synertek and Rockwell companies in addition to the production of the CMOS 6502, also continued to produce the NMOS 6502.

Even if you compare the improvements made in the Motorola 6801 over the 6800 or the Intel 8085 over the 8080, they are gigantic compared to those made in the 65C02, and Intel and Motorola made them much earlier.

But it must be admitted that several new instructions turned out to be expected and useful, for instance new addressing modes for BIT or instruction JMP (ABS,X).

Although again, we must admit that the new instructions allow you to get slightly faster and more compact code. Besides that, four relatively rare instructions sometimes became one clock cycle faster on the 65C02. Additionally, the 65C02 resets the decimal mode flag on interrupt, which allows an interrupt handler to be 2 cycles faster and 1 byte shorter - this tiny improvement illustrates the overall amount of improvements made in the 65C02.

It is worth noting, of course, that WDC was able to create a CMOS processor only a few years after Intel and Motorola made CMOS versions of their 8080 and 680x processors. In this, it was significantly ahead of Zilog, where the CMOS version of the Z80 was created only by 1987. However, while the CMOS 8085 and Z80 immediately found wide use in mobile computers, the low-power 65C02 found its application in computers relatively late; I can only name the Atari Lynx game console, produced from 1989.

The 4510 chip was based on the 65CE02 processor, which in turn is based on the WDC 65C02.

The 6502 uses a simple instruction pipeline that speeds up the execution time of many instructions by 1 clock cycle.

But software interrupts in the 6502 are implemented quite primitively: they use the same vector as maskable interrupts, which requires a rather cumbersome additional software check to distinguish them.

Due to the fact that the ability to handle software interrupts significantly slows down the processing of hardware interrupts, support for software interrupts is often simply not implemented.

Such dual-processor systems were extremely rare. As an example of such systems, I know of only a few very rare models of Commodore drives. Instead of the second processor a video controller was usually used, which shared memory with the 6502.

In addition, the 65816 was one of the first 16-bit processors manufactured using CMOS technology!

Bill Mensch was able to provide some support for the development of the 6502. However, the capabilities of one person are clearly not enough to keep a processor competitive. Bill, as an excellent electronics engineer, was able to carry out orders for 6502 upgrades, but ensuring the independent development of a successful processor required a team. Someone had to develop an upgrade of the instruction set, someone had to develop new marketing strategies, etc. In addition, at least the years 1976-78 were lost for development, and one person was no longer able to catch up. In a sense, WDC created an illusion of well-being around the 6502 development situation, and this had a rather negative effect on the real development.

_________________
my blog about processors


Posted: Sat Sep 19, 2020 1:16 am
Joined: Mon Sep 17, 2018 2:39 am
Posts: 138
Hi!

litwr wrote:
Some people have accused me of using wrong information about the 6502:6800 performance ratio, so I have to try to justify myself. There are several links which refer to official MOS claims:
http://www.cpu-collection.de/?l0=co&l1=MOS
https://archive.archaeology.org/1107/fe ... p_cpu.html
I have also found a very interesting comparison of the 6800 and 6502:
https://archive.org/details/byte-magazi ... 7/mode/2up
where several code samples clearly show that the 6502 is 2 times faster than the 6800. And I claimed only that the 6502 is up to 2 times faster.


Except that the BYTE article actually contradicts your conclusions, it says:
Quote:
Which processor comes out ahead overall? To a great extent it depends on your point of view: Systems programs are better on the MOS Technology machines; applications programs would tend to come out ahead on the Motorola 6800.


Quote:
I can't agree. IMHO the 6502 is much better at processing any tabular data. Let us write a simple routine which computes a simple checksum by XOR-ing 1024 bytes.
Code:
     ldy #0
     ldx #4
     lda #hi(tab)
     sta loop+2
     tya
loop eor loop,y
     iny
     bne loop

     inc loop+2
     dex
     bne loop

     align 256
tab .byte <data> ;1000 bytes are here

I am sure that the analogous 6800 routine will be 2-4 times slower.

But you are using self-modifying code, which is not possible when the code is in ROM and also cannot be generated by a compiler. So your comparison is not really a good one; no 6502 beginner would write that code.

That said, I do agree that for hand-written code, the 6502 is more efficient, as you can use the extra address modes in creative ways.
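For reference, the self-modified operand can be avoided with a zero-page pointer and `eor (ptr),y`. The Python tally below is only a rough sketch of what that would cost. The routine shown in the comments and the name `ptr` are assumptions made for illustration, not code posted in this thread, and the cycle counts are the usual NMOS 6502 timings for a page-aligned table.

```python
# Hypothetical ROM-friendly variant (not from the thread), using a zero-page
# pointer "ptr" instead of patching the operand of eor:
#
#        lda #<tab   ; 2      sta ptr     ; 3
#        lda #>tab   ; 2      sta ptr+1   ; 3
#        ldx #4      ; 2      ldy #0      ; 2      tya ; 2
#   loop eor (ptr),y ; 5 (no page crossing: tab is page-aligned)
#        iny         ; 2
#        bne loop    ; 3 taken, 2 not taken
#        inc ptr+1   ; 5
#        dex         ; 2
#        bne loop    ; 3 taken, 2 not taken

def cycles_zero_page_pointer(pages=4):
    """Total cycles for the assumed routine above over `pages` 256-byte pages."""
    setup = 2 + 3 + 2 + 3 + 2 + 2 + 2
    inner = 256 * (5 + 2 + 3) - 1      # last bne of each page falls through
    outer = pages * (5 + 2 + 3) - 1    # last outer bne falls through
    return setup + pages * inner + outer

print(cycles_zero_page_pointer())
```

Under these assumptions the inner loop costs 10 cycles per byte instead of 9 for the absolute,Y form, so giving up self-modification would cost roughly 11% on this particular routine.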

Quote:
brain wrote:
I also don't see how the code for the 65C02 would turn out to be "more cumbersome..."

We discussed one of my examples earlier on this forum, but let me repeat it. I use a 256-byte table with 128 jump vectors. Those vectors must have odd addresses. On the NMOS 6502 I can pack such a table into a single page, but on the CMOS 6502 I get an ugly one-byte displacement.


That does not make any sense: it is an extremely niche usage (why odd instead of even vectors?), and you would only have a one-byte "ugly" displacement instead of having to avoid *any* page crossing on the NMOS 6502, which is not only ugly but also bug-prone.


Quote:

brain wrote:
Um, bugs can indeed be documented and still be bugs. All the CPUs I use have a long list of errata on them and people do indeed consider them bugs. Documenting something does not absolve it of guilt.

This is very slippery ground. IMHO if we have a documented specification that is later changed, then we have a change of concept rather than a bug...


These are the exact words from the 6502 programming manual:
Quote:
In the JMP Indirect instruction, the second and third bytes of the instruction represent the indirect low and high bytes respectively of the memory location containing ADL.
Once ADL is fetched, the program counter is incremented with the next memory location containing ADH.


So it does not clarify what happens when the indirect low byte is $FF, but it implies that it would advance to the next page ("the program counter is incremented"). Yes, from the illustration it could be deduced that there is no carry in the "increment of IAL", but then it is clearly a "quirk" of the implementation, not a documented fact.
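To make the disputed case concrete, here is a small Python sketch of the two behaviours. It assumes the commonly described NMOS behaviour (the pointer's low byte is incremented without a carry into the high byte) and the 65C02 behaviour (a full 16-bit increment). The function names and the sample memory contents are invented for the example.

```python
def jmp_indirect_nmos(mem, ptr):
    """Target of JMP (ptr) when only the low byte of ptr is incremented."""
    lo = mem[ptr]
    hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0x00FF)   # no carry into the page
    return (mem[hi_addr] << 8) | lo

def jmp_indirect_cmos(mem, ptr):
    """Target of JMP (ptr) with a full 16-bit increment of ptr."""
    lo = mem[ptr]
    hi_addr = (ptr + 1) & 0xFFFF
    return (mem[hi_addr] << 8) | lo

# Illustrative memory: a vector placed across the $10FF/$1100 boundary.
mem = {0x10FF: 0x34, 0x1100: 0x12, 0x1000: 0x56}
print(hex(jmp_indirect_nmos(mem, 0x10FF)))
print(hex(jmp_indirect_cmos(mem, 0x10FF)))
```

With this memory image the NMOS model jumps to $5634 (high byte taken from $1000) while the CMOS model jumps to $1234. For any pointer whose low byte is not $FF the two functions agree.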

Quote:
1024MAK wrote:
one thing that is more important than features alone, especially in the 1970s and 1980s, was getting the price right (as in low) with just enough features so that it sells.

Thank you very much for your remarks and an interesting link. IMHO prices are a very difficult matter from which to draw correct conclusions. The Apple II was always a rather expensive computer. The Apple Mac was also not among the cheap computers... There is always a market for very expensive designs.


The Apple II was very cheap when it came out compared to other computers of its time. Evidently, in 1980 it was already expensive, but then other cheap computers were using the already old 6502.

IMHO, the only reason the 6502 was popular was its price - and the total system price, because assembling a computer using a 6502 did not need a lot of external circuits. This is the same as the Z80, it also was selected on price.

But this also meant that the 6502 was a dead end - there were very few opportunities for making a faster successor without completely changing the architecture, because it was tied to the RAM speed, and RAM did not increase in speed at the same pace as processors. Every computer manufacturer understood that, so they simply moved on to the 8086, the 68000 or even, in the case of Acorn, designed the ARM.

None of the 8-bit architectures of the '70s migrated to 16 bit; even Intel abandoned 8080 compatibility and simply provided "assembly compatibility" in the 8086, because the designs were all severely limited.

Have Fun!


Posted: Sat Sep 19, 2020 8:04 am
Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
dmsc wrote:
These are the exact words from the 6502 programming manual:
Quote:
In the JMP Indirect instruction, the second and third bytes of the instruction represent the indirect low and high bytes respectively of the memory location containing ADL.
Once ADL is fetched, the program counter is incremented with the next memory location containing ADH.


So it does not clarify what happens when the indirect low byte is $FF, but it implies that it would advance to the next page ("the program counter is incremented"). Yes, from the illustration it could be deduced that there is no carry in the "increment of IAL", but then it is clearly a "quirk" of the implementation, not a documented fact.

We also have a table which exactly shows how this instruction executes
Attachment: mos-6500.png

I pointed to this table from http://bytecollector.com/archive/misc/6 ... _Jan76.pdf earlier on this forum, though...

_________________
my blog about processors


Posted: Sat Sep 19, 2020 9:02 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
We already have a thread all about this JMP phenomenon - do we have to go over it all again? It's clear that there's more than one perspective, and litwr isn't budging. This is not an argument that can be won.

A new thread specifically on 6800 vs 6502, relative merits of each, is an excellent idea. I'm sure there are plenty of existing threads, and posts in this one, which can be drawn on. Again though, I would assert that there cannot be a single universally accepted answer of which is 'best' - or even which is 'fastest' - although there are useful illustrations and discussions we could share and enjoy.


Posted: Sat Sep 19, 2020 12:42 pm
Joined: Thu Mar 12, 2020 10:04 pm
Posts: 704
Location: North Tejas
litwr wrote:
Let us write a simple routine which computes a simple checksum by XOR-ing 1024 bytes.
Code:
     ldy #0
     ldx #4
     lda #hi(tab)
     sta loop+2
     tya
loop eor loop,y
     iny
     bne loop

     inc loop+2
     dex
     bne loop

     align 256
tab .byte <data> ;1000 bytes are here

I am sure that the analogous 6800 routine will be 2-4 times slower.


First of all, there is a bug in your eor statement.

Code:
 0200                     00001          org    $200
                          00002
 0200 A0 00           [2] 00003          ldy    #0
 0202 A2 04           [2] 00004          ldx    #4
 0204 A9 03           [2] 00005          lda    #>tab
 0206 8D 020C         [4] 00006          sta    loop+2
 0209 98              [2] 00007          tya
 020A 59 0300       [4/5] 00008 loop     eor    tab,y     ; BUG!  Was loop,y
 020D C8              [2] 00009          iny
 020E D0 FA (020A)  [2/3] 00010          bne    loop
                          00011
 0210 EE 020C         [6] 00012          inc    loop+2
 0213 CA              [2] 00013          dex
 0214 D0 F4 (020A)  [2/3] 00014          bne    loop
                          00015
 0300                     00016          org    $300
                          00017
 0300                     00018 tab
                          00019
                          00020          end

9267 cycles
22 bytes


Code:
 0100                     00001          org    $100
                          00002
 0100 CE 0300         [3] 00003          ldx    #Tab
 0103 4F              [2] 00004          clra
                          00005
 0104                     00006 Loop
 0104 A8 00           [5] 00007          eora   ,X
 0106 08              [4] 00008          inx
                          00009
 0107 8C 0700         [3] 00010          cpx    #Tab+1024
 010A 26 F8 (0104)    [4] 00011          bne    Loop
                          00012
 0300                     00013          org    $300
                          00014
 0300                     00015 Tab
                          00016
                          00017          end

16389 cycles
12 bytes


The 6502 version is not even twice as fast and the 6800 did not have to resort to SMC. Follow the science...
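The two totals can be re-derived from the bracketed per-instruction cycle counts in the listings. The short Python tally below is only a sketch of that arithmetic. The function names are made up for illustration, and it assumes what the listings show: a page-aligned table (so `eor abs,y` never pays the page-crossing cycle) and branches that stay within one page.

```python
# Re-derive the cycle totals from the bracketed counts in the listings above.
# Function names are illustrative; the per-instruction cycles are taken from
# the listings (6502: taken branch 3, not taken 2; 6800: bne always 4).

def cycles_6502(pages=4):
    setup = 2 + 2 + 2 + 4 + 2          # ldy, ldx, lda, sta abs, tya
    inner_taken = 4 + 2 + 3            # eor abs,y / iny / bne (taken)
    inner = 256 * inner_taken - 1      # final bne of each page falls through
    outer_taken = 6 + 2 + 3            # inc abs / dex / bne (taken)
    outer = (pages - 1) * outer_taken + (outer_taken - 1)
    return setup + pages * inner + outer

def cycles_6800(count=1024):
    setup = 3 + 2                      # ldx #Tab, clra
    per_byte = 5 + 4 + 3 + 4           # eora ,X / inx / cpx # / bne
    return setup + count * per_byte

print(cycles_6502(), cycles_6800())
print(round(cycles_6800() / cycles_6502(), 2))
```

Under these assumptions the tally reproduces 9267 and 16389 cycles, a ratio of about 1.77 in favour of the 6502.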


Posted: Sat Sep 19, 2020 1:32 pm
Joined: Thu Mar 12, 2020 10:04 pm
Posts: 704
Location: North Tejas
The 6502 has the advantage of more efficient instructions. Having three byte-sized registers often comes in extremely handy. But the two not-quite accumulators do not always make up for that deficiency. Until the C02, the accumulator cannot be easily incremented or decremented. The lack of 16-bit operations makes 16-bit code on the 6502 tedious to write, slower to run and larger.

The 6800 suffers from having only one index register and no efficient way to get data between it and the accumulators. A third byte-sized register is sometimes sorely missed.

For comparison, the following code adds three signed bytes to form a 16-bit result:

Code:
                          00031 ; W0 := S0 + S1 + S2;
                          00032
                          00033 ;   ;  0 := v W0 -> 1
                          00034 ;   ;  1 L r 2
                          00035
                          00036 ;      ;  2 L v S0 -> 3
                          00037 ;      ;  3 + v S1 -> 4
                          00038 ;      ;  4 + v S2
                          00039
                          00040
                          00041 ;  1 L r 2
                          00042 ;  2 L v S0 -> 3
                          00043 ;  3 + v S1 -> 4
 0029 18              [2] 00044          clc
 002A A0 00           [2] 00045          ldy    #0
 002C A5 15           [3] 00046          lda    S0
 002E 10 01 (0031)  [2/3] 00047          bpl    2f
 0030 88              [2] 00048          dey
 0031                     00049 2:
 0031 65 16           [3] 00050          adc    S1
 0033 AA              [2] 00051          tax
 0034 24 16           [3] 00052          bit    S1
 0036 10 01 (0039)  [2/3] 00053          bpl    2f
 0038 88              [2] 00054          dey
 0039                     00055 2:
 0039 98              [2] 00056          tya
 003A 69 00           [2] 00057          adc    #0
                          00058 ;  4 + v S2
 003C 18              [2] 00059          clc
 003D A8              [2] 00060          tay
 003E 8A              [2] 00061          txa
 003F 65 17           [3] 00062          adc    S2
 0041 AA              [2] 00063          tax
 0042 24 17           [3] 00064          bit    S2
 0044 10 01 (0047)  [2/3] 00065          bpl    2f
 0046 88              [2] 00066          dey
 0047                     00067 2:
 0047 98              [2] 00068          tya
 0048 69 00           [2] 00069          adc    #0
                          00070 ;  0 := v W0 -> 1
 004A 86 0D           [3] 00071          stx    W0
 004C 85 0E           [3] 00072          sta    W0+1


Code:
                          00031 * W0 := S0 + S1 + S2;
                          00032
                          00033 *   *  0 := v W0 -> 1
                          00034 *   *  1 L r 2
                          00035
                          00036 *      *  2 L v S0 -> 3
                          00037 *      *  3 + v S1 -> 4
                          00038 *      *  4 + v S2
                          00039
                          00040
                          00041 *  1 L r 2
                          00042 *  2 L v S0 -> 3
                          00043 *  3 + v S1 -> 4
 0029 D6 15           [3] 00044          ldab   S0
 002B 86 7F           [2] 00045          ldaa   #$7F      ; Thanks Mike B!
 002D 11              [2] 00046          cba
 002E 82 7F           [2] 00047          sbca   #$7F
 0030 7D 0016         [6] 00048          tst    S1
 0033 2A 01 (0036)    [4] 00049          bpl    2f
 0035 4A              [2] 00050          deca
 0036                     00051 2:
 0036 DB 16           [3] 00052          addb   S1
 0038 89 00           [2] 00053          adca   #0
                          00054 *  4 + v S2
 003A 7D 0017         [6] 00055          tst    S2
 003D 2A 01 (0040)    [4] 00056          bpl    2f
 003F 4A              [2] 00057          deca
 0040                     00058 2:
 0040 DB 17           [3] 00059          addb   S2
 0042 89 00           [2] 00060          adca   #0
                          00061 *  0 := v W0 -> 1
 0044 97 0D           [4] 00062          staa   W0
 0046 D7 0E           [4] 00063          stab   W0+1
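
To make the sign-extension trick easier to follow, here is a small Python model of the 6800 sequence above. It is only a sketch: the function names are mine, the flags are reduced to the single carry bit the sequence relies on, and it assumes S0, S1 and S2 are signed bytes being summed into a 16-bit word.

```python
def sign_high_6800(b):
    """Model LDAA #$7F / CBA / SBCA #$7F: high byte of B sign-extended."""
    a = 0x7F
    carry = 1 if b > a else 0          # CBA sets carry (borrow) when B > A, unsigned
    return (a - 0x7F - carry) & 0xFF   # SBCA #$7F leaves $00 or $FF


def add_signed_byte(a, b, s):
    """Model TST s / BPL / DECA / ADDB s / ADCA #0 on the pair A:B."""
    if s & 0x80:                       # operand is negative: its high byte is $FF
        a = (a - 1) & 0xFF             # adding $FF is the same as subtracting 1
    total = b + s                      # ADDB
    b = total & 0xFF
    a = (a + (total >> 8)) & 0xFF      # ADCA #0 folds the carry into A
    return a, b


def sum3(s0, s1, s2):
    """W0 := S0 + S1 + S2 for signed bytes, returned as a 16-bit value."""
    b = s0 & 0xFF
    a = sign_high_6800(b)
    a, b = add_signed_byte(a, b, s1 & 0xFF)
    a, b = add_signed_byte(a, b, s2 & 0xFF)
    return (a << 8) | b


# Example: -128 + -128 + -128 = -384, which is $FE80 as a 16-bit word.
print(hex(sum3(-128, -128, -128)))
```

The 6502 version follows the same idea, with Y holding the high byte and DEY standing in for DECA.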


PostPosted: Sat Sep 19, 2020 1:33 pm 
Offline

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 704
Location: North Tejas
For perhaps a more representative benchmark, here is the skeleton of a FOR-NEXT loop in compiled BASIC on both processors.

This is pseudocode for the loop:

Code:
    remember current line for error reporting
    obtain for loop state record
    initialize in state record
    load initial value into loop variable
    goto loop body

loop top:
    step loop variable
    compare with end value
    if end condition satisfied:
        drop state record
        go to loop exit
    else:
        store new loop variable value
        go to loop body

loop body:

    <do arbitrary stuff>

    verify loop variable
    goto loop top

loop exit:
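
The control flow in the pseudocode can also be traced in a few lines of Python. This is only an illustrative sketch: the function name and the counters are mine, it assumes a positive step, and it leaves out the state record and the error reporting.

```python
def run_for(start, end, step=1):
    """Trace the FOR-NEXT skeleton and count how often each part runs."""
    counts = {"body": 0, "top": 0, "store": 0}
    i = start                      # load initial value into loop variable
    while True:
        counts["body"] += 1        # loop body: <do arbitrary stuff>
        counts["top"] += 1         # loop top: step loop variable
        candidate = i + step
        if candidate > end:        # end condition satisfied
            break                  # drop state record, go to loop exit
        i = candidate              # store new loop variable value
        counts["store"] += 1
    return i, counts


final_value, counts = run_for(1, 10)
print(final_value, counts)
```

Note that the body always runs once before the first test, and the stepped value is not stored when the loop exits.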


For the 6502:

Code:
                          00293 ; 100 for I = 1 to 10 : next I
 0B0C                     00294 L00100:
                          00295          ifdef  __TRACE
                          00296          ldx    #<100
                          00297          lda    #>100
                          00298          jsr    Trace
                          00299          endif
                          00300          ifdef  __ATLIN
 0B0C A2 0C           [2] 00301          ldx    #<L00100
 0B0E 8E 0F48         [4] 00302          stx    ResLn_
 0B11 A9 0B           [2] 00303          lda    #>L00100
 0B13 8D 0F49         [4] 00304          sta    ResLn_+1
                          00305          endif
                          00306
 0B16 20 0C9D         [6] 00307          jsr    ForEnter
                          00308
 0B19 A0 04           [2] 00309          ldy    #4
 0B1B A9 35           [2] 00310          lda    #<T00000
 0B1D 91 1E           [6] 00311          sta    (Ptr0),Y
 0B1F C8              [2] 00312          iny
 0B20 A9 0B           [2] 00313          lda    #>T00000
 0B22 91 1E           [6] 00314          sta    (Ptr0),Y
                          00315
 0B24 C8              [2] 00316          iny
 0B25 A9 52           [2] 00317          lda    #<I_
 0B27 91 1E           [6] 00318          sta    (Ptr0),Y
 0B29 C8              [2] 00319          iny
 0B2A A9 0F           [2] 00320          lda    #>I_
 0B2C 91 1E           [6] 00321          sta    (Ptr0),Y
                          00322
 0B2E A2 01           [2] 00323          ldx    #<1
 0B30 A0 00           [2] 00324          ldy    #>1
                          00325
 0B32 4C 0B53         [3] 00326          jmp    T00003
                          00327
 0B35                     00328 T00000:
 0B35 A9 01           [2] 00329          lda    #<1
 0B37 18              [2] 00330          clc
 0B38 6D 0F52         [4] 00331          adc    I_
 0B3B AA              [2] 00332          tax
 0B3C A9 00           [2] 00333          lda    #>1
 0B3E 6D 0F53         [4] 00334          adc    I_+1
 0B41 70 0B (0B4E)  [2/3] 00335          bvs    T00001
 0B43 A8              [2] 00336          tay
                          00337
 0B44 E0 0B           [2] 00338          cpx    #<11      ; Thanks Mike B!
 0B46 E9 00           [2] 00339          sbc    #>11
 0B48 50 02 (0B4C)  [2/3] 00340          bvc    2f
 0B4A 49 80           [2] 00341          eor    #$80
                          00342
 0B4C                     00343 2:
 0B4C 30 03 (0B51)  [2/3] 00344          bmi    T00002
                          00345
 0B4E                     00346 T00001:
 0B4E 4C 0CF0         [3] 00347          jmp    ForExit
                          00348
 0B51                     00349 T00002:
 0B51 68              [4] 00350          pla
 0B52 68              [4] 00351          pla
                          00352
 0B53                     00353 T00003:
 0B53 8E 0F52         [4] 00354          stx    I_
 0B56 8C 0F53         [4] 00355          sty    I_+1
                          00356
 0B59 A2 52           [2] 00357          ldx    #<I_
 0B5B A9 0F           [2] 00358          lda    #>I_
 0B5D 20 0C5D         [6] 00359          jsr    ForNext


Library support code:

Code:
.                         00637 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.                         00638 ;
.                         00639 ; For stack entry
.                         00640 ;
.                         00641 ; 0..1 - Next address
.                         00642 ; 2..3 - Prev address
.                         00643 ; 4..5 - Top of loop address
.                         00644 ; 6..7 - Variable address
.                         00645 ;
.0008                     00646 FORSIZE  equ    8
.                         00647
.                         00648 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.                         00649 ;
.                         00650 ; ForNext - Process NEXT statement
.                         00651 ;
.                         00652 ; Input:
.                         00653 ;       A:X = variable address
.                         00654 ;
.                         00655 ; Uses:
.                         00656 ;       Ptr0
.                         00657 ;
.0C5D                     00658 ForNext:
.0C5D A4 24           [3] 00659          ldy    ForTop    ; Point to current context
.0C5F 84 1E           [3] 00660          sty    Ptr0
.0C61 A4 25           [3] 00661          ldy    ForTop+1
.0C63 84 1F           [3] 00662          sty    Ptr0+1
.0C65 D0 04 (0C6B)  [2/3] 00663          bne    ForN0
.0C67 A4 1E           [3] 00664          ldy    Ptr0
.0C69 F0 2D (0C98)  [2/3] 00665          beq    ForN2     ; Not currently within a FOR loop
.                         00666
.0C6B                     00667 ForN0:
.0C6B A0 07           [2] 00668          ldy    #7
.0C6D D1 1E         [5/6] 00669          cmp    (Ptr0),Y  ; Compare variable address
.0C6F D0 15 (0C86)  [2/3] 00670          bne    ForN1     ; Loop variable mismatch
.0C71 88              [2] 00671          dey
.0C72 8A              [2] 00672          txa
.0C73 D1 1E         [5/6] 00673          cmp    (Ptr0),Y
.0C75 D0 0F (0C86)  [2/3] 00674          bne    ForN1
.                         00675
.0C77 A0 04           [2] 00676          ldy    #4
.0C79 B1 1E         [5/6] 00677          lda    (Ptr0),Y  ; Jump to top of loop
.0C7B AA              [2] 00678          tax
.0C7C C8              [2] 00679          iny
.0C7D B1 1E         [5/6] 00680          lda    (Ptr0),Y
.0C7F 86 1E           [3] 00681          stx    Ptr0
.0C81 85 1F           [3] 00682          sta    Ptr0+1
.0C83 6C 001E         [5] 00683          jmp    (Ptr0)
.                         00684
.0C86                     00685 ForN1
.0C86 A0 02           [2] 00686          ldy    #2
.0C88 B1 1E         [5/6] 00687          lda    (Ptr0),Y  ; Get previous context
.0C8A 85 24           [3] 00688          sta    ForTop    ; Save it
.0C8C AA              [2] 00689          tax
.0C8D C8              [2] 00690          iny
.0C8E B1 1E         [5/6] 00691          lda    (Ptr0),Y
.0C90 85 25           [3] 00692          sta    ForTop+1
.0C92 D0 D7 (0C6B)  [2/3] 00693          bne    ForN0     ; Retry if valid
.0C94 A4 24           [3] 00694          ldy    ForTop
.0C96 D0 D3 (0C6B)  [2/3] 00695          bne    ForN0     ; Retry if valid
.                         00696
.0C98                     00697 ForN2
.0C98 A9 3E           [2] 00698          lda    #62       ; Report FOR-NEXT nesting error
.0C9A 4C 0D05         [3] 00699          jmp    ErrH
.                         00700
.                         00701 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.                         00702 ;
.                         00703 ; ForEnter - Create a new FOR loop context
.                         00704 ;
.                         00705 ; Input:
.                         00706 ;       ForTop = address of current context (0 if not in FOR loop)
.                         00707 ;       ForStack = bottom of context stack
.                         00708 ;
.                         00709 ; Output:
.                         00710 ;       Ptr0 = the address of the FOR stack entry
.                         00711 ;
.                         00712 ; Uses
.                         00713 ;       Ptr1
.                         00714 ;
.0C9D                     00715 ForEnter:
.0C9D A5 24           [3] 00716          lda    ForTop
.0C9F 05 25           [3] 00717          ora    ForTop+1
.0CA1 F0 18 (0CBB)  [2/3] 00718          beq    ForE0     ; Not currently within a FOR loop
.                         00719
.0CA3 A6 24           [3] 00720          ldx    ForTop    ; Point to most recent entry
.0CA5 86 20           [3] 00721          stx    Ptr1
.0CA7 A5 25           [3] 00722          lda    ForTop+1
.0CA9 85 21           [3] 00723          sta    Ptr1+1
.                         00724
.0CAB A0 00           [2] 00725          ldy    #0
.0CAD B1 20         [5/6] 00726          lda    (Ptr1),Y  ; Get next address
.0CAF AA              [2] 00727          tax
.0CB0 C8              [2] 00728          iny
.0CB1 B1 20         [5/6] 00729          lda    (Ptr1),Y
.0CB3 D0 0A (0CBF)  [2/3] 00730          bne    ForE1
.0CB5 E0 00           [2] 00731          cpx    #0
.0CB7 F0 0F (0CC8)  [2/3] 00732          beq    ForE2     ; No next, go allocate it
.0CB9 D0 04 (0CBF)  [2/3] 00733          bne    ForE1
.                         00734
.0CBB                     00735 ForE0:
.0CBB A2 54           [2] 00736          ldx    #<ForStack  ; Start with the bottom of the stack
.0CBD A9 0F           [2] 00737          lda    #>ForStack
.                         00738
.0CBF                     00739 ForE1:
.0CBF 86 24           [3] 00740          stx    ForTop    ; It is now the current context
.0CC1 85 25           [3] 00741          sta    ForTop+1
.0CC3 86 1E           [3] 00742          stx    Ptr0
.0CC5 85 1F           [3] 00743          sta    Ptr0+1
.                         00744
.0CC7 60              [6] 00745          rts
.                         00746
.0CC8                     00747 ForE2:
.0CC8 A2 08           [2] 00748          ldx    #FORSIZE  ; Allocate a new entry
.0CCA A9 00           [2] 00749          lda    #0
.0CCC 20 0E5B         [6] 00750          jsr    Alloc
.                         00751
.0CCF A9 00           [2] 00752          lda    #0
.0CD1 A8              [2] 00753          tay
.0CD2 91 1E           [6] 00754          sta    (Ptr0),Y  ; Set next to nil
.0CD4 C8              [2] 00755          iny
.0CD5 91 1E           [6] 00756          sta    (Ptr0),Y
.                         00757
.0CD7 A5 24           [3] 00758          lda    ForTop    ; Set prev pointer
.0CD9 C8              [2] 00759          iny
.0CDA 91 1E           [6] 00760          sta    (Ptr0),Y
.0CDC A5 25           [3] 00761          lda    ForTop+1
.0CDE C8              [2] 00762          iny
.0CDF 91 1E           [6] 00763          sta    (Ptr0),Y
.                         00764
.0CE1 A5 1E           [3] 00765          lda    Ptr0      ; Store address of allocation
.0CE3 A0 00           [2] 00766          ldy    #0
.0CE5 91 20           [6] 00767          sta    (Ptr1),Y
.0CE7 AA              [2] 00768          tax
.0CE8 A5 1F           [3] 00769          lda    Ptr0+1
.0CEA C8              [2] 00770          iny
.0CEB 91 20           [6] 00771          sta    (Ptr1),Y
.                         00772
.0CED 4C 0CBF         [3] 00773          jmp    ForE1     ; Return it
.                         00774
.                         00775 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
.                         00776 ;
.                         00777 ; ForExit - Drop a FOR loop context
.                         00778 ;
.                         00779 ; Input:
.                         00780 ;       ForTop = current context
.                         00781 ;
.                         00782 ; Output:
.                         00783 ;       ForTop = the previous context
.                         00784 ;       X = the address of the context
.                         00785 ;
.0CF0                     00786 ForExit:
.0CF0 A6 24           [3] 00787          ldx    ForTop    ; Point to most recent entry
.0CF2 86 1E           [3] 00788          stx    Ptr0
.0CF4 A6 25           [3] 00789          ldx    ForTop+1
.0CF6 86 1F           [3] 00790          stx    Ptr0+1
.                         00791
.0CF8 A0 02           [2] 00792          ldy    #2
.0CFA B1 1E         [5/6] 00793          lda    (Ptr0),Y  ; Point to previous entry
.0CFC AA              [2] 00794          tax
.0CFD C8              [2] 00795          iny
.0CFE B1 1E         [5/6] 00796          lda    (Ptr0),Y
.                         00797
.0D00 86 24           [3] 00798          stx    ForTop    ; It is now the top
.0D02 85 25           [3] 00799          sta    ForTop+1
.                         00800
.0D04 60              [6] 00801          rts


For the 6800:

Code:
                          00117 * 100 for I = 1 to 10 : next I
 0109                     00118 L00100
                          00119          ifdef  __TRACE
                          00120          ldx    #100
                          00121          jsr    Trace
                          00122          endif
                          00123          ifdef  __ATLIN
 0109 CE 0109         [3] 00124          ldx    #L00100
 010C FF 0426         [6] 00125          stx    ResLn_
                          00126          endif
                          00127
 010F BD 0231         [9] 00128          jsr    ForEnter
                          00129
 0112 86 01           [2] 00130          ldaa   #T00000>>8
 0114 A7 04           [6] 00131          staa   4,X
 0116 86 28           [2] 00132          ldaa   #T00000&$FF
 0118 A7 05           [6] 00133          staa   5,X
                          00134
 011A 86 04           [2] 00135          ldaa   #I_>>8
 011C A7 06           [6] 00136          staa   6,X
 011E 86 30           [2] 00137          ldaa   #I_&$FF
 0120 A7 07           [6] 00138          staa   7,X
                          00139
 0122 86 00           [2] 00140          ldaa   #1>>8
 0124 C6 01           [2] 00141          ldab   #1&$FF
                          00142
 0126 20 19 (0141)    [4] 00143          bra    T00003
                          00144
 0128                     00145 T00000
 0128 86 00           [2] 00146          ldaa   #1>>8
 012A C6 01           [2] 00147          ldab   #1&$FF
                          00148
 012C FB 0431         [4] 00149          addb   I_+1
 012F B9 0430         [4] 00150          adca   I_
                          00151
 0132 81 00           [2] 00152          cmpa   #10>>8
 0134 22 06 (013C)    [4] 00153          bhi    T00001
 0136 25 07 (013F)    [4] 00154          blo    T00002
 0138 C1 0A           [2] 00155          cmpb   #10&$FF
 013A 23 03 (013F)    [4] 00156          bls    T00002
                          00157
 013C                     00158 T00001
 013C 7E 0269         [3] 00159          jmp    ForExit
                          00160
 013F                     00161 T00002
 013F 31              [4] 00162          ins
 0140 31              [4] 00163          ins
                          00164
 0141                     00165 T00003
 0141 B7 0430         [5] 00166          staa   I_
 0144 F7 0431         [5] 00167          stab   I_+1
                          00168
                          00169
 0147 86 04           [2] 00170          ldaa   #I_>>8
 0149 C6 30           [2] 00171          ldab   #I_&$FF
 014B BD 0216         [9] 00172          jsr    ForNext


Library support code:

Code:
.                         00384 ******************************************************************************
.                         00385 *
.                         00386 * For stack entry
.                         00387 *
.                         00388 * 0..1 - Next address
.                         00389 * 2..3 - Prev address
.                         00390 * 4..5 - Top of loop address
.                         00391 * 6..7 - Variable address
.                         00392 *
.0008                     00393 FORSIZE  equ    8
.                         00394
.                         00395 ******************************************************************************
.                         00396 *
.                         00397 * ForNext - Process NEXT statement
.                         00398 *
.                         00399 * Input:
.                         00400 *       A:B = variable address
.                         00401 *
.0216                     00402 ForNext
.0216 DE 12           [4] 00403          ldx    ForTop    ; Point to current context
.0218 27 12 (022C)    [4] 00404          beq    ForN2     ; Not currently within a FOR loop
.                         00405
.021A                     00406 ForN0
.021A A1 06           [5] 00407          cmpa   6,X       ; Compare variable address
.021C 26 08 (0226)    [4] 00408          bne    ForN1     ; Loop variable mismatch
.021E E1 07           [5] 00409          cmpb   7,X
.0220 26 04 (0226)    [4] 00410          bne    ForN1
.                         00411
.0222 EE 04           [6] 00412          ldx    4,X       ; Jump to top of loop
.0224 6E 00           [4] 00413          jmp    ,X
.                         00414
.0226                     00415 ForN1
.0226 EE 02           [6] 00416          ldx    2,X       ; Get previous context
.0228 DF 12           [5] 00417          stx    ForTop    ; Save it
.022A 26 EE (021A)    [4] 00418          bne    ForN0     ; Retry if valid
.                         00419
.022C                     00420 ForN2
.022C 86 3E           [2] 00421          ldaa   #62       ; Report FOR-NEXT nesting error
.022E 7E 0270         [3] 00422          jmp    ErrH
.                         00423
.                         00424 ******************************************************************************
.                         00425 *
.                         00426 * ForEnter - Create a new FOR loop context
.                         00427 *
.                         00428 * Input:
.                         00429 *       ForTop = address of current context (0 if not in FOR loop)
.                         00430 *       ForStack = bottom of context stack
.                         00431 *
.                         00432 * Output:
.                         00433 *       X = the address of the FOR stack entry
.                         00434 *
.0231                     00435 ForEnter
.0231 DE 12           [4] 00436          ldx    ForTop    ; Point to most recent entry
.0233 27 0D (0242)    [4] 00437          beq    ForE1     ; Not currently within a FOR loop
.                         00438
.0235 A6 00           [5] 00439          ldaa   ,X        ; Get next address
.0237 E6 01           [5] 00440          ldab   1,X
.0239 26 03 (023E)    [4] 00441          bne    ForE0
.023B 4D              [2] 00442          tsta
.023C 27 0A (0248)    [4] 00443          beq    ForE3     ; No next, go allocate it
.                         00444
.023E                     00445 ForE0
.023E EE 00           [6] 00446          ldx    ,X        ; Use the next one
.                         00447
.0240 20 03 (0245)    [4] 00448          bra    ForE2
.                         00449
.0242                     00450 ForE1
.0242 CE 0432         [3] 00451          ldx    #ForStack ; Start with the bottom of the stack
.                         00452
.0245                     00453 ForE2
.0245 DF 12           [5] 00454          stx    ForTop    ; It is now the current context
.                         00455
.0247 39              [5] 00456          rts
.                         00457
.0248                     00458 ForE3
.0248 CE 0008         [3] 00459          ldx    #FORSIZE  ; Allocate a new entry
.024B BD 0385         [9] 00460          jsr    Alloc
.                         00461
.024E 4F              [2] 00462          clra
.024F A7 00           [6] 00463          staa   ,X        ; Set next to nil
.0251 A7 01           [6] 00464          staa   1,X
.                         00465
.0253 96 12           [3] 00466          ldaa   ForTop    ; Set prev pointer
.0255 A7 02           [6] 00467          staa   2,X
.0257 96 13           [3] 00468          ldaa   ForTop+1
.0259 A7 03           [6] 00469          staa   3,X
.                         00470
.025B DE 12           [4] 00471          ldx    ForTop    ; Store address of allocation
.025D 96 0C           [3] 00472          ldaa   Ptr0
.025F A7 00           [6] 00473          staa   ,X
.0261 96 0D           [3] 00474          ldaa   Ptr0+1
.0263 A7 01           [6] 00475          staa   1,X
.                         00476
.0265 DE 0C           [4] 00477          ldx    Ptr0      ; Point to new entry
.                         00478
.0267 20 DC (0245)    [4] 00479          bra    ForE2     ; Return it
.                         00480
.                         00481 ******************************************************************************
.                         00482 *
.                         00483 * ForExit - Drop a FOR loop context
.                         00484 *
.                         00485 * Input:
.                         00486 *       ForTop = current context
.                         00487 *
.                         00488 * Output:
.                         00489 *       ForTop = the previous context
.                         00490 *       X = the address of the context
.                         00491 *
.0269                     00492 ForExit
.0269 DE 12           [4] 00493          ldx    ForTop    ; Point to most recent entry
.                         00494
.026B EE 02           [6] 00495          ldx    2,X       ; Point to previous entry
.                         00496
.026D DF 12           [5] 00497          stx    ForTop    ; It is now the top
.                         00498
.026F 39              [5] 00499          rts
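
For readers who do not want to trace both listings, here is a rough Python model of the FOR stack that the library routines maintain. It is a sketch under assumptions: the class and method names are hypothetical, Python object references stand in for the 16-bit addresses, None stands in for address 0, and Alloc is modelled by simply creating a new object.

```python
class ForEntry:
    """One FOR stack entry; fields follow the 8-byte layout in the listings."""
    def __init__(self):
        self.next = None   # 0..1 - Next address
        self.prev = None   # 2..3 - Prev address
        self.top = None    # 4..5 - Top of loop address
        self.var = None    # 6..7 - Variable address


class ForStack:
    def __init__(self):
        self.bottom = ForEntry()   # ForStack: the statically allocated bottom entry
        self.for_top = None        # ForTop: None means "not within a FOR loop"

    def for_enter(self):
        """ForEnter: make the next entry current, allocating it if needed."""
        if self.for_top is None:
            entry = self.bottom
        elif self.for_top.next is not None:
            entry = self.for_top.next          # reuse an entry allocated earlier
        else:
            entry = ForEntry()                 # stands in for Alloc
            entry.prev = self.for_top
            self.for_top.next = entry
        self.for_top = entry
        return entry                           # caller fills in top and var

    def for_next(self, var):
        """ForNext: find the context for var, dropping mismatched inner ones."""
        while self.for_top is not None:
            if self.for_top.var == var:
                return self.for_top.top        # the listings jump to this address
            self.for_top = self.for_top.prev
        raise RuntimeError("FOR-NEXT nesting error (62)")

    def for_exit(self):
        """ForExit: drop the current context."""
        self.for_top = self.for_top.prev
```

Entries are never freed in this model, which matches the way the listings keep the next pointer around so a later FOR can reuse the allocation.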


The scoreboard for the 6502:

Code:
   12   remember current line for error reporting
   37   obtain for loop state record
   40   initialize in state record
   15   load initial value into loop variable
        goto loop body

loop top:
10x16   step loop variable
 9x22   compare with end value
 1x13
        if end condition satisfied:
   43        drop state record
             go to loop exit
        else:
  9x8        store new loop variable value
             go to loop body

loop body:

        <do arbitrary stuff>

10x72   verify loop variable
        goto loop top

loop exit:

Total = 1310 cycles

252 bytes


The scoreboard for the 6800:

Code:
    9   remember current line for error reporting
   30   obtain for loop state record
   32   initialize in state record
   18   load initial value into loop variable
        goto loop body

loop top:
10x12   step loop variable
 9x24   compare with end value
 1x16
        if end condition satisfied:
   23       drop state record
            go to loop exit
        else:
 9x10       store new loop variable value
            go to loop body

loop body:

        <do arbitrary stuff>

10x49   verify loop variable
        goto loop top

loop exit:

Total = 1044 cycles

159 bytes


When I began this experiment, I had no idea which processor would do better. The only thing obvious was that the semantics of loops in BASIC make them less efficient than those in other languages. I have more experience programming the 6800 than the 6502, so the 6502 version of the code may not be as tight. Remember that I have a strong incentive to make both compilers generate the best code possible.
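
For anyone who wants to check the arithmetic, the two scoreboards can be totalled with a few lines of Python. The (count, cycles) pairs are copied from the scoreboards in order; the variable and function names are made up for this sketch.

```python
# (times executed, cycles each), in the order the rows appear in the scoreboards
SCORE_6502 = [(1, 12), (1, 37), (1, 40), (1, 15),
              (10, 16), (9, 22), (1, 13), (1, 43), (9, 8), (10, 72)]
SCORE_6800 = [(1, 9), (1, 30), (1, 32), (1, 18),
              (10, 12), (9, 24), (1, 16), (1, 23), (9, 10), (10, 49)]


def total_cycles(rows):
    """Sum count * cycles over all scoreboard rows."""
    return sum(count * cycles for count, cycles in rows)


t6502 = total_cycles(SCORE_6502)
t6800 = total_cycles(SCORE_6800)
print(t6502, t6800, round(t6502 / t6800, 2))
```

These are cycle counts only; how they translate into time depends on the clock rate each processor runs at.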


PostPosted: Sat Sep 19, 2020 8:48 pm 
Offline

Joined: Tue May 05, 2009 2:49 pm
Posts: 113
litwr wrote:
I used words "rather mediocre" which give me some room for maneuvers. :) And BTW I have checked a dictionary and found out the next definition
Quote:
The roots of the adjective mediocre are from the Latin medial, "middle," and ocris, "mountain." If you think about it, the middle of a mountain is neither up nor down and neither here nor there — just somewhere in between. The definition of mediocre is "of ordinary quality," "merely adequate," and "average.

It is nothing offensive in this word, it means just satisfactory, almost quite well, but not outstanding or bad.

In your original post, you asked for comments on your use of English. As a native English speaker, I can assure you that an English reader does not take "mediocre" to mean "satisfactory". Note the synonyms for "mediocre" from Thesaurus.com:
My point is that the word mediocre (which I agree objectively states that something is "average" or "median" in operation) carries a connotation: it is used when the author/speaker wishes to convey that the product should have been more distinguished or capable, but failed to be. Thus, the English reader will see:

"The 6800 is a mediocre processor"

as

"The 6800 had the potential to be a better or more capable processor, but failed to live up to that expectation and is just average."
Quote:
Sorry I haven't understood your point. Would you like to clarify it? You can check my material about the TMS9900, I have never used word "mediocre" there - https://litwr.livejournal.com/1575.html - however I mentioned several weak points of this processor.

You didn't quote enough of my response. I said:
Quote:
But, calling something "mediocre" is just asking for the fight. Case in point, the TMS9900 16 bit processor in the TI 99/4A is hardly a "mediocre" processor. But, the performance of the TI 99/4A suffered greatly due to system design constraints that strangled the poor processor's capabilities.

I'm not comparing the TMS9900. I'm suggesting that the use of the word will put the reader on the defensive, as most TMS9900 enthusiasts are when people claim that processor is "mediocre", when in fact the poor showing is due more to the constraints imposed by the TI 99/4A design (and that's a statement TI 99/4A folks will argue). In short, my point is that "mediocre" is a word best used if you want to dismiss or belittle something and you can't say the item has no value or that it is less than average.
Quote:
I can't agree. IMHO the 6502 is much better with processing of any tabular data. Let us make a simple routine which makes a simple checksum by XOR-ing of 1024 bytes.
Code:
     ldy #0       ;2
     ldx #4       ;2
     lda #hi(tab) ;2
     sta loop+2   ;3
     tya          ;2
loop eor tab,y    ;4 I think this takes an extra cycle on each page crossing, so should add 1+1+1 (3) more, but ignore
     iny          ;2
     bne loop     ;2

     inc loop+2   ;5
     dex          ;2
     bne loop     ;2

     align 256
tab .byte <data> ;1024 bytes are here

I am sure that the analogous 6800 routine will be 2-4 times slower.

11 cycles to setup, though the tya can be skipped if you do the lda #0 in place of it, which makes it 9 cycles. 8 * 1024 + 9 * 4 = 8228
Code:
ldaa #0       ;2
ldx #1024     ;3
loop:
eora tab-1,X  ;5 (the indexed offset is 8 bits, so this assumes tab-1 < $100)
dex           ;4
bne loop      ;4 (tests the Z flag that dex sets)

5 cycles to set up, and 13*1024 (13312) cycles to run the loop. (I've literally never written any 6800 asm, so I suspect there's a faster way to do it.) Even so, the 6502's advantage here is about 1.6X.
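
As a cross-check, here is a Python sketch of the checksum itself and of the loop arithmetic. The per-byte and per-page cycle figures are the estimates from this post (8 per byte plus 9 per page on the 6502, 13 per byte on the 6800), not measurements, and setup cycles are left out. The function names are mine.

```python
from functools import reduce
from operator import xor


def xor_checksum(data):
    """XOR all bytes together, as both assembly loops do."""
    return reduce(xor, data, 0)


def loop_cycles(per_byte, n_bytes, per_page=0, pages=0):
    """Estimated cycles for the loop, ignoring setup."""
    return per_byte * n_bytes + per_page * pages


c6502 = loop_cycles(8, 1024, per_page=9, pages=4)   # inner loop plus page stepping
c6800 = loop_cycles(13, 1024)                        # single 16-bit indexed loop
print(c6502, c6800, round(c6800 / c6502, 1))
```

Changing the per-byte figures shows how sensitive the ratio is to the exact instruction timings assumed.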
Quote:
brain wrote:
Saying Bill did not improve the NMOS 6502 seems highly editorial to me. He improved it by moving it to CMOS.

This is exactly what I wrote about...

We shall agree to disagree then. Everyone knows he designed the much improved successor to the IC, so I think it is wrong to say the designer did not improve the chip.
Quote:
Please could you explain why is it possible to think that I wrote about the CPU's fault? I wrote about an unfriendly environment around the 6502 systems at 2 MHz in the USA until the second half of the 80s.

The original article states:
Quote:
Only in 1985, when the era of 8-bit technology began to go away, did the Commodore 128 appear which could use in one of its modes the 6502 at 2 MHz clock. But even here it turned out to be more of a joke since this mode was practically not supported and there are practically no programs for it.

The statement implies that the 2MHz mode was a joke because no one targeted the "mode", and the sentence implies the "mode" is "the 2MHz mode". (You also belittle it a bit by noting it could only be used in one of the modes, which is not completely true: you could run the 64 mode at 2MHz without the VIC-II running, which some 64 apps used to speed up code running during VBLANK on the C128 in 64 mode.) But that's misleading. No one targeted the C128 mode because the unit would run perfectly fine in 64 mode, and targeting that mode gave the software developer/seller a 20 million customer base as opposed to a potential 4.5 million. It had nothing to do with the speed of the CPU but, again, with the issues of the computer the 2MHz operation was contained within.
Quote:
We discussed one my example earlier on this forum but let me repeat. I uses a 256-byte table with 128 jump vectors. Those vectors must have odd addresses. On the NMOS 6502 I pack such a table in a single page but on the CMOS 6502 I get a one byte ugly displacement.

Your statement that the 65C02 requires "more cumbersome" code seems too general for the document, given this specific use case. The problem only exists because you have 128 2-byte vectors, which is a pretty extreme example, and it's a 1-byte offset. Hardly horrid compared to the atrocities others accuse the 6502 crowd of concerning self-modifying code (the 6809 folks are particularly passionate that such code is hugely cumbersome and horrible, but we tend to ignore them :-)
Quote:
Sorry I meant Asteroid game for an Atari computer. Bill told about it - https://www.youtube.com/watch?v=7YoolSA ... be&t=28230
It is strange I could not google this information and I am sure that it was written somewhere on the net. Internet really has viruses which are eating valuable information. :(

I'll have to look that up.
Quote:
It is very interesting but it proves my point that Bill couldn't replace a company. He is a genuine engineer but he needed also a marketing specialists who could provide sells and get more customers. Some assistance from other engineers could help too.

I think you're being too harsh on Bill in the article, but it's your prerogative to do so, so I have no further comment.
Quote:
Maybe word bias is not right for this case? I really don't have a special personal opinion about the 6502 or other processors. In a sense I like them all but I also like to examine them carefully and find their drawbacks. Maybe the main problem is my English because in Russian my material has about 7 times greater popularity.

Well, that's why I continue to respond to the thread. I gather you're interested in mastering the English language, and what better way than to write about technical subjects you already know about?
Quote:
I still not understand your point about my phrase "the 6502 was only microscopically improved and made artificially partially incompatible with itself" in relation to the 4510. This phrase relates to the 65C02 only. Information about the 4510 follows much later.

My point was that the statement doesn't have a time frame attached to it. You can read it as having an implicit "By the mid 1980's" prepended to the paragraph in which the "microscopically improved" sentence exists, but I read it as being more a comment for all time. In other words, I read it as:
Quote:
the 6502 design was never materially improved and the minimal improvement made in the CPU line created partially incompatible devices.

Quote:
It is a very slippy ground. IMHO if we have a documented specification and its later changes then we have rather concept changing than a bug...

I'm not disagreeing per se. I'm just noting that the paragraph appears to say that "documented" means "not a bug", and that's not true, as much as we may disagree. I think everyone agrees it was a bug, or at least an unplanned use case, so it was documented rather than fixed (they did fix the ROR bug).
Quote:
My phrase states "It has turned out that Bill worked on the 6502, with only specifications received, and he never tried to improve this processor himself". Maybe it is something wrong with my English but I want to say exactly the same things you have claimed that Bill Mensch just followed the market, he made things only in response to the market demands. Maybe I should use phrase "he never took the initiative to improve this processor"?
IMHO Faggin created rather a completely new processor which has compatibility with the 8080. He designed its ISA, implemented a technological process to make the Z80, he also participated marketing of the Z80. Bill Mensch got the 6502's ISA ready from Chuck. He had a great role in the NMOS 6502 implementation and he was the only designer of the CMOS 6502. He made the great work in field of electronics but the ISA design and marketing were not his fields.
Bill said good words that the NMOS 6502 changed the world but the CMOS 6502 were produced in larger volumes. IMHO the existence of this forum and the presence of us here is due to the NMOS 6502, the CMOS 6502 just exploited the success of its very successful variant.

I don't agree that Faggin's 8080 -> Z80 ISA work differs from Mensch's 6502 -> 65C02 ISA work. I don't think Faggin designed the Z80 ISA from scratch, but rather took the ready-made ISA from the 8080 and extended it, just like Bill extended the 6502 ISA. (The fact that Faggin also designed the 8080 ISA is not relevant here.)
As I said before, I think he improved the processor by moving it to CMOS and cleaning up the illegal opcodes. Our disagreement may stem from my feeling that faster speed and movement to CMOS gave the 6502 line a longevity that it would never have achieved in the NMOS variant, thus I value that significantly, and I think you do not. You appear to give more weight to additional opcodes and functions on-chip, like multiply and divide instructions and such. I don't think either is wrong per se, but if Bill had added a ton of new opcodes but had not moved it to CMOS, it would never have hit the speeds we now see, and it would never have been popular in the embedded designs (all of them needed a CMOS design to lay into their SOC designs). This forum would have suffered as a result.
Quote:
Bil Herd and Commodore got a great commercial success with the C128, so I hope our theoretical discussion can't harm anyone. I don't understand why you write about 3-4 MHz. They didn't even actually provide 2 MHz. Indeed they provided 2 MHz for the 6502 but rather theoretically, I wrote about this. IMHO Commodore just heavily and rude exploited people expectations about an upgraded C64 instead of making a real upgrade. They just gave the C64 with better but slower Basic and a bunch of incompatible modes which require special hardware to get benefits from them. The Z80 at the actual 1.6 MHz looked rather very poorly in 1985.

The 3-4 MHz refers to the actual bus speed of the system. In dual memory access designs, where two devices share a common memory by time-slicing access to it (the VIC-20, 64, 128, and others were in this camp), the bus has to run at twice the CPU speed. So, in the 1MHz 64, the memory has to run at 2MHz, and in the C128, the memory has to run at 4MHz. My point was that DRAM speeds of the day, coupled with the dual memory access designs used in these machines, limited CPU speed.
But, I completely agree with the rest of your statement (note, though, that while I agree about the 1.6MHz Z80, I think any discussion of its anemic performance needs to tell the rest of the story: there was never an intent to put the Z80 into the machine, but it was a mandate from Marketing).
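
The bus arithmetic above can be sketched in a few lines of Python. The function names and the `sharers` parameter are illustrative assumptions; the only figures taken from the post are the 1MHz/2MHz and 2MHz/4MHz pairs.

```python
def required_memory_rate_mhz(cpu_mhz, sharers=2):
    """Memory access rate needed when `sharers` devices time-slice one memory."""
    return cpu_mhz * sharers


def max_cpu_mhz(memory_mhz, sharers=2):
    """Highest CPU clock a given memory rate can support under the same scheme."""
    return memory_mhz / sharers


c64_memory = required_memory_rate_mhz(1.0)   # 1MHz CPU sharing with the video chip
c128_memory = required_memory_rate_mhz(2.0)  # 2MHz CPU under the same scheme
```

Read the other way round, a DRAM that tops out at a given rate caps the CPU at half of that in a two-way shared design, which is the limit being described.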
Quote:
If he reads this thread I must express my great and sincere respect to him here. I also hope that he will tell us more about the 6502 history details someday.

I don't think he's on here, but he's been wildly successful in spite of his lack of improving the 6502.
Quote:
It has phrase
Quote:
Thus, MOS Technology has left holes in the 650X instruction bit pattern to accommodate a "quasi-16-bit machine."

;)

I will note my eyes are not what they used to be, but I just pored over the ad, and I can't see any statement like that in the ad.
Quote:
The 16-bit data bus is a quite right idea for me but I don't understand your words about 16-bit ALU. Doesn't the 65816 have 16-bit ALU?! It is shocking for me.

It does, my bad. I used a piece of wrong information when composing my response (given the 8 bit way it accesses memory, it could have used an 8 bit ALU and just called it once for each memory access, but the datasheet says it has a complete 16 bit ALU).


PostPosted: Sat Sep 19, 2020 11:40 pm 
Offline

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 704
Location: North Tejas
brain wrote:
Code:
ldaa #0       ;2
ldx #1000     ;4
loop:
dex           ;4
eora tab,X    ;5
bne loop      ;4

6 cycles to set up, and 13*1024 (13312) cycles to run the loop (And, I've literally never written any 6800 asm, so I suspect there's a faster way to do it, but the current speed is a 1.6X advantage)


Unfortunately, the 6800 only allows a single unsigned byte offset to the index register.
Quote:
eora tab,X ;5


However, that code will work if tab started in the zero page...

...which can be done on the 6800 because there is nothing in the way, like say, a stack...
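
To make the restriction and the cycle estimate concrete, here is a small Python sketch. It assumes the per-instruction cycle counts exactly as annotated in the quoted listing (they are not re-checked against a datasheet here), and the helper names are hypothetical.

```python
def indexed_address(x, offset):
    """Effective address for 6800 indexed mode: X plus an unsigned 8-bit offset."""
    if not 0 <= offset <= 0xFF:
        raise ValueError("6800 indexed offset must fit in one unsigned byte")
    return (x + offset) & 0xFFFF


# Cycle counts copied from the comments in the quoted listing (assumption).
SETUP_CYCLES = 2 + 4      # ldaa #0, ldx #...
LOOP_CYCLES = 4 + 5 + 4   # dex, eora tab,X, bne loop


def total_cycles(iterations):
    """Setup plus loop cost for the quoted listing."""
    return SETUP_CYCLES + LOOP_CYCLES * iterations


# A table based at $0080 works; a table based at $0300 cannot be encoded.
address = indexed_address(1000, 0x80)
estimate = total_cycles(1024)
```

So `eora tab,X` only assembles when `tab` is below $100, which is why the table has to start in the zero page.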


PostPosted: Sun Sep 20, 2020 6:36 am 
Offline

Joined: Tue May 05, 2009 2:49 pm
Posts: 113
Yep, I saw the 8 bit offset restriction, but I figured the table could start in the first 256 bytes... But, I also made sure the self modifying code in the 6502 case ran from zpage as well, since the inc of the memory would be faster, and I seem to remember one other opcode being faster. I figured I'd give both options the best chance for success.

I agree with the sentiment that the two 8 bit index regs seem more useful than a single 16 bit one, but I still feel the 6800 can hold its own (1.6X for a year+ later in development and release, as I recall, seems appropriate).

Jim


PostPosted: Sun Sep 20, 2020 3:20 pm 
Offline

Joined: Wed Jan 08, 2014 3:31 pm
Posts: 578
dmsc wrote:
IMHO, the only reason the 6502 was popular was its price - and the total system price, because assembling a computer using a 6502 did not need a lot of external circuits. This is the same as the Z80, it also was selected on price.

Isn't this the classic price/performance curve? If a CPU's performance is less, but its price is even lower, then it still sits at a more favorable position on that curve than a CPU that offers more performance at a much higher price. That is a form of better, just one that's less obvious in hindsight.

dmsc wrote:
But this also meant that the 6502 was a dead end - there were very few opportunities for making a faster successor without completely changing the architecture, because it was tied to the RAM speed, and RAM did not increased speed at the same pace as processors.

Any architecture where the RAM address size is larger than CPU register size has this problem. The CPU registers are too small, so it needs an off chip way to construct addresses. The 6502 used page zero RAM effectively as a register file, which is not unlike how the TMS9900 used the workspace pointer to index into RAM for its 16 bit registers.
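
A rough Python sketch of the two "registers in RAM" schemes mentioned above. The function names and the example addresses are illustrative assumptions; the address arithmetic follows the usual descriptions of the TMS9900 workspace (sixteen 16-bit registers starting at the workspace pointer) and of 6502 (zp),Y addressing.

```python
def workspace_register_address(wp, n):
    """RAM address of TMS9900 register Rn for workspace pointer `wp`."""
    if not 0 <= n <= 15:
        raise ValueError("the workspace holds registers R0 to R15")
    return (wp + 2 * n) & 0xFFFF


def indirect_indexed_address(memory, zp, y):
    """6502 (zp),Y: a 16-bit pointer read from page zero, plus the Y register."""
    low = memory[zp & 0xFF]
    high = memory[(zp + 1) & 0xFF]  # the pointer fetch stays inside page zero
    return ((low | (high << 8)) + y) & 0xFFFF


memory = bytearray(0x10000)
memory[0x10] = 0x00  # pointer low byte in page zero
memory[0x11] = 0x30  # pointer high byte in page zero

r3_address = workspace_register_address(0x8300, 3)
target = indirect_indexed_address(memory, 0x10, 5)
```

In both cases the "register" is really a RAM location, so every use of it costs memory cycles, which ties CPU speed to RAM speed as described.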

Interestingly, the 1802 was an eight-bit CPU that had sixteen-bit registers on chip. But it couldn't load them directly; all data had to pass through the D accumulator via eight-bit put and get instructions. So both the 6502 and 1802 spent a fair amount of time moving data through the accumulator into the address registers.

