6502.org Forum
Posted: Fri Jun 03, 2016 4:20 am

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
Hello! I just registered for this forum because Michael Barry mentioned the forum on comp.lang.forth.

Are any of you familiar with ISYS Forth for the Apple-IIc?

About 25 years ago I was programming in it. It is an STC Forth. The most notable thing about it was that it used a split parameter-stack. The low-bytes were accessed with $0,x and the high-bytes were accessed with $40,x --- this means that a single INX would drop a 16-bit element from the parameter-stack and a single DEX would make room for a new element. All the producers (words that put data on the stack) would end with DEX and all the consumers (words that remove data from the stack) would start with INX. This is somewhat counter-intuitive: the producer has to write to memory that is actually under the stack because the DEX hasn't been done yet. Similarly, the consumer has to read memory that is actually under the stack because the INX has already been done. The point of this is that typically a producer is followed by a consumer, so the DEX and INX are adjacent --- the peephole optimizer can remove both of them! Sometimes the producer gets inlined but the consumer does not. In this case, the optimizer inlines the producer without the DEX on the end, and it compiles a JSR to 1+ the address of the consumer so that the INX at the start of the consumer doesn't get executed. This also saves memory and boosts speed. Another trick the optimizer did was that if a JSR was followed by an RTS, the JSR would be converted into a JMP and the RTS not compiled --- this also saves memory and boosts speed. All in all, ISYS Forth was extremely fast! :-)
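To make the three rewrites concrete, here is a minimal sketch in Python (not 6502 code, and not the ISYS optimizer itself). The instruction tuples, the operands, and the CONSUMERS set are assumptions made up for illustration, and the sketch ignores branch targets, which a real optimizer would have to respect.

```python
# Hypothetical representation: each instruction is a (mnemonic, operand) tuple.
# CONSUMERS is an assumed set of word names whose code begins with INX.
CONSUMERS = {"PLUS", "DROP"}

def peephole(code):
    """Apply the three rewrites described in the post to an instruction list."""
    out = []
    for ins in code:
        prev = out[-1] if out else None
        if prev == ("DEX", None) and ins == ("INX", None):
            # Producer's trailing DEX cancels the consumer's leading INX.
            out.pop()
        elif prev == ("DEX", None) and ins[0] == "JSR" and ins[1] in CONSUMERS:
            # Drop the DEX and enter the consumer one byte later, past its INX.
            out.pop()
            out.append(("JSR", ins[1] + "+1"))
        elif prev is not None and prev[0] == "JSR" and ins == ("RTS", None):
            # Tail call: JSR followed by RTS becomes a single JMP.
            out[-1] = ("JMP", prev[1])
        else:
            out.append(ins)
    return out

demo = [
    ("LDA", "#$05"), ("STA", "$FF,X"), ("DEX", None),  # inlined producer
    ("INX", None), ("LDA", "$FF,X"),                   # inlined consumer
    ("DEX", None), ("JSR", "PLUS"),                    # producer, then called consumer
    ("JSR", "EMIT"), ("RTS", None),                    # tail call
]
optimized = peephole(demo)
```

On this made-up input the nine instructions shrink to five: both DEX/INX pairs disappear, the call to PLUS becomes a call to PLUS+1, and the final JSR/RTS becomes a JMP.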

I wrote a cross-compiler that ran on an MS-DOS machine and generated 65c02 machine-code. My cross-compiler was based on ISYS Forth and used a lot of the ISYS Forth code (all of the floating-point functions, etc.). I had an RS232 cable connecting the MS-DOS machine with the Apple-IIc (actually, I had a Laser-128 that was compatible with the Apple-IIc). I had source-level debugging. I could compile a BRK between every Forth word. I also built up a database of all the BRK instructions, what their address was, and what the corresponding location in the source-code was. I could single-step through the 65c02 program running on the Apple-IIc and see a cursor (a little smiley-face) move through the source-code on the MS-DOS machine. I could also see the contents of the parameter-stack and return-stack updated in a window on the MS-DOS machine. I had break-points and watch-points so I could run the 65c02 program fast and only stop when needed. My development system was loosely modeled after Turbo-Debugger and Turbo-C that I was familiar with, but was for Apple-IIc Forth.

I used my cross-compiler to write a program to do symbolic mathematics. I could take an equation and figure out the derivative for it. I could also simplify equations. I could display equations on the Apple-IIc using my own character set that included all the Greek letters. My goal was to do integration, but I never got that far. I was running out of memory on the Apple-IIc. Also, the derivatives took several seconds to run. I was concerned that integrals would take minutes to run as they are much more complicated. I still have the program and might port it over to 64-bit x86 assembly-language on a modern computer. I don't think there is a market though --- very few people care about calculus --- the ones who do care can also do derivatives and integrals using pencil-and-paper quite quickly, and they don't need a computer to do it for them.

Anyway, that is my experience with Forth on the 65c02 --- I was not aware that anybody else was interested in the subject --- now Michael Barry says that he is, and that there are others as well, so "hello" to all of you!


Posted: Fri Jun 03, 2016 7:14 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
Welcome!

I have not heard of ISYS Forth. The Apple IIc looks like an awfully attractive little computer for its day, but I never worked with one. I avoided the pre-INX thing you mention, and although I still use Forth regularly, it was so long ago that I addressed those innards that I can't remember if it was for reasons of interrupt servicing or something else. My '816 ITC Forth kernel allows installing, prioritizing, and deleting, on the fly, up to eight assembly-language ISRs and eight Forth ones. In my '02 Forth, I also have both types, but the 6502's more limited instruction set and lesser efficiency in handling 16-bit quantities made it not worth going quite to the extent I did on the '816. I have an article on servicing interrupts on the '02 in high-level Forth with no overhead, at http://wilsonminesco.com/0-overhead_Forth_interrupts/ . I think it was the first article I ever wrote, in the early 1990's. My applications are heavily dependent on interrupts, often tens of thousands per second (those are serviced in assembly, not Forth), and in the extreme case, something like 140,000 per second. The new computer I'm building will be capable of a lot more than the current one.

I also avoided splitting the stack, so you could use (zp,X) in words that use that all the time, like @ ! C@ C! 2@ 2! TOGGLE +! -! ON OFF C_OFF PERFORM INCR DECR and others, without having to move TOS to another ZP byte pair first in order to use the address. Our (zp,X), high-level languages, and processor design topic discusses this. On the 65816, keeping the bytes together would be all the more important, for taking advantage of its ability to handle 16-bit data quantities (not just addresses) in a single instruction if the bytes are together (in normal low byte high byte order). So for example adding TOS is a matter of CLC, ADC 0,X which takes care of all 16 bits at once.
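The difference can be modeled in a few lines of Python. This is only a sketch: memory is a flat byte array, the scratch pair at $FE/$FF is an assumption, and the function names are invented here. With a contiguous stack the pointer bytes are adjacent, so (zp,X) can use them in place; with a split stack ($0,X low, $40,X high) they must first be copied to an adjacent ZP pair.

```python
def fetch_contiguous(mem, x):
    """Models LDA (0,X): the pointer bytes are adjacent at zp[x] and zp[x+1]."""
    ptr = mem[x & 0xFF] | (mem[(x + 1) & 0xFF] << 8)
    return mem[ptr]

def fetch_split(mem, x, scratch=0xFE):
    """Split stack: low byte at zp[x], high byte at zp[$40+x].

    The two bytes are not adjacent, so they are copied to an assumed scratch
    pair first, and the read then models LDA (scratch),Y with Y = 0.
    """
    mem[scratch] = mem[x & 0xFF]
    mem[scratch + 1] = mem[(0x40 + x) & 0xFF]
    ptr = mem[scratch] | (mem[scratch + 1] << 8)
    return mem[ptr]

mem = bytearray(0x10000)
mem[0x1234] = 0x99                     # the byte both stacks point at
mem[0x10], mem[0x11] = 0x34, 0x12      # contiguous TOS at X = $10
mem[0x20], mem[0x60] = 0x34, 0x12      # split TOS at X = $20
```

Both calls return the same byte; the split version just spends two extra copies getting the pointer into a usable place.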

I have long been interested in doing a STC Forth some day, but I doubt the priority will ever get high enough to actually do it. My '816 Forth runs two to three times as fast as my '02 Forth at a given clock speed (both ITC), partly because the '816's greater efficiency at handling 16 bits at once, and its better instruction set, made it practical to write a lot more words as primitives, words that would have become too long on the '02 as primitives, so they had to be written as secondaries (colon definitions) on the '02. On the '816, many were even shorter (not just faster) as primitives. When you have hundreds of primitives, part of the speed advantage comes by virtue of the fact that you no longer need to run NEXT nest and unnest so many times to get a job done.

The links page of my website has a (mostly-6502) Forth section, at http://wilsonminesco.com/links.html#Forth .

For math performance [Edit, later the fact that you said "symbolic" sank in], my new computer will be using the large look-up tables for ultra fast scaled-integer (notice I did not say "fixed-point" which is a limited subset of scaled-integer) math. The front page of that section of my website is at http://wilsonminesco.com/16bitMathTables/ . There's no interpolation, as every single answer is there, pre-calculated, accurate to all 16 bits (or 32 bits in the case of inverses which are there for division, so you can use another table to speed up multiplication, and multiply by the inverse). In some cases (like log functions), looking up the answer can be nearly a thousand times as fast as having to actually calculate it. My HP-71B hand-held computer worked for weeks to build the tables and put them in Intel Hex format. They're there on the website, available for download.
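As a rough illustration of the "multiply by the inverse" idea, here is a Python sketch. The table layout is an assumption for the example (ceil(2^32 / n) for every 16-bit n) and is not the format of the tables on the website; it only shows that one lookup plus one multiply can replace a division.

```python
# Assumed layout: INV[n] = ceil(2**32 / n) for n = 1..65535 (index 0 unused).
# Note that INV[1] is 2**32 and needs 33 bits, so a real 32-bit table would
# have to special-case division by 1.
INV = [0] + [-(-(1 << 32) // n) for n in range(1, 1 << 16)]

def div_by_table(a, n):
    """Unsigned 16-bit a // n using one table lookup and one multiply."""
    return (a * INV[n]) >> 32

quotient = div_by_table(1000, 7)   # same result as 1000 // 7
```

With this rounding choice the result matches integer division for all unsigned 16-bit operands, because the rounding error in the inverse is too small to carry into the quotient.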

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Fri Jun 03, 2016 7:41 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Welcome, Hugh! Very good to hear of your symbolic maths program - that's proper software! I like the peephole possibilities too. Doubtless others with implementation experience will have interesting things to say.


Posted: Fri Jun 03, 2016 1:57 pm

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
GARTHWILSON wrote:
The Apple IIc looks like an awfully attractive little computer for its day, but I never worked with one.

I never owned one either --- those Apples were grossly expensive! --- in that era I owned a Commodore-64 that had better features and lower cost (the Atari-800 also had better features and lower cost, but I only had a C64).

I bought the Laser-128 quite late in the era when the Apple-IIc was already pretty much obsolete --- that was not a good idea, as I was putting a lot of work into the symbolic-math program, but the computer was obsolete --- I should have just written the program for MS-DOS.

GARTHWILSON wrote:
I also avoided splitting the stack, so you could use (zp,X) in words that use that all the time, like @ ! C@ C!

ISYS would put the pointer into a ZP pair and use (zp),Y for @ ! etc., but for C@ C! etc. it would move the low-byte to just under the high-byte and use (zp,X) instead.

In my experience, the (zp,X) addressing mode was rarely used (not in Forth or in any software) --- the 6502 would have been better off without it, with the chip resources used for something else instead.

GARTHWILSON wrote:
I avoided the pre-INX thing you mention, and although I still use Forth regularly, it was so long ago that I addressed those innards that I can't remember if it was for reasons of interrupt servicing or something else.

Yes --- if you are going to have ISRs written in Forth, then you can't do that --- I just wrote ISRs in assembly-language.

GARTHWILSON wrote:
The new computer I'm building will be capable of a lot more than the current one.

You are building a 65c816 computer?

I was interested in the 65c02 and 65c816 back in the day --- there is nostalgia now for me to reminisce on the subject --- but I don't really get why people would be actively working on those processors now, as the modern processors are much better.

I hope this isn't a heresy that will get me flamed, but, why not just focus on the PIC24 or the ARM Cortex? Why hang onto an old 8-bit processor?

The only use I can think of for the 65c02 would be to put it in an FPGA. You could get a pretty inexpensive system --- but that would only be interesting to somebody who is building systems in high volume and needs to count every penny, and who also has pretty low requirements for speed (maybe an appliance such as a clothes washing machine or whatever).

I can't think of any use for the 65c816. It is designed to address a lot of memory --- the advantage of FPGAs however is that they have internal memory (8KB to 64KB) --- if you are going to have a lot of external memory however, then you would be better off with an ARM Cortex rather than a 65c816 in an FPGA or a 65c816 ASIC.


Posted: Fri Jun 03, 2016 2:15 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
Note that you're posting on a 6502 forum, Hugh! Most of us are here for sentimental reasons, others for the interesting challenges of working with an 8 bit CPU, and some because we actually find it a practical solution for some problem. We are of course all aware of PICs and ARMs and the like, and if you look around here you'll find a few mixed-mode projects and more than a few emulation projects which make use of those other CPUs.

The '816 is quite a hot topic here too. There are those who can't quite see the advantage of the extra features, and those who can. As this CPU does a very good job of acting as a 6502 from power up until you select native mode, some here feel it's a fine default choice, whether or not you use the extra features. I'm sure I've itemised the advantages in a previous post, but I might let you find that... if you do, please link it!


Posted: Fri Jun 03, 2016 5:56 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
Hugh Aguilar wrote:
I don't really get why people would be actively working on those processors now, as the modern processors are much better.

I hope this isn't a heresy that will get me flamed, but, why not just focus on the PIC24 or the ARM Cortex? Why hang onto an old 8-bit processor?

Have a look at our 10-year topic "Is there any future of 6502?"

It doesn't hurt that the 6502 is possibly the most-documented processor in history, or that as someone on the forum said, "65x02 is wonderful. Programming 65x02 feels like vacation; you cannot get tired doing that. It is natural, logical, it doesn't give you headache."

Quote:
The only use I can think of for the 65c02 would be to put it in an FPGA. You could get a pretty inexpensive system --- but that would only be interesting to somebody who is building systems in high volume and needs to count every penny, and who also has pretty low requirements for speed (maybe an appliance such as a clothes washing machine or whatever).

Actually, appliances are one of the areas 65c02's are going into today, along with automotive, industrial, toy, and even life-support equipment. The fastest ones are running over 200MHz. WDC (the IP holder) estimates that if the '02 were made in the most modern deep-submicron geometries, it would run at 10GHz. In reality though, a lot of applications simply have no use for hotter performance than off-the-shelf 6502's deliver. Companies find it attractive for new embedded controller designs because of the small number of gates, small licensing fee, and, according to Winbond, easy development. They're rather invisible though, controlling processes inside custom ICs that don't say "6502" on the outside.

The podcast here is an Aug 2015 interview with Bill Mensch, pres. of WDC who holds the IP for the 65c02 and 65816, regarding these processors and comparing them to ARM, 68000, x86, 6800, 6501, etc., and his business model and his goals. He obviously has a very clear vision, and he is accomplishing what he wants. Still. Today. In the March 2015 podcast interview with WDC's David Cramer, WDC's VP of business development, he says there are hundreds of different products being made today with 65xx processors in them.

Quote:
I can't think of any use for the 65c816. It is designed to address a lot of memory --- the advantage of FPGAs however is that they have internal memory (8KB to 64KB) --- if you are going to have a lot of external memory however, then you would be better off with an ARM Cortex rather than a 65c816 in an FPGA or a 65c816 ASIC.

The '816 is a natural extension to the '02, so if you're already familiar with the '02, you can pretty easily understand the whole '816 computer without being a computer engineer. The bus structure is almost as simple, and the instruction set makes it easy to envision solutions. This obviously implies assembly language, and some will say it's foolish to use assembly today; but I disagree (although I don't limit myself to assembly language), and have an article, "Assembly Language: Still Relevant Today (No, it won't ever be dead.)"

I'm planning 12MB of 10ns RAM for my new computer. It's in a small card cage though, and the CPU board could later be swapped out. I'm strongly considering having more than one computer in the cage anyway.

Posted: Fri Jun 03, 2016 8:21 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8183
Location: Midwestern USA
Hugh Aguilar wrote:
I can't think of any use for the 65c816.

I can. In fact, I designed and built a custom machine controller around the 65C816. It evolved into my POC V1.1 unit.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Sat Jun 04, 2016 4:26 am

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
BigEd wrote:
Note that you're posting on a 6502 forum, Hugh! Most of us are here for sentimental reasons, others for the interesting challenges of working with an 8 bit CPU, and some because we actually find it a practical solution for some problem. We are of course all aware of PICs and ARMs and the like, and if you look around here you'll find a few mixed-mode projects and more than a few emulation projects which make use of those other CPUs.

The '816 is quite a hot topic here too. There are those who can't quite see the advantage of the extra features, and those who can. As this CPU does a very good job of acting as a 6502 from power up until you select native mode, some here feel it's a fine default choice, whether or not you use the extra features. I'm sure I've itemised the advantages in a previous post, but I might let you find that... if you do, please link it!


Well, I didn't like the 65c816 when it came out, and I still don't like it --- it was obviously designed to support C, but I don't like C and prefer Forth.

As for the 65c02, that was a great processor! That symbolic-math program that I mentioned was possible on the 65c02 because I used my own Forth cross-compiler --- it would not have been possible in C given any commercial C cross-compiler --- the 65c02 is just a better fit for Forth than for C.

I think the 65c02 has a future. Specifically, what I would like to see is a multi-core system --- like the Parallax Propeller, but with the 65c02 as each core processor --- I would rather have a 10-core system with each 65c02 running at 10 MHz than a single-core system running at 200 MHz.

The 65c02 does need some extensions. There was a version of the 6502 that had instructions for setting individual bits, and that was a pretty good extension. Also, there was a version of the 6502 that had a multiplication instruction, and that was a pretty good extension. It is a mistake to get carried away with nostalgia and limit oneself to the 65c02 of 30 years ago --- it is better to have an upgraded modern processor, which is yet based on the 65c02.


Posted: Sat Jun 04, 2016 5:23 am

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3354
Location: Ontario, Canada
Welcome, Hugh! Always nice to have another Forth-head in our midst. :)

Hugh Aguilar wrote:
the 65c02 is just a better fit for Forth than for C.
I'm with ya on that. But you wouldn't agree that the '816 is a better fit for Forth than for C? I'm confused.

Quote:
The 65c02 does need some extensions. [...] there was a version of the 6502 that had a multiplication instruction, and that was a pretty good extension.
Hmmm, one of the Mitsubishi 740 family, perhaps? [ah yes. PDF attached.]

BTW, how would you like a 'c02 with Forth extensions? (Not to be taken seriously nowadays, but something of a novelty decades ago when it was created! :mrgreen: )

cheers
Jeff

EDIT: a newer version (2006) of this document is posted here


Attachments:
740 Family Software Manual.pdf [574.99 KiB]
Downloaded 265 times

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Last edited by Dr Jefyll on Fri Dec 02, 2016 3:23 am, edited 2 times in total.
Posted: Sat Jun 04, 2016 5:32 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
Hugh Aguilar wrote:
Well, I didn't like the 65c816 when it came out, and I still don't like it --- it was obviously designed to support C, but I don't like C and prefer Forth.

Same here, regarding C and Forth. The '816 works well at Forth—much better than the '02. There are a couple of things we keep seeing people disliking (or thinking they dislike) about the '816:

  • the requirement of the bank address to be latched externally. I do wish they had just gone to a 48-pin DIP to start (and the next size up from 44 in PLCC and PQFP) and brought the 8 extra address lines out; but keep in mind that if a 64K space is enough, you don't have to latch, decode, or use the high 8 address bits. Actually, the 65802 was an '816 that could be put into an '02 socket, and then you still get tons of benefits.

  • the mode bits.
    • First of course is the emulation mode. I (and I think BDD and others too) just put it in native mode in the reset routine and never touch it again. I know you can't do that for something like the Apple IIgs that must also run 6502 code and switch back and forth between '816 and '02 programs, but I don't have a IIgs.

    • then there's the bit to control whether A is 8- or 16-bit, and the bit to control whether the index registers are 8- or 16-bit. In my '816 Forth, I keep A in 16-bit mode, and X and Y in 8-bit mode, nearly full time, and rarely change them. When I do, I use my macros ACCUM_8, ACCUM_16, INDEX_8, and INDEX_16, which are far more intuitive than the cryptic REP and SEP instructions, although they assemble the same thing of course.
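For anyone curious what those four macros boil down to, here is a small Python model of the bytes they would assemble. The real macros are assembler macros; the Python functions below are stand-ins written for illustration. The opcodes (REP = $C2, SEP = $E2) and the status bits (M = $20 for accumulator width, X = $10 for index width) are the standard 65816 values.

```python
REP, SEP = 0xC2, 0xE2          # 65816 opcodes: reset / set processor status bits
M_FLAG, X_FLAG = 0x20, 0x10    # accumulator-width and index-width bits in P

def ACCUM_16():
    """REP #$20: clear M, so A is 16 bits wide."""
    return bytes([REP, M_FLAG])

def ACCUM_8():
    """SEP #$20: set M, so A is 8 bits wide."""
    return bytes([SEP, M_FLAG])

def INDEX_16():
    """REP #$10: clear X, so the index registers are 16 bits wide."""
    return bytes([REP, X_FLAG])

def INDEX_8():
    """SEP #$10: set X, so the index registers are 8 bits wide."""
    return bytes([SEP, X_FLAG])

# The setup described above: A in 16-bit mode, X and Y in 8-bit mode.
setup = ACCUM_16() + INDEX_8()
```

The names carry the intent; the four bytes in `setup` are exactly what REP #$20 followed by SEP #$10 would produce.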

The '816 running in 6502-emulation mode still gives a lot of new instructions and addressing modes; so you can start using it just like an '02 and then gradually start taking advantage of the new additions as you learn them.

Quote:
the 65c02 is just a better fit for Forth than for C.

It's almost like they had Forth in mind when they designed it, except that NEXT could have been more efficient. The 6502 ran FIG Forth 25% faster than the 6800 at the same clock speed.

Quote:
I think the 65c02 has a future. Specifically, what I would like to see is a multi-core system --- like the Parallax Propeller, but with the 65c02 as each core processor --- I would rather have a 10-core system with each 65c02 running at 10 MHz than a single-core system running at 200 MHz.

I have a friend who's a programmer at JPL who said basically the same thing. The last time I saw him was at a birthday party years ago, and he said he would like to see the processors kept more simple and more parallelism applied. The situation wasn't conducive to having him discuss that at length, and I seldom see him anymore. I did email asking him to tell me more when he got a chance, but I guess he's not a writer. The newest interest of Chuck Moore (the inventor of Forth who turns 78 this year, 2016) seems to be GreenArrays where he has 144 stack processors on a single IC, for massively parallel embedded applications. Each of the processors on the IC runs Forth instructions in as little as 1.4ns, putting equivalent peak performance at 100GIPS, or 100,000,000,000 Forth instructions per second. We had a topic here 8-12 years ago that might interest you, "Parallel Processing with 6502s."
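The 100GIPS figure is easy to sanity-check from the two numbers given. This is just back-of-envelope arithmetic in Python, assuming every one of the 144 cores retires one instruction every 1.4ns at peak.

```python
cores = 144
seconds_per_instruction = 1.4e-9            # best case quoted above

# Peak rate if every core runs flat out at the best-case instruction time.
peak_ips = cores / seconds_per_instruction
peak_gips = peak_ips / 1e9
```

That works out to a little under 103 billion instructions per second, which rounds to the 100GIPS quoted.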

Quote:
The 65c02 does need some extensions. There was a version of the 6502 that had instructions for setting individual bits, and that was a pretty good extension.

It sounds like you're referring to the SMB and RMB instructions, and they also go with the BBS and BBR instructions which can examine a bit and branch on the condition you specify, all in one instruction, without affecting A, X, or Y. All the 65c02's being produced today have those. Unfortunately they're only for ZP. They were apparently made for '02-based microcontrollers that had their I/O in ZP, which is not the case with most designs that have the I/O separate from the processor. These instructions are the only ones that are left out of the '816, as it needed the space in the op code table which is full with things that are more valuable.
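A small Python model of what these four instructions do, for readers who haven't used them. The function names and the bytearray standing in for zero page are assumptions for the sketch; the opcode pattern (RMBn = $n7, SMBn = $87 + $n0, BBRn = $nF, BBSn = $8F + $n0) is the Rockwell/WDC 65c02 encoding.

```python
def rmb(zp, bit, addr):
    """RMBn addr: reset (clear) one bit of a zero-page byte."""
    zp[addr] &= ~(1 << bit) & 0xFF

def smb(zp, bit, addr):
    """SMBn addr: set one bit of a zero-page byte."""
    zp[addr] |= 1 << bit

def bbs(zp, bit, addr):
    """BBSn addr,rel: returns True when the branch would be taken (bit set)."""
    return bool(zp[addr] & (1 << bit))

def bbr(zp, bit, addr):
    """BBRn addr,rel: returns True when the branch would be taken (bit clear)."""
    return not bbs(zp, bit, addr)

def opcode(mnemonic, bit):
    """Opcode byte for RMB/SMB/BBR/BBS with bit number 0..7."""
    base = {"RMB": 0x07, "SMB": 0x87, "BBR": 0x0F, "BBS": 0x8F}[mnemonic]
    return base | (bit << 4)

zp = bytearray(256)      # stand-in for zero page, e.g. an I/O port at $10
smb(zp, 3, 0x10)         # set bit 3 without touching A, X, or Y
```

None of the helpers needs an accumulator argument, which mirrors the point that the real instructions leave A, X, and Y alone.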

Quote:
Also, there was a version of the 6502 that had a multiplication instruction, and that was a pretty good extension.

The '816 computer I'm making will use the 2MB of large look-up tables for ultra-fast 16-bit scaled-integer (notice I did not say "fixed-point" which is a limited subset of scaled-integer) math where no interpolation is needed because every answer is there, pre-calculated, accurate to all 16 bits. In the extreme cases like trig and log functions, looking up a value can be nearly a thousand times as fast as actually having to calculate it, and it will be accurate to all 16 bits.

Even without the tables though, the '816 compared favorably to the 68000 and 80286 in the Sieve of Eratosthenes benchmark, all running at 8MHz.

I've brought a lot of products to market using PIC16's and one using a 65c02 (that one in 1993, but we sold it for 13 years), and never needed a multiply in any of them. They all had a lot of boolean functions and additions and subtraction, and not much else, except that the last one did have a division routine. The '02 was originally designed for embedded control where a multiplication would not be needed enough to justify the amount of silicon real estate it would have taken.

Quote:
It is a mistake to get carried away with nostalgia and limit oneself to the 65c02 of 30 years ago --- it is better to have an upgraded modern processor, which is yet based on the 65c02.

So...you're talking about the ARM. Bill Mensch himself did some consulting to help the ARM designers.

Posted: Sat Jun 04, 2016 8:54 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
> Bill Mensch himself did some consulting to help the ARM designers.
That's a new take on the story usually told! The Acorn people said they visited WDC and mainly noted how very small an outfit it was, and so concluded that if they could make a novel micro, so could Acorn. That said, I'm sure they could have learnt something about the internal workings of the 02 or 816 implementations.

> what I would like to see is a multi-core system
That could indeed be very interesting. The '02 in HDL is so small compared to a modern cheap FPGA that it's easily possible to put several cores on a chip. We've had several multicore discussions in the past - here's an idea of mine:
viewtopic.php?f=4&t=3080


Posted: Sat Jun 04, 2016 5:10 pm

Joined: Fri Jun 03, 2016 3:42 am
Posts: 158
GARTHWILSON wrote:
The '816 running in 6502-emulation mode still gives a lot of new instructions and addressing modes; so you can start using it just like an '02 and then gradually start taking advantage of the new additions as you learn them.

What the 65c02 desperately needed was a (zp,x),y addressing-mode --- the 65c816 failed to provide this, so I considered the 65c816 to be no good.

The 65c816 did provide an (offset,s),y addressing-mode --- that was provided to support the C local-frame --- this isn't very useful in Forth because, although we may have local variables on the return-stack (this would mean that we can't have >R etc.), most of the work is done on the parameter-stack using the zp,x addressing-mode (and because there is no (zp,x),y addressing mode all pointers have to be moved to a zp-pair which is the bottleneck).
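To spell out the two effective-address calculations being compared, here is a sketch in Python. The first function models the 65c816's real (offset,s),y mode; the second models the wished-for (zp,x),y mode, which is hypothetical and is not an addressing mode of the 65c02 or the 65c816. Function names and the flat memory array are assumptions for the example.

```python
def ea_stack_rel_indirect_y(mem, s, offset, y, dbr=0):
    """65c816 (offset,s),y: pointer is read from the stack in bank 0, then Y is added."""
    lo = mem[(s + offset) & 0xFFFF]
    hi = mem[(s + offset + 1) & 0xFFFF]
    return (((dbr << 16) | (hi << 8) | lo) + y) & 0xFFFFFF

def ea_zp_x_indirect_y(mem, zp, x, y):
    """Hypothetical (zp,x),y: pointer is read from the parameter stack, then Y is added."""
    lo = mem[(zp + x) & 0xFF]
    hi = mem[(zp + x + 1) & 0xFF]
    return (((hi << 8) | lo) + y) & 0xFFFF

mem = bytearray(0x10000)
mem[0x01F3], mem[0x01F4] = 0x00, 0x20   # pointer $2000 in a return-stack frame
mem[0x12], mem[0x13] = 0x00, 0x30       # pointer $3000 on a contiguous parameter stack
```

The two calculations have the same shape; the difference is only which stack the pointer is read from, which is why the existing mode suits a frame on the return-stack and not a pointer sitting on the parameter-stack.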

All in all, the 65c816 was a big failure. WDC was trying to support C, which means that they were competing against the MC68000, and so the 65c816 was always considered to be a poor-man's MC68000, which isn't the best way to dominate the market.

What Bill Mensch said at the time, was that the 65c816 had low power consumption compared to the MC68000, and hence it was a better choice for a micro-controller than the MC68000 (he specifically suggested its use in controlling bionic body-parts in the human body). This is a laudable goal. His mistake was to try to support C with the (offset,s),y addressing mode --- he should have stuck with the 65c02 design but upgraded it with the (zp,x),y addressing-mode --- he might have had something worthwhile then!

GARTHWILSON wrote:
Quote:
The 65c02 does need some extensions. There was a version of the 6502 that had instructions for setting individual bits, and that was a pretty good extension.

It sounds like you're referring to the SMB and RMB instructions, and they also go with the BBS and BBR instructions which can examine a bit and branch on the condition you specify, all in one instruction, without affecting A, X, or Y. All the 65c02's being produced today have those. Unfortunately they're only for ZP. They were apparently made for '02-based microcontrollers that had their I/O in ZP, which is not the case with most designs that have the I/O separate from the processor. These instructions are the only ones that are left out of the '816, as it needed the space in the op code table which is full with things that are more valuable.

Another failure of the 65c816! Everybody in the 6502 community considered these to be a great extension --- but they weren't provided in the 65c816 --- wtf???

I don't see a problem with these only working in zp. An FPGA processor will have its I/O built-in and can put the I/O ports anywhere in memory --- zp is the best place for the I/O ports because zp addressing is faster than extended addressing --- for example, the 6510 used in the Commodore-64 had two I/O ports at the bottom of zp (used in the Commodore-64 for memory-bank switching).

GARTHWILSON wrote:
Quote:
Also, there was a version of the 6502 that had a multiplication instruction, and that was a pretty good extension.

The '816 computer I'm making will use the 2MB of large look-up tables for ultra-fast 16-bit scaled-integer (notice I did not say "fixed-point" which is a limited subset of scaled-integer) math where no interpolation is needed because every answer is there, pre-calculated, accurate to all 16 bits. In the extreme cases like trig and log functions, looking up a value can be nearly a thousand times as fast as actually having to calculate it, and it will be accurate to all 16 bits.

Even without the tables though, the '816 compared favorably to the 68000 and 80286 in the Sieve of Eratosthenes benchmark, all running at 8MHz.

I've brought a lot of products to market using PIC16's and one using a 65c02 (that one in 1993, but we sold it for 13 years), and never needed a multiply in any of them. They all had a lot of boolean functions and additions and subtraction, and not much else, except that the last one did have a division routine. The '02 was originally designed for embedded control where a multiplication would not be needed enough to justify the amount of silicon real estate it would have taken.

Multiplication is primarily needed for motion-control --- that can't be done with table look-up.

BTW: I'm familiar with the PIC16. That thing was a PITA to program because everything was going through the W register, and it had a funky memory-bank system. The 65c02 was much easier to program, but it didn't run very fast. The PIC24 is easier than both though, as it has 20 16-bit registers and orthogonal addressing-modes --- it is really the best 16-bit processor available.

My own professional experience with Forth was working at Testra. They had a motion-control board based on the 80c320 that was used in a laser-etcher. This is pretty demanding because the laser (actually the mirror that directs the laser) has to move at a steady speed whether it is drawing a straight line or a zig-zag line. If there is any delay at the turns, the laser will burn a blotch into the material at that spot. The 80c320 was being pushed to its limit. Testra built their own processor called the MiniForth which was built on the Lattice isp-1048 PLD. I wrote the assembler/simulator/compiler which I called MFX (MiniForth Cross-compiler). I was told at this time that multiplication was the bottleneck in motion-control.

It seems unlikely that a 65c02 system, even given an 8x8 multiplication, would be capable of such high-speed (you need 16x16 multiplication in hardware). Today, I would go with the PIC24 for something like that. At that time (1994/1995) however, the MiniForth was the solution. The competition was using an MC68000 and Testra's MiniForth was significantly better, driving them out of the market. After I left Testra they upgraded from the Lattice PLD to an FPGA and changed the name from MiniForth to RACE (described here: http://www.testra.com/Forth/RACE.htm).

I do think that the 65c02 has a future --- but high-speed motion-control is beyond its ken --- the 65c02 is going to be used for more pedestrian applications, and low-cost will be its primary feature.

GARTHWILSON wrote:
Quote:
It is a mistake to get carried away with nostalgia and limit oneself to the 65c02 of 30 years ago --- it is better to have an upgraded modern processor, which is yet based on the 65c02.

So...you're talking about the ARM. Bill Mensch himself did some consulting to help the ARM designers.


I never heard about this happening.

I have a book on the ARM that provided a history. The BBC Acorn (an educational computer) was based on the 65c02. They wanted to upgrade, but they needed something with very fast ISRs, so the MC68000 with all of its registers was a bad choice (it takes too long to save and restore 16 registers). They considered the 65c816 as the logical upgrade from the 65c02 and they built a prototype, but they found that the 65c816 had problems so they abandoned it. They then built their own processor, the ARM (Acorn RISC Machine), and used it. Later on the ARM got sold to an American company and the name changed to (Advanced RISC Machine) --- its primary feature is fast ISRs because it has those shadow-banks of registers --- it also had quite low power consumption (much lower than the MC68000).

Anyway, the 65c816 was a big failure. It was used in the Apple-IIgs which was always considered to be a poor-man's Mac and hence didn't last long. The only real success of the 65c816 was the Super Nintendo. I remember in the late 1990s that WDC had a bad reputation because they focused entirely on supporting Nintendo which was their cash cow, and they didn't provide much support to their other customers. Their attitude was: "Unless you are buying chips by the truckload then we don't have time to talk to you." Then the Super Nintendo became obsolete, and the 65c816 did too.


PostPosted: Sat Jun 04, 2016 5:14 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10800
Location: England
(Hmm, your ARM story seems a bit scrambled to me...)


PostPosted: Sat Jun 04, 2016 9:37 pm 

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3354
Location: Ontario, Canada
Hugh Aguilar wrote:
Are any of you familiar with ISYS Forth for the Apple-IIc? About 25 years ago I was programming in it. It is an STC Forth.
I'm with Garth; IOW, intrigued with STC, but haven't gotten around to playing with it yet. No doubt there's some serious speed potential there! 8) Is ISYS your own creation, Hugh? You mentioned the cross-compiler is.

I'm lukewarm on the split stack thing (if the cells are only 16-bit, I mean). I think somewhere around here we have a thread on split stacks. Anyway, I'm happy to agree that the split stack idea does reduce the overhead for INX's and DEX's of the p-stack pointer.

As for the thing about the peephole optimizer doing liposuction, :mrgreen: it's clever -- I like it. But it seems noteworthy that the INX-DEX problem is getting attacked from two different angles. (The peephole and the split stack both reduce INX-DEX overhead.) I'm sold on the peephole deal, but I expect there'd be diminishing returns if the split stack were added as well. Is the split stack worth the cost paid by fetch and store? Keeping those guys fast is pretty important, too.
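For anyone following along, the two peephole rules Hugh described (a producer's trailing DEX cancelling a consumer's leading INX, and JSR followed by RTS becoming JMP) can be sketched roughly like this. It is Python rather than 6502 assembly, it is not ISYS Forth's actual optimizer, and the tuple representation, the function name peephole and the label EMIT are all made up for illustration. It also assumes nothing depends on the flags set by the removed DEX/INX pair and that the RTS is not a branch target.

```python
def peephole(code):
    """Apply two peephole rules to a list of (mnemonic, operand...) tuples.

    Hypothetical representation: each instruction is a tuple such as
    ("DEX",) or ("JSR", "EMIT"). This is an illustration only.
    """
    out = []
    for op in code:
        if out and out[-1] == ("DEX",) and op == ("INX",):
            # Producer's trailing DEX followed by consumer's leading INX:
            # the stack pointer ends up unchanged, so drop both.
            out.pop()
        elif out and out[-1][0] == "JSR" and op == ("RTS",):
            # JSR immediately followed by RTS: jump instead and let the
            # callee's own RTS return to our caller.
            out[-1] = ("JMP", out[-1][1])
        else:
            out.append(op)
    return out


example = [
    ("LDA", "#5"), ("STA", "$FF,X"), ("DEX",),   # inlined producer
    ("INX",), ("LDA", "$00,X"),                  # inlined consumer
    ("JSR", "EMIT"), ("RTS",),
]
optimized = peephole(example)
```

In this toy example the DEX/INX pair disappears and the final JSR/RTS collapses to a single JMP, so seven instructions become four.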

Quote:
What the 65c02 desperately needed was a (zp,x),y addressing-mode
You mean indexed before and after the indirection? Absolutely! And at least one 65c02 was extended to support this! Great minds think alike! (or something like that... :) )

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


PostPosted: Sun Jun 05, 2016 12:05 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8433
Location: Southern California
Hugh Aguilar wrote:
What the 65c02 desperately needed was a (zp,x),y addressing-mode --- the 65c816 failed to provide this, so I considered the 65c816 to be no good.

I do seem to have a vague memory of wishing there were something like that once; but since it was a rare and long-ago wish, I wouldn't call it a "desperate" need. Can you give examples of what you would like it for? The '816 took care of some of the needs by offering 16-bit index registers.

Quote:
The 65c816 did provide an (offset,s),y addressing-mode --- that was provided to support the C local-frame --- this isn't very useful in Forth

Agreed. I think I only used it once in my '816 Forth.

Quote:
because, although we may have local variables on the return-stack (this would mean that we can't have >R etc.), most of the work is done on the parameter-stack using the zp,x addressing-mode (and because there is no (zp,x),y addressing mode all pointers have to be moved to a zp-pair which is the bottleneck).

Moving pointers to a ZP pair is something you have to do if you split the stack, which I mentioned earlier as a reason not to do it, as so many stack cells are addresses for things like @ and !.

Quote:
All in all, the 65c816 was a big failure. WDC was trying to support C, which means that they were competing against the MC68000, and so the 65c816 was always considered to be a poor-man's MC68000, which isn't the best way to dominate the market.

My understanding is that the 68000 was much better suited for C but that otherwise it was not a very good performer, even compared to the '816.

Quote:
GARTHWILSON wrote:
Quote:
The 65c02 does need some extensions. There was a version of the 6502 that had instructions for setting individual bits, and that was a pretty good extension.

It sounds like you're referring to the SMB and RMB instructions, and they also go with the BBS and BBR instructions which can examine a bit and branch on the condition you specify, all in one instruction, without affecting A, X, or Y. All the 65c02's being produced today have those. Unfortunately they're only for ZP. They were apparently made for '02-based microcontrollers that had their I/O in ZP, which is not the case with most designs that have the I/O separate from the processor. These instructions are the only ones that are left out of the '816, as it needed the space in the op code table which is full with things that are more valuable.

Another failure of the 65c816! Everybody in the 6502 community considered these to be a great extension --- but they weren't provided in the 65c816 --- wtf???

Everybody?? I never used them, because they were limited to ZP, and I never had my I/O in ZP. Jeff Laughton (forum name Dr Jefyll) has even faster I/O for the 65c02, trapping single-cycle unused op codes, at http://wilsonminesco.com/6502primer/potpourri.html#Jeff . Rather than use SMB with 5 cycles for example, you can do it in a single cycle with his techniques, i.e. five times as fast. The '816 does not offer any single-cycle unused op codes, but the 2-cycle WDM instruction with its one-byte operand leaves more possibilities, which he's developing at the moment. They're not memory-mapped, so the contents of the direct-page and bank registers are irrelevant. It does require external logic, but keep reading:

Quote:
I don't see a problem with these only working in zp. An FPGA processor will have its I/O built-in and can put the I/O ports anywhere in memory --- zp is the best place for the I/O ports because zp addressing is faster than extended addressing

True—although if you're going to use programmable logic, Jeff's improvements become much faster I/O than using I/O ICs in ZP with standard instructions.

Quote:
--- for example, the 6510 used in the Commodore-64 had two I/O ports at the bottom of zp (used in the Commodore-64 for memory-bank switching).

and I just found out the 6502 Atari computers commonly used a banking scheme that allowed access to a megabyte of memory through a 16K window at $4000-7FFF. (I don't know where they put the bank-selection output though.) This came up because of Jonathan Halliday's impressive progress on his preemptive multitasking GUI OS for Atari 6502 computers, shown at https://www.youtube.com/watch?v=T14dL9M ... e=youtu.be where he runs it at 1.79MHz. (I mention the speed because someone was sure it was on a much faster emulator; but no, he says it's 1.79MHz.) The Wikipedia article on ARM says the 6502 was not fast enough to do a GUI; but Jon is proving them wrong, doing not only a GUI, but preemptive multitasking (although of course if he had to do it with 1600 dots across the monitor, there'd be a problem).
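On the banking scheme mentioned above, the arithmetic works out to 64 banks of 16K each. Here is a rough Python sketch of how a bank number plus a CPU address in the window would map to an address in the 1MB extended memory. How the bank number actually reaches the hardware is not something I know, so the bank argument here is purely an assumption, as are the names WINDOW_BASE, WINDOW_SIZE, NUM_BANKS and extended_address.

```python
WINDOW_BASE = 0x4000                      # start of the 16K window
WINDOW_SIZE = 0x4000                      # 16K, i.e. $4000-$7FFF
NUM_BANKS = (1 << 20) // WINDOW_SIZE      # 1MB / 16K = 64 banks


def extended_address(bank, cpu_addr):
    """Map (bank, CPU address inside the window) to a 20-bit address.

    'bank' stands in for whatever the real bank-select register holds;
    that mechanism is assumed, not taken from any Atari documentation.
    """
    if not 0 <= bank < NUM_BANKS:
        raise ValueError("bank out of range")
    if not WINDOW_BASE <= cpu_addr < WINDOW_BASE + WINDOW_SIZE:
        raise ValueError("address is outside the banked window")
    return bank * WINDOW_SIZE + (cpu_addr - WINDOW_BASE)
```

So bank 0 at $4000 is extended address 0, and bank 63 at $7FFF is the last byte of the megabyte.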

Quote:
Multiplication is primarily needed for motion-control --- that can't be done with table look-up.

An 8-by-8-bit multiply can be looked up directly, because the multiplication table has only 65,536 16-bit answers. For more than 8 bits, you can still use the table to speed up the process, although it will no longer be a single step.
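To illustrate, here is a minimal sketch in Python (not 6502 code) of both cases: a direct 8x8 look-up, and a 16x16 multiply assembled from four 8x8 look-ups plus shifts and adds. The names MUL8, mul8 and mul16 are made up for the example, and a real 6502 version would lay out the table and the partial-product additions quite differently.

```python
# 65,536-entry table: entry (a << 8) | b holds the 16-bit product a * b.
MUL8 = [a * b for a in range(256) for b in range(256)]


def mul8(a, b):
    """8x8 -> 16-bit multiply by a single table look-up."""
    return MUL8[(a << 8) | b]


def mul16(x, y):
    """16x16 -> 32-bit multiply from four 8x8 look-ups.

    x = xh*256 + xl and y = yh*256 + yl, so
    x*y = xl*yl + (xl*yh + xh*yl)*256 + xh*yh*65536.
    """
    xl, xh = x & 0xFF, x >> 8
    yl, yh = y & 0xFF, y >> 8
    return (mul8(xl, yl)
            + ((mul8(xl, yh) + mul8(xh, yl)) << 8)
            + (mul8(xh, yh) << 16))
```

The 16-bit case costs four look-ups and a few additions instead of one look-up, which is what "no longer a single step" means here.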

Quote:
BTW: I'm familiar with the PIC16. That thing was a PITA to program because everything was going through the W register, and it had a funky memory-bank system. The 65c02 was much easier to program, but it didn't run very fast. The PIC24 is easier than both though, as it has 20 16-bit registers and orthogonal addressing-modes --- it is really the best 16-bit processor available.

Yep, the PIC16's processor is a pain in every respect, and I can make a long list. I have used it for every reason except that decrepit processor Microchip used to brag about. The microcontroller family was inexpensive, came with loads of onboard microprocessor support (power-up timer, brown-out reset, watchdog, etc.), had lots of variations of I/O modules, was programmable on the workbench (unlike the mask-programmed 65134 and 65265), and was available in any quantity, in stock at many distributors. Those, and not the processor itself, are why I used them. Over the years, I developed a lot of macros that hid a lot of the ugly internal details so I wouldn't curse it so much. Its performance is not particularly good though. For the most part, I found that it took the PIC twice as many instructions, and twice as many clocks, to do a job as the 6502, if the PIC could do it at all; and they both run at 20MHz, so that means the 6502 is generally about twice as fast.

Quote:
Testra built their own processor called the MiniForth which was built on the Lattice isp-1048 PLD. I wrote the assembler/simulator/compiler which I called MFX (MiniForth Cross-compiler). I was told at this time that multiplication was the bottleneck in motion-control.

That sounds very attractive. Were other Forth processors considered before making that decision, like the RTX2000 or SC32?

Quote:
I do think that the 65c02 has a future --- but high-speed motion-control is beyond its ken --- the 65c02 is going to be used for more pedestrian applications, and low-cost will be its primary feature.

I expect so, but it's not what Bill Mensch is interested in. One of his licensees is running a 65c02 at over 200MHz, and Mensch estimates that in a 20nm geometry, it could hit 10GHz. (They expect 10nm by next year, which would further increase that speed.) Bill Mensch however has rejected making WDC a public corporation, and that has both some pluses and some minuses.

Quote:
GARTHWILSON wrote:
Quote:
It is a mistake to get carried away with nostalgia and limit oneself to the 65c02 of 30 years ago --- it is better to have an upgraded modern processor, which is yet based on the 65c02.

So...you're talking about the ARM. Bill Mensch himself did some consulting to help the ARM designers.

I never heard about this happening.

"Consulting" was probably a poor choice of words. It is my understanding that he did not design any part of the ARM, but they visited him and asked his advice about various things, including after they went back home, and he was happy to help, not viewing their ambitions as any threat to his own. One of the interviews where he mentions this is at http://ataripodcast.libsyn.com/antic-in ... -6502-chip . He mentions benchmarks at about 23 minutes into it, and ARM at about 39:30-40:10. Sophie Wilson was amazing. Acorn's CEO at the time, Hermann Hauser, recalls that "while IBM spent months simulating their instruction sets on large mainframes, Sophie did it all in her brain." Now that is a real engineer!

Quote:
and the name changed to (Advanced RISC Machine)

I think Bill Mensch said in the interview linked above that they quit saying it stood for anything, because it wasn't really a RISC.

Quote:
its primary feature is fast ISRs because it has those shadow-banks of registers

How well does it handle nested ISRs, for when you want to allow the servicing of one interrupt to be interrupted by a higher-priority, quick-to-service interrupt, something as simple as IRQ and NMI? Z80 apparently had a second register set to speed up interrupt response, but this was of no value when you nest interrupts.

Quote:
Anyway, the 65c816 was a big failure. It was used in the Apple-IIgs which was always considered to be a poor-man's Mac and hence didn't last long.

Apple intentionally ran it at only 2.8MHz, well below its potential, to keep it from making the 68000 look bad. Apple killed the IIgs.

Quote:
The only real success of the 65c816 was the Super Nintendo. I remember in the late 1990s that WDC had a bad reputation because they focused entirely on supporting Nintendo which was their cash cow, and they didn't provide much support to their other customers. Their attitude was: "Unless you are buying chips by the truckload then we don't have time to talk to you." Then the Super Nintendo became obsolete, and the 65c816 did too.

WDC now makes most of their money licensing the IP. In the mid- to late-90's, they voluntarily (without my requesting them) sent me a couple of 65802's; so I have to wonder what the whole story is about not having to pay attention to anyone who didn't buy a truckload of them.

In 1993 my assignment was to develop the Cadillac of private-aircraft intercoms. It became totally out of the question to control all the features with discrete logic, so in the early stages of development, I considered several different microcontrollers for the job. Many had severe limitations for our application. One manufacturer actually told us that we basically were too small for them to be interested in our business. We settled on one of the Motorola 68HC11's. Unfortunately the auto industry had the version we needed with EEPROM and lots of EPROM all locked up and on allocation, and not easy to buy. As it ended up, we just went back to a discrete 65c02 computer on its own board. Our cost for parts and labor was approximately what the HC11 was going to cost anyway—it just took a little more room. We sold that for 13 years, and never had any trouble getting parts, in spite of our small volume.

The fact remains that my '816 Forth runs two to three times as fast as my '02 Forth at a given clock speed. I'd say that's not bad at all.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

