
All times are UTC




Posted: Mon Jun 11, 2012 1:26 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
C R Bond's calc65 floating point arithmetic and transcendental function package:
http://www.crbond.com/calc65.htm
Uses 8 bytes and BCD to give 12 decimal digits and 3 exponent digits.
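
For anyone wondering how 12 mantissa digits and 3 exponent digits fit in 8 bytes: packed BCD stores two digits per byte, so the 15 digits take 15 of the 16 available nibbles and one nibble is left over. Here is a minimal Python sketch of that arithmetic. The field order and the use of the spare nibble for sign flags are assumptions made for illustration only, not calc65's documented layout, and `pack_bcd` / `unpack_bcd` are hypothetical names.

```python
def pack_bcd(mantissa: str, exponent: int, negative: bool = False) -> bytes:
    """Pack 12 mantissa digits and a 3-digit exponent into 8 bytes.

    Hypothetical layout (NOT taken from calc65):
      nibble 0      sign flags (bit 3 = mantissa sign, bit 2 = exponent sign)
      nibbles 1-3   exponent digits
      nibbles 4-15  mantissa digits
    """
    if len(mantissa) != 12 or any(c not in "0123456789" for c in mantissa):
        raise ValueError("mantissa must be exactly 12 decimal digits")
    if abs(exponent) > 999:
        raise ValueError("exponent must fit in 3 decimal digits")
    flags = (8 if negative else 0) | (4 if exponent < 0 else 0)
    digits = f"{abs(exponent):03d}" + mantissa
    nibbles = [flags] + [int(c) for c in digits]        # 16 nibbles in total
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack_bcd(packed: bytes):
    """Inverse of pack_bcd: returns (mantissa, exponent, negative)."""
    nibbles = []
    for byte in packed:
        nibbles += [byte >> 4, byte & 0x0F]
    flags = nibbles[0]
    exponent = int("".join(str(n) for n in nibbles[1:4]))
    if flags & 4:
        exponent = -exponent
    mantissa = "".join(str(n) for n in nibbles[4:])
    return mantissa, exponent, bool(flags & 8)


print(pack_bcd("314159265359", 0).hex())
```

The point is only the size budget: 1 + 3 + 12 nibbles is exactly 16 nibbles, or 8 bytes.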

Via posting by "Repose" on CSDb forum
http://csdb.dk/forums/?roomid=11&topicid=91763

Edit: attached an archive to help anyone having trouble with the download:
Code:
Archive:  FLTPT65.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
    28234  2012-06-16 19:34   FLTPT65.h65
    49608  2012-06-16 19:34   FLTPT65.h6x
     3021  2012-06-16 19:34   FLTPT65.log
   222436  2012-06-16 19:34   FLTPT65.lst
    44244  2012-06-16 19:34   FLTPT65.xrf
    87112  2008-03-30 02:46   fltpt65.cba
    91247  2008-03-30 02:46   fltpt65.cba.orig
---------                     -------


Attachments:
File comment: source, converted and assembled, including listing from CBA65 assembler (run on Linux in WINE)
FLTPT65.zip [128.33 KiB]
Downloaded 331 times


Last edited by BigEd on Sat Jun 16, 2012 6:41 pm, edited 1 time in total.
Posted: Mon Jun 11, 2012 7:05 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Thanks for frequently posting this kind of thing. It's always good to publicize the availability of resources like this. I just added it to my links page.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Tue Jun 12, 2012 12:09 am
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
Yes, thanks BigEd and Charles R. Bond.
Excellent write-up by Charles on his CBA65. Hashing mnemonics is the most interesting chapter in the .pdf. I'll read this .pdf many times as I would like to eventually tackle an assembler/disassembler for the 65Org16.b. There are many clues here.

Looks like Bruce used a similar technique for C'mon. Still reading...

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Tue Jun 12, 2012 5:30 am
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Ah yes, he's Charles - also Chuck - I couldn't find a name yesterday!

It's worth keeping an eye on "Repose" on that other forum, presently pursuing fast multiplication routines.


Posted: Tue Jun 12, 2012 6:40 am
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
ElEctric_EyE wrote:
Yes, thanks BigEd and Charles R. Bond..
Excellent write-up by Charles on his CBA65. Hashing mnemonics is the most interesting chapter in the .pdf. I'll read this .pdf many times as I would like to eventually tackle an assembler/disassembler for the 65Org16.b. There are many clues here.

I read it, it's a novel idea.

But, really. Maybe if you're going to implement it on "heritage hardware" that kind of optimization will offer some value. It's not particularly space efficient, for example. It's 56 unique mnemonics on a 6502, and they're really short. But except on the slowest hardware, I wonder how noticeable the benefit of the hashing function really is.

I have modern hardware (well, 6 years modern hardware). My assembler assembles the Fig Forth listing, 4000 lines, 104K source file. It reads, does both passes, writes the listing and generates the 18K Intel Hex output in <2 seconds. And this is in routine Java, not a system famous for its lightning fast startup times. It barely has time to JIT anything up.

I think the impact of improved instruction lookup on the overall assembly process is low. If you don't have the memory for a hash, a binary search would do you just fine. Assemblers tend to be dominated by I/O, and modern memory sizes are good at fixing that problem.
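
To put a number on the binary search alternative: with 56 sorted three-letter mnemonics, a lookup needs at most 6 probes. A minimal Python sketch, assuming the standard NMOS 6502 mnemonic set (`find_mnemonic` is a hypothetical helper name, not anything from CBA65 or from my assembler):

```python
from bisect import bisect_left

# The 56 documented NMOS 6502 mnemonics, kept sorted for binary search.
MNEMONICS = tuple(sorted("""
ADC AND ASL BCC BCS BEQ BIT BMI BNE BPL BRK BVC BVS CLC
CLD CLI CLV CMP CPX CPY DEC DEX DEY EOR INC INX INY JMP
JSR LDA LDX LDY LSR NOP ORA PHA PHP PLA PLP ROL ROR RTI
RTS SBC SEC SED SEI STA STX STY TAX TAY TSX TXA TXS TYA
""".split()))


def find_mnemonic(text):
    """Return the table index of a mnemonic, or None if it is not one."""
    key = text.upper()
    index = bisect_left(MNEMONICS, key)      # O(log n) probes
    if index < len(MNEMONICS) and MNEMONICS[index] == key:
        return index
    return None


print(len(MNEMONICS), find_mnemonic("lda"), find_mnemonic("XYZ"))
```

The returned index can double as the key into whatever per-instruction table the assembler keeps, so no extra bucket table is needed.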


Posted: Tue Jun 12, 2012 8:33 am
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
BigEd wrote:
Ah yes, he's Charles - also Chuck - I couldn't find a name yesterday!...

You mean Chuck who posts on our forum? Very cool!


Hmmm, Whartung, you have a point there with modern systems that have large amounts of RAM. But think of a 'modern system' on a single FPGA. Block memory inside the FPGA is somewhat limited, so any program that is smaller is better. Not to mention most likely faster! We've gotten the best 6502 cores to speeds in excess of 100MHz.

BTW, good luck with your Fig-Forth project. It's interesting to follow your progress.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Tue Jun 12, 2012 8:58 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Quote:
But, really. Maybe if you're going to implement it on "heritage hardware" that kind of optimization will offer some value. It's not particularly space efficient, for example. It's 56 unique mnemonics on a 6502, and they're really short. But save on the slowest hardware I wonder how truly noticeable the benefit of the hashing function is.
[...]
I think the impact of improved instruction lookup to the overall assembly process is low. If you don't have the memory for a hash, a binary search would do you just fine. Assemblers tend to be dominated by I/O, modern memory sizes are good at fixing that problem.

Hashing, trees and caches for improving Forth compilation speed are discussed in the "dictionary hashing" topic under "Forth" at viewtopic.php?f=9&t=555 . The same stuff applies.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Tue Jun 12, 2012 5:09 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
ElEctric_EyE wrote:
BigEd wrote:
Ah yes, he's Charles - also Chuck - I couldn't find a name yesterday!...

You mean Chuck who posts on our forum?

No, not ChuckT. Just that there's a doc on his site where he refers to himself as Chuck.
Cheers
Ed


Posted: Wed Jun 13, 2012 2:48 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8506
Location: Midwestern USA
ElEctric_EyE wrote:
Yes, thanks BigEd and Charles R. Bond..
Excellent write-up by Charles on his CBA65. Hashing mnemonics is the most interesting chapter in the .pdf. I'll read this .pdf many times as I would like to eventually tackle an assembler/disassembler for the 65Org16.b. There are many clues here.

Looks like Bruce used a similar technique for C'mon. Still reading...

When I was developing the assembler in my POC's ROM I gave some thought to hashing mnemonics using an older technique that I had learned in the 1980s. However, I concluded that it was mostly wasted effort and might not be space efficient, something to consider when working within the confines of an 8 KB ROM. So I decided to reuse a method I had devised for an older 65C02 assembler, which involved three tables, two being the 3:2 encoded form of the mnemonics and the third being a single byte that described the characteristics of the instruction being assembled, such as operand size, addressing mode, etc. The same tables are also consulted to disassemble instructions, so I'm getting double duty from them.
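
For readers who haven't met 3:2 encoding: each letter of a three-character mnemonic fits in 5 bits, so the three letters pack into 15 bits, i.e. two bytes instead of three. The Python sketch below shows the idea. The bit order and the A=1 ... Z=26 letter codes are assumptions for illustration and not necessarily what the POC ROM uses; `encode_mnemonic` and `decode_mnemonic` are hypothetical names.

```python
def encode_mnemonic(mnemonic: str) -> bytes:
    """Pack three letters (5 bits each) into a 16-bit big-endian value."""
    text = mnemonic.upper()
    if len(text) != 3 or any(not ("A" <= ch <= "Z") for ch in text):
        raise ValueError("expected exactly three letters A-Z")
    value = 0
    for ch in text:
        value = (value << 5) | (ord(ch) - ord("A") + 1)   # A=1 ... Z=26
    return value.to_bytes(2, "big")


def decode_mnemonic(packed: bytes) -> str:
    """Recover the three letters, as a disassembler would."""
    value = int.from_bytes(packed, "big")
    return "".join(chr(((value >> shift) & 0x1F) + ord("A") - 1)
                   for shift in (10, 5, 0))


print(encode_mnemonic("LDA").hex(), decode_mnemonic(encode_mnemonic("JSR")))
```

Splitting the two bytes into separate high-byte and low-byte tables lets an 8-bit CPU index both with the same register, and the decode direction is what makes the tables reusable for disassembly.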

A 65C816 assembler/disassembler is complicated a bit by the fact that immediate mode instructions can have either a one or two byte operand, and other instructions, such as JMPs and JSRs, can have two or three byte operands. After considering both hashing and table lookups, I determined that the table method used no more code space and may actually be faster. From my point of view, hashing would only make sense if trying to assemble on a slow machine, such as a C-64.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Thu Jun 14, 2012 4:49 am, edited 1 time in total.

Posted: Thu Jun 14, 2012 4:41 am
Joined: Sat Dec 13, 2003 3:37 pm
Posts: 1004
GARTHWILSON wrote:
Hashing and trees and caches for improving Forth compilation speed is discussed in the "dictionary hashing" topic under "Forth" ...

Sure, but a Forth has an ever expanding, dynamic vocabulary that can reach into the hundreds (if not more) of variable length words. Plus the typical word list is fragmented across memory (not an issue with 6502s per se). Finally, Forths tend to (but not necessarily as we all know) be hosted on the target, slow, low resource machine, not cross-compiled off a workstation.

But for a static list of 56, 3 character opcodes -- 168 bytes? I don't think the hashing will have that dramatic of an impact on overall performance of an assembler, especially on a 32 bit machine rated in several MHz. This hash happens to perfectly distribute the opcodes across 255 potential buckets. However, it's not extensible. Add a new opcode, and the entire algorithm may need to be redone because it causes a conflict. Who knows.
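
To illustrate the extensibility worry, here is a toy Python sketch. The hash (sum of the ASCII codes, modulo 256) is deliberately simplistic and is NOT the hash from the CBA65 write-up; the seven-mnemonic subset and the helper name `collisions` are likewise made up for the example. The hash is collision-free for the subset, and then adding a single mnemonic breaks it.

```python
def toy_hash(mnemonic: str) -> int:
    """Toy hash for illustration only: sum of character codes mod 256."""
    return sum(ord(ch) for ch in mnemonic) % 256


def collisions(mnemonics, hash_fn=toy_hash):
    """Return {bucket: [mnemonics]} for every bucket holding more than one."""
    buckets = {}
    for name in mnemonics:
        buckets.setdefault(hash_fn(name), []).append(name)
    return {b: names for b, names in buckets.items() if len(names) > 1}


BASE = ["LDA", "STA", "ADC", "SBC", "JMP", "JSR", "TAX"]

print(collisions(BASE))             # no collisions for this subset
print(collisions(BASE + ["TXA"]))   # TXA lands in the same bucket as TAX
```

A real perfect hash would be chosen more carefully, but the failure mode is the same: the function is tuned to one fixed key set, so any addition means re-checking the whole table and possibly reworking the function.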

Back in the day, three of the criteria that C compilers were judged on were compile speed, executable speed, and executable size. But once the PC/AT machines came out, let alone the 386s, and we started pushing past 12 MHz, compile speed fell off the map, mostly because it had been dominated by slow I/O, and the new, faster CPUs arrived alongside faster hard drives. Folks were more interested in debuggers and IDEs by that point anyway.

But optimizing opcode lookup? On a 32 bit machine?? And long word aligning the source code to make mnemonics fit properly in a long?? He still needs to look up pseudo ops, and possibly macros as well.

Smells of a pre-optimization to me. Sort the list and binary search it and be done if you're memory starved.

Like I said, mundane Java, large source file, <2s including startup and the full listing. No doubt it consumes several MB of RAM during a run, but not enough to even kick off the garbage collector before it's finished. I can assure you I will not be optimizing this assembler for performance any time soon.


Posted: Thu Jun 14, 2012 5:10 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8506
Location: Midwestern USA
whartung wrote:
GARTHWILSON wrote:
Hashing and trees and caches for improving Forth compilation speed is discussed in the "dictionary hashing" topic under "Forth" ...

Sure, but a Forth has an ever expanding, dynamic vocabulary...But for a static list of 56, 3 character opcodes -- 168 bytes?

I'd agree here. Comparing Forth's vocabulary to an MPU's instruction set is not really valid. The 65C02 has 69 mnemonics and the 65C816 has 92, which would consume 207 and 276 bytes, respectively, if stored in ASCII form. When reduced to a 15 bit sequence, 138 or 184 bytes would be consumed. In either case, as you point out, the count is static and hardly a burden on memory. A static array is easily searched with a simple loop. Hashing is way too much like sending out a battleship to sink a rowboat.
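
The byte counts above are easy to check, and the "simple loop" really is simple. A small Python sketch, using the mnemonic counts quoted in this post; `table_size` and `linear_find` are hypothetical helper names, and the five-entry table is only a stand-in for the full list.

```python
# Mnemonic counts as quoted in the post above.
MNEMONIC_COUNTS = {"65C02": 69, "65C816": 92}


def table_size(count: int, packed: bool) -> int:
    """Bytes needed: 3 per mnemonic as ASCII, 2 when packed into 15 bits."""
    return count * (2 if packed else 3)


def linear_find(table, mnemonic):
    """Scan a static table front to back; return the index or None."""
    for index, entry in enumerate(table):
        if entry == mnemonic:
            return index
    return None


for cpu, count in MNEMONIC_COUNTS.items():
    print(cpu, table_size(count, packed=False), table_size(count, packed=True))

print(linear_find(["ADC", "AND", "ASL", "BCC", "BCS"], "ASL"))
```

Even in the worst case the loop touches fewer than a hundred entries per source line.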

Quote:
I don't think the hashing will have that dramatic of an impact on overall performance of an assembler, especially on a 32 bit machine rated in several MHz. This hash happens to perfectly distribute the opcodes across 255 potential buckets. However, its not extensible. Add a new opcode, and the entire algorithm may need to be redone because it causes a conflict. Who knows.

It isn't likely that any new mnemonics are going to be added to the current crop of 65xx processors. The '816 pretty much pushed that to the limit, as it uses all 256 possible opcodes. The WDM mnemonic is a place-holder for a possible "escape" sequence that would allow for future two byte opcodes. However, considering that the '816 has been around for a long time, as has its designer (he's my age, which is past Social Security full retirement age), I'd be surprised if the WDM instruction ever gets used for anything. In fact, I'd be surprised if WDC develops anything new anytime in the future. Once Bill Mensch decides to call it quits Western Design Center will probably be liquidated and cease to exist.

BTW, you keep saying "opcode" when you mean "mnemonic." Not to be pedantic about it, but they are two different things.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Thu Jun 14, 2012 5:23 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Quote:
Sure, but a Forth has an ever-expanding, dynamic vocabulary that can reach in to the hundreds (if not more) of variable length words.

Yes, my '816 Forth has many hundreds of primitives alone, not including secondaries.

Quote:
Plus the typical word list is fragmented across memory (not an issue with 6502s per se). Finally, Forths tend to (but not necessarily as we all know) be hosted on the target, slow, low resource machine, not cross-compiled off a workstation.

But interestingly, that makes them faster for development, because you don't need to re-compile the whole application every time you make a little change, transfer it to flash or whatever, re-start the target, etc. You can type as little as a single line, or even just a word, and send it to the target and have it act on it immediately without necessarily even pausing a background job it's doing. I've even changed an ISR between interrupts that were coming at over 40,000 per second on my 6502 workbench computer, not pausing the interrupts. (To do that you get the new ISR ready and then change the vector to point to the new one.) Even a fast PC can't do that.

I definitely agree that assemblers and compilers running on modern PCs are plenty fast for what they do. I was just linking to a good discussion on various techniques for speeding up compilation on a not-so-fast machine, where I really liked the cache idea. (I have not implemented it yet though.)

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Fri Jun 15, 2012 8:56 am
Joined: Mon Mar 02, 2009 7:27 pm
Posts: 3258
Location: NC, USA
GARTHWILSON wrote:
...You can type as little as a single line, or even just a word, and send it to the target and have it act on it immediately without necessarily even pausing a background job it's doing. I've even changed an ISR between interrupts that were coming at over 40,000 per second on my 6502 workbench computer, not pausing the interrupts. (To do that you get the new ISR ready and then change the vector to point to the new one.) Even a fast PC can't do that....

If I may interject. This does sound impressive... But 'fast PC' is a very subjective term. There must be a point where a PC can do that, just for argument's sake.

_________________
65Org16:https://github.com/ElEctric-EyE/verilog-6502


Posted: Fri Jun 15, 2012 4:21 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8506
Location: Midwestern USA
BigEd wrote:
C R Bond's calc65 floating point arithmetic and transcendental function package:
http://www.crbond.com/calc65.htm
Uses 8 bytes and BCD to give 12 decimal digits and 3 exponent digits.

Have you tried that link? All I get is a screen full of gibberish when I attempt to access the source code.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Fri Jun 15, 2012 5:09 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
BigDumbDinosaur wrote:
BigEd wrote:
C R Bond's calc65 floating point arithmetic and transcendental function package:
http://www.crbond.com/calc65.htm
Uses 8 bytes and BCD to give 12 decimal digits and 3 exponent digits.

Have you tried that link? All I get is a screen full of gibberish when I attempt to access the source code.

Yes, it's assembly source. Try saving to disk and viewing as a text file (when viewed in-browser the line-ends are not rendered):

Code:
        .print stats,xref,clip=76,csort=c,cycles
        .files h6x
;   fltpt7.cba -- floating point routines for 650X
;
;   (C) 1999 - 2008, C. Bond. All rights reserved.
;
;   v.1
;   This version includes add, subtract, multiply, divide, square
;   root, tangent and arctangent. The tangent and arctangent are
;   implemented as efficient BCD CORDIC algorithms. Sine, cosine
;   arcsine and arccosine are also provided.
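
As background on the CORDIC mention in that header: the idea is to rotate a vector toward the x axis through a fixed table of angles and accumulate the total rotation, which yields the arctangent. The Python sketch below uses the common binary variant (steps of 2^-i, floating point, with `math.atan` standing in for a precomputed constant table). It is only an illustration of the general technique and is not Bond's BCD CORDIC code; `cordic_atan` is a hypothetical name.

```python
import math


def cordic_atan(y: float, x: float = 1.0, iterations: int = 40) -> float:
    """Approximate atan(y / x) for x > 0 with vectoring-mode CORDIC."""
    angle = 0.0
    for i in range(iterations):
        step = 2.0 ** -i
        if y > 0:
            # Rotate clockwise by atan(step), driving y toward zero.
            x, y = x + y * step, y - x * step
            angle += math.atan(step)
        else:
            # Rotate counter-clockwise by atan(step).
            x, y = x - y * step, y + x * step
            angle -= math.atan(step)
    return angle


print(cordic_atan(1.0), math.pi / 4)
```

On real hardware the multiplications by `step` become shifts (or digit shifts in a decimal version), which is why the method suits small processors without a hardware multiplier.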

