All times are UTC




Posted: Mon Feb 24, 2014 7:47 am

Joined: Mon Feb 24, 2014 7:28 am
Posts: 2
RE> I've thought about this some, and come to the conclusion that a 32 bit proc. would be somewhat different than the '02 because it's too expensive to waste 32 bit for an 8 bit opcode. I'm assuming the biggest factor for wanting a 32 bit 6502 is the extended address range and more ops. It can't be performance because there are other architectures that make better use of 32 bits for performance. Having already come up with my own 32 bit incarnation which uses byte opcodes, for a 32 bit opcode I'd suggest the following:

Over on the net news group comp.arch I started a now extensive 65832 thread and wondered if the hardware guys here would be interested in contributing. (I am a software guy.)
https://groups.google.com/forum/#!forum/comp.arch

Basically I think that if you make the Direct Page a register file and make some other changes, you could end up with a CPU that would easily be faster than a RISC chip at the same speed, and do so with less heat, complexity, and cost, since L1 access (with its RAW and other hazards) is expensive compared to Direct Scratch.


Posted: Mon Feb 24, 2014 10:46 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Welcome, ggtgp.

ggtgp wrote:
Over on the net news group comp.arch I started a now extensive 65832 thread and wondered if the hardware guys here would be interested in contributing. (I am a software guy.)
https://groups.google.com/forum/#!forum/comp.arch

That same head post was posted somewhere else too, right? I can't remember where, but I'm sure I've read it.

Since I'm not registered to post there, I'll respond here to things there in the order I find them. You can direct them to this post if you like. (It's perhaps too relaxing to sit here late at night and do this, half-asleep.) Most things here are ones that 6502.org old-timers have heard from me before.

The 65Org32 however has all registers 32 bits wide, including the direct-page register, so it covers the entire 4GW memory span but serves as an offset. The exception might be the status register, as making use of 32 bits of status might take more imagination than I have.

I have no real desire for floating-point hardware myself, having found through experience that nearly everything can be done in fixed-point or scaled-integer more efficiently. The "how" is discussed at the beginning of my web page on "Large look-up tables for hyperfast, accurate 16-Bit scaled-integer math, including trig & log functions." I realize there are a few legitimate applications for floating-point, and I won't hold anything against those who need it, but my own experience says more and more that it is seldom necessary.

The matter of the two stacks mentioned at your link is not a problem at all for the 6502. The return stack is the normal hardware stack in page 1, and the data stack goes in page 0, with X as the pointer, taking advantage of the added ZP addressing modes. It's almost like the 6502 was made for it, except that with 16-bit cells, the performance is not as good with an 8-bit processor as it is with a 16-bit one. The 65816 mostly qualifies, although the data bus is still 8-bit. My '816 Forth runs two to three times as fast as my '02 Forth at a given clock speed. The 816's stack-relative addressing is nice for many operations, not just in Forth. I can see a use for two hardware stacks, but for a different reason. Actually, that would get Forth's DTC NEXT down to 6 cycles.

I'm glad to see Mike is on there, discussing his 65m32. He did write however:
Quote:
When the 6502 needed to operate on 16-bit data and addresses, it had to use about twice as many instructions as the 8-bit versions of the same.

I'd say it's considerably worse than twice. See my example at viewtopic.php?f=9&t=1505&p=9705#p9705 . [Edit: I see he mentioned me! :D ]

Anton wrote:
Quote:
Before me I have the code for fig-Forth's NEXT (the interpreter dispatch loop, performed once for each virtual machine instruction). On the 6502 it's 12 instructions and 39 cycles and on the 6809 it's 2 instructions and 14 cycles. Sure, the 6502 has a lower CPI for this sequence (3.25 vs. 7), but overall it's slower by a factor >2.7, so yes, it was significantly slower clock for clock (and clock rates were similar, so it was also slower in seconds).

The 6809 did have a nice way to do NEXT. The 65816 is also much better than the 6502 in that regard. Note however that the 6502 (and '816 too) has achieved clock rates that are astronomical compared to the 6809's.

Regarding the number of registers: As I'm reading there, 8 is getting talked about a lot. The 65816 already has 9: C (splittable into A and B), X, Y, S, P, DP, PB, DB, and PC, although they're not general-purpose. More general-purpose ones help with compilers, but don't seem to be particularly useful for assembly. As BigEd here observed, "With 6502, I suspect more than one beginner has wondered why they can't do arithmetic or logic operations on X or Y, or struggled to remember which addressing modes use which of the two. And then the intermediate 6502 programmer will be loading and saving X and Y while the expert always seems to have the right values already in place." I do use Forth a lot in my work, and the 6502/816 do it well; but I have also brought the level of my assembly language way, way up, using program-structure macros to incorporate things like IF...ELSE_...END_IF, BEGIN...WHILE...REPEAT, FOR...NEXT, CASE, etc., in assembly, which dramatically improves programmer productivity, while retaining the performance advantage of assembly.

As for using registers to pass arguments, if I ever get my 6502 treatise on stacks finished, it will have a large body of code for passing inputs and outputs on a ZP data stack, without particularly using Forth (although if you want to write a STC Forth with it, much of the work will already be done for you). The data-stack method gets rid of the conflicts with what registers are being used by what and accidentally overwriting something that is still needed.

George wrote:
Quote:
Using DP pseudo-address registers on the 65816 was a problem in itself because there was no 24-bit indirect addressing mode: you had to manually set the data bank register to the high byte of the address [which could be done only through the stack] before doing a 16-bit indirect operation

There were, but the pointer was in DP, like LDA[DP] (op code A7) and LDA[DP],Y (op code B7). Admittedly, there was not every combination and permutation of indirects and indexings though. If you're not doing it often enough for it to hurt performance significantly, make a macro to hide the recurring internal details.

and:
Quote:
The 6502 had only the 256 bytes in bank 1. That was too limiting for most code, so many programs were forced to maintain a software managed stack using a pseudo-register in the zero page.

From the program tips page of my 6502 primer: A common criticism of the 6502 is that the stack space is so limiting. A few higher-level languages (notoriously Pascal) do put very large pieces of data and entire functions and procedures on the stack instead of just their addresses. For most programming though, the 6502's stack is much roomier than you'll ever need. When you know you're accessing the stacks constantly but don't know what the maximum depth is you're using, the tendency is to go overboard and keep upping your estimation, "just to be sure." I did this for years myself, and finally decided to do some tests to find out. I filled the 6502 stack area with a constant value (maybe it was 00-- I don't remember), ran a heavy-ish application with all the interrupts going too, did compiling, assembling, and interpreting while running other things in the background on interrupts, and after a while looked to see how much of the stack area had been written on. It wasn't really much-- less than 20% of each of page 1 (return stack) and page 0 (data stack). This was in Forth, which makes heavy use of the stacks. The IRQ interrupt handlers were in Forth too, although the software RTC (run off a timer on NMI) was in assembly language.
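
The fill-and-inspect test described above can be sketched in a few lines. This is only a toy model in Python, not the original Forth test: the 256-entry `page` list, the `FILL` value, and the function name `high_water_mark` are all assumptions made for illustration.

```python
FILL = 0x00  # assumed fill value; the post does not recall what was actually used

def high_water_mark(page, fill=FILL):
    """Return how many bytes of a descending stack page have been written.

    The stack is assumed to start at the highest address and grow downward,
    as the 6502 hardware stack does in page 1.
    """
    untouched = 0
    for value in page:              # scan upward from the lowest address
        if value != fill:
            break                   # first byte that was overwritten
        untouched += 1
    return len(page) - untouched

# Hypothetical run: fill the page, then pretend 40 bytes were pushed.
page = [FILL] * 256
for addr in range(255, 255 - 40, -1):
    page[addr] = 0x80 | (addr & 0x7F)   # any value that differs from FILL

used = high_water_mark(page)
print(f"{used} bytes used, {100 * used / len(page):.1f}% of the page")
```

One caveat of this method: if the deepest byte ever pushed happens to equal the fill value, the depth is under-counted by that byte, so a fill value that is unlikely to be pushed is a safer choice.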

Quote:
Does 65816 code pay lots of TAX and TAY?

If you mean, "Does 65816 code tend to use the TAX and TAY instructions a lot?" I guess I would say no; but the only real project I've done on the '816 is my '816 Forth. There does need to be a way to do the transfer though.

Brett's advice to get rid of the "garbage op codes" should probably have some explanation as to why, for example that they take too many resources in programmable logic or whatever. I find the ones he has referred to to be very useful. Perhaps he has a different way of doing it in mind, which is quite possible. I do a lot of work with PIC16's though, and even with their incredibly limited instruction sets and pipelining, they still allow incrementing or decrementing memory in a single 4-clock instruction, including for indirects (through INDF).

From John Stavard:
Quote:
Nostalgia?

It isn't a 6502 if it doesn't run every piece of 6502 software ever written. Upwards compatibility is essential if the intent is for the chip to provide powerful capabilities, but also work in an old Apple IIgs or even an old Commodore-64, just adding extra abilities to those machines.

I think most of us have no interest in that. You can't drop it into a 6502 socket if there are 32 data lines and at least 24 (non-multiplexed) address lines; and even if you could, the old hardware won't run at dozens (or hundreds) of MHz. For myself, if a 32-bit '816 becomes available, I'll make a new computer and run all new code on it. As Mike says, the "look and feel" are important, and right there lies the experience investment I want to protect. I can still run the old code on old computers.

and:
Quote:
So basically I agree with you that the Direct Page is the future of computing!

However, I went and looked up the 65816. I see one big problem.

It uses ***ALL 256 OPCODES***.

And so there seems to be no strictly compatible way to add an instruction to switch to the new 32-bit mode. And, also, none of the new 32-bit features will be accessible from 16-bit mode, so much unlike the case where 8-bit Emulation Mode allows access to many of the new features of 16-bit mode!!

No doubt, though, the people thinking of such a project have already figured out a clever way around this problem, and so they don't need me to suggest one.

The 65Org32 has no 16-bit mode or 8-bit mode. A byte is 32 bits. You can still interface it to 8-bit peripherals, just as I also interfaced a 4-bit real-time clock to a 6502 in my earliest commercial design.

From Donbo:
Quote:
Frankly, despite fond memories I have of the 6502, I don't see the point of having a 32-bit version of it, regardless of how compatible it would be with the original 6502. The 6502 was great for its time but times have changed.

The main thing for me is handling larger data sizes. Even in the matter of a 32-bit loop counter for a looping structure, the 6502 takes something like 30 instructions (plus NEXT), whereas a 32-bit one would be as simple as DEX, BNE. I have code for the 32-bit DO LOOP and associated Forth words at http://wilsonminesco.com/Forth/32DOLOOP.FTH , and a code example showing the big difference in code length between 6502 and 65816 for handling 16-bit quantities at viewtopic.php?f=9&t=1505&p=9705#p9705.
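
To illustrate why a wide loop counter costs so much on an 8-bit machine, here is a small Python model. It is not the Forth code at the link above; `dec32_bytewise` and `dec32_register` are hypothetical names, and the model only mimics the byte-by-byte borrow handling an 8-bit CPU has to do, next to the single operation a 32-bit register allows.

```python
def dec32_bytewise(counter):
    """Decrement a little-endian 4-byte counter in place, one byte at a time.

    Returns True if the counter is still nonzero afterwards.
    """
    for i in range(4):
        counter[i] = (counter[i] - 1) & 0xFF
        if counter[i] != 0xFF:      # no borrow out of this byte, so stop here
            break
    return any(counter)             # testing for zero also touches all four bytes

def dec32_register(value):
    """The same decrement when the whole counter fits in one 32-bit register."""
    return (value - 1) & 0xFFFFFFFF

counter = [0x00, 0x00, 0x01, 0x00]  # 0x00010000, low byte first
dec32_bytewise(counter)
print(hex(int.from_bytes(bytes(counter), "little")))
print(hex(dec32_register(0x00010000)))
```

Each step in the byte-wise version corresponds to several 6502 instructions (load, decrement or subtract, store, test, branch), which is where the instruction count comes from.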

Quote:
Of course, that's available now, since there won't _be_ any future versions of the 6502, except for amateur efforts.

Probably true, no commercial ones; but the 8-bit 6502 is being produced in absolutely huge quantities today-- hundreds of millions of units per year. You probably have quite a few you didn't know about, under the hood of your car, behind the dashboard, in personal electronics, appliances, etc.

Quote:
Is there a reason to stop at 32 bits and not go all the way to 64 bits?

A big interest of mine is Forth, where the 16 bits I've been using for years is usually enough but sometimes not quite. 32 will definitely be enough. I'm using it for controlling stuff on the workbench, with little human I/O. If you do a fast Fourier transform (FFT) for example with 16-bit cells, you're limited to about 6 bits of input data precision on a 2,048-point complex FFT before you start having overflow problems. (An FFT lets you for example analyze noise and vibration, whether it's acoustical, machinery, etc.) The sizes of D/A and A/D converters and other things come into the picture, and 32 is definitely more than enough for the registers, for common uses. 4GW (16GB) of memory is even enough to hold a feature-length movie, something we won't be doing with this class of processor. The double-precision intermediate results however could be 64-bit, as when you take a 32-bit number and multiply it by another 32-bit number (getting a 64-bit number in a register) and divide the result by yet another 32-bit number to get a 32-bit result and possibly a 32-bit remainder.
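
The multiply-then-divide with a double-width intermediate described above (similar in spirit to Forth's `*/MOD`) might look like this in Python. This is a sketch under assumptions: it treats all values as unsigned, the name `mul_div_mod` is made up for the example, and since Python integers never overflow, the 32-bit and 64-bit limits are checked explicitly.

```python
MASK32 = 0xFFFFFFFF

def mul_div_mod(a, b, c):
    """Return (a * b // c, a * b % c) using a 64-bit intermediate product.

    a, b and c are unsigned 32-bit values; the quotient must fit in 32 bits.
    """
    for value in (a, b, c):
        if not 0 <= value <= MASK32:
            raise ValueError("operands must be unsigned 32-bit values")
    if c == 0:
        raise ZeroDivisionError("divisor is zero")
    product = a * b                 # at most (2**32 - 1)**2, which fits in 64 bits
    quotient, remainder = divmod(product, c)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder

# Scale a 32-bit value by 355/113 (an approximation of pi) without losing
# the bits that a 32-bit intermediate product would drop.
print(mul_div_mod(1_000_000_000, 355, 113))
```

The point of the wide intermediate is that `a * b` here is about 3.55e11, far beyond 32 bits, yet the final quotient fits comfortably in a 32-bit cell.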

Off to bed (again).

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Mon Feb 24, 2014 8:05 pm

Joined: Mon Feb 24, 2014 7:28 am
Posts: 2
Incrementing memory does not pipeline; you have to crack that instruction into a separate load, add, and store. Doing this can cause a seven-cycle bubble or worse, and means you have to track two instructions in case of an interrupt, etc. Two separate instructions are faster and simpler to deal with on a pipelined processor.
Indirection off the stack has the same problem but worse, two dependent memory reads.
Indirection through the Direct page is not indirection if the direct page is registers.
The 6502 becomes almost a RISC chip.


Posted: Tue Feb 25, 2014 3:32 am

Joined: Sun Jul 28, 2013 12:59 am
Posts: 235
GARTHWILSON wrote:
If you do a fast Fourier transform (FFT) for example with 16-bit cells, you're limited to about 6 bits of input data precision on a 2,048-point complex FFT before you start having overflow problems. (An FFT lets you for example analyze noise and vibration, whether it's acoustical, machinery, etc.) The sizes of D/A and A/D converters and other things come into the picture, and 32 is definitely more than enough for the registers, for common uses.


Somewhat off-topic, and something that I might be able to figure out myself with a bit of work, but if I was entirely uninterested in the phase aspect of a complex FFT (only interested in the... I want to say "magnitude", but I'm sufficiently inexperienced that I'm not at all sure that that's the right term), how much input data precision would I get with 16-bit cells?


Posted: Tue Feb 25, 2014 4:04 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
ggtgp wrote:
Incrementing memory does not pipeline; you have to crack that instruction into a separate load, add, and store. Doing this can cause a seven-cycle bubble or worse, and means you have to track two instructions in case of an interrupt, etc. Two separate instructions are faster and simpler to deal with on a pipelined processor.
Indirection off the stack has the same problem but worse, two dependent memory reads.
Indirection through the Direct page is not indirection if the direct page is registers.
The 6502 becomes almost a RISC chip.


I think that I understand, but I would think that read, modify, and write would almost have to be three separate instructions, right? In the interest of code density, wouldn't it be possible for a non-simplistic core like the one you propose to expand the RMW into three internal operations? I'm not sure, but I think that I read somewhere that this technique is used with some success on newer x86 cores ... they are hauling @$$ on old 'legacy' CISC code (well, I guess that most x86 code has a certain 'legacy' ingredient).

Your concepts are at the ragged edge of my limited understanding, but I'm very interested in learning from you, and I'm sure that I'm not alone in that sentiment. I avoided most of those issues when working on my design, with the realization that the 65m32's performance would wind up tightly bound to memory access performance, which has kept it simple, just like its inspiration.

Of course, the 65m32 has a form of 'zero-page', from word addresses $ffff0001 to $0000ffff, and that might be small enough to keep on-chip (very fast access) with the core, but that's pure conjecture on my part. I know very little about the subjects of pipe-lining and caching, and I'm starting to feel like an old dog trying to learn a new trick!

Thanks,

Mike

P.S. The thought occurred to me that my one-word memory access instructions could be cached, with 128KW per index-capable register, for a total of one Mega-Word (assuming no overlap). The 'magic-value' extended-address instructions would be penalized, but that would be fine with me. Is 1MW still too big to fit on a single piece of silicon, along with a few thousand transistors to implement the core?


Posted: Tue Feb 25, 2014 5:16 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
nyef wrote:
GARTHWILSON wrote:
If you do a fast Fourier transform (FFT) for example with 16-bit cells, you're limited to about 6 bits of input data precision on a 2,048-point complex FFT before you start having overflow problems. (An FFT lets you for example analyze noise and vibration, whether it's acoustical, machinery, etc.) The sizes of D/A and A/D converters and other things come into the picture, and 32 is definitely more than enough for the registers, for common uses.

Somewhat off-topic, and something that I might be able to figure out myself with a bit of work, but if I was entirely uninterested in the phase aspect of a complex FFT (only interested in the... I want to say "magnitude", but I'm sufficiently inexperienced that I'm not at all sure that that's the right term), how much input data precision would I get with 16-bit cells?

That's at the outer edge of my math abilities. I figured out and derived the discrete Fourier transform myself but never really understood the Cooley–Tukey fast algorithm with its butterflies. For that, I adapted a Forth program someone else had written, without fully understanding it. Wikipedia says, "it is possible to express an even-length real-input DFT as a complex DFT of half the length (whose real and imaginary parts are the even/odd elements of the original real data), followed by O(N) post-processing operations." I have not tried it. The DFT I figured out takes in only real data; but both methods produce complex (rectangular) outputs for each frequency, meaning you would still have to convert to magnitude, even if you discard the phase info. I don't think there's any way to detect how much energy is in each bin without using the sin & cos and getting a complex output, unless you knew the phase of each frequency ahead of time and were only looking for the amplitude (but in that case you probably wouldn't need Mr. Fourier).
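
As a rough illustration of the real-input DFT and the conversion from rectangular outputs to magnitude, here is a minimal Python sketch. It is a plain O(N²) DFT in floating-point, not the Cooley–Tukey algorithm and not the 16-bit scaled-integer Forth code mentioned in this thread; the function names are made up for the example.

```python
import cmath
import math

def dft_real(samples):
    """Naive DFT of real input; returns one complex (rectangular) value per bin."""
    n = len(samples)
    bins = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            angle = -2.0 * math.pi * k * t / n
            acc += x * cmath.exp(1j * angle)   # x * (cos(angle) + j*sin(angle))
        bins.append(acc)
    return bins

def magnitudes(bins):
    """Discard phase: magnitude is sqrt(re**2 + im**2) for each bin."""
    return [abs(c) for c in bins]

# One full cycle of a cosine across 8 samples: the energy shows up in
# bin 1 and in its mirror image, bin 7.
samples = [math.cos(2.0 * math.pi * t / 8) for t in range(8)]
mags = magnitudes(dft_real(samples))
print([round(m, 6) for m in mags])
```

Even when only the magnitude is wanted, both the cosine and sine sums have to be carried through, which matches the point made above.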

Doing it in Forth on a 65c02 at 5MHz in 16-bit scaled-integer takes about five seconds for a 2,048-point FFT before re-ordering the outputs. Doing an FFT of only half that many points (ie, 1,024) in floating-point in GWBASIC on the original IBM PC took about nine minutes IIRC.



Posted: Tue Feb 25, 2014 5:27 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
GARTHWILSON wrote:
... The DFT I figured out takes in only real data; but both methods produce complex (rectangular) outputs for each frequency, meaning you would still have to convert to magnitude, even if you discard the phase info.


So, am I correct in assuming that nyef's question could be boiled down to

"How many bits of precision can be relied upon for c in a² + b² = c² if a and b are 16-bits each?"

Is the answer 16, or maybe 16/√2? I'm not sure ... it's been about 25 years since I did any error propagation analyses ...
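
This does not settle the error-propagation question, but the bit widths involved in the magnitude step can at least be checked with integer arithmetic. The sketch below assumes signed 16-bit real and imaginary parts; `magnitude_int` is a hypothetical helper name.

```python
import math

def magnitude_int(re, im):
    """Integer magnitude: floor(sqrt(re**2 + im**2))."""
    return math.isqrt(re * re + im * im)

# Most extreme signed 16-bit inputs.
re = im = -32768
sum_of_squares = re * re + im * im
c = magnitude_int(re, im)
print(sum_of_squares.bit_length(), "bits for a^2 + b^2")
print(c, "needs", c.bit_length(), "bits")
```

So the sum of squares needs a 32-bit unsigned intermediate, and the magnitude itself can need 16 unsigned bits, one more than a signed 16-bit cell offers for positive values.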

Mike

t
h
r
e
a
d
drift .... ;-)


Posted: Tue Feb 25, 2014 6:35 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
What limits the precision is the overflow errors that can result from the summation. (I tried to construct equation images to post from codecogs.com but couldn't figure out how to make it show here. It's really for putting in your html.) 2,048 is 2^11, and 6 more bits makes 17; but assuming there are lots of frequencies in the input and there's a normal crest factor, you won't be overflowing things with a 6-bit input but you will with 8 and possibly even with 7. Our resident algorithms expert Bruce Clark can probably figure it out. I have too much work to do right now to take the time to confirm that I can't figure it out. :lol:
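
The bit-counting above can be written out as a simple worst-case bound. This is only a sketch: it counts the growth from adding up N values of a given width and ignores the details of complex butterflies, sign bits, and real-world crest factor. The function names are made up for the example.

```python
def growth_bits(n_points):
    """Extra bits needed to sum n_points values without overflow: ceil(log2(n_points))."""
    return (n_points - 1).bit_length()

def worst_case_bits(input_bits, n_points):
    """Width the accumulated sum can reach if every input is at full scale."""
    return input_bits + growth_bits(n_points)

def max_guaranteed_input_bits(cell_bits, n_points):
    """Largest input width that cannot overflow the cell even in the worst case."""
    return cell_bits - growth_bits(n_points)

print(growth_bits(2048))                    # 2,048 = 2^11
print(worst_case_bits(6, 2048))             # 6-bit inputs, worst case
print(max_guaranteed_input_bits(16, 2048))  # strictly safe width for 16-bit cells
```

By this bound, 6-bit inputs could in principle reach 17 bits, which is why the argument above leans on typical signals rather than the worst case.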



 Post subject: FFT overflow trouble
Posted: Tue Feb 25, 2014 6:43 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I'm no expert, but this link suggests using a scale factor to avoid overflow and get a good tradeoff for precision:
http://www.mathworks.co.uk/products/dem ... mo.html#10
Cheers
Ed


 Post subject: Re: FFT overflow trouble
Posted: Tue Feb 25, 2014 7:27 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
BigEd wrote:
I'm no expert, but this link suggests using a scale factor to avoid overflow and get a good tradeoff for precision:
http://www.mathworks.co.uk/products/dem ... mo.html#10

Good link (although there's a lot to read on that page and I didn't read it all). I suspect that it's not much of a tradeoff in the sense of compromise for output precision, since what you lose in input precision of individual samples is offset by the greater number of input samples, and that with multiple frequencies involved, there's effectively a dithering going on. I'd have to experiment.

BTW, a similar thing can happen with floating-point too, depending on the precision used. Exponent adjustment prevents overflow; but for a given number of bits of mantissa, if you add enough input numbers together, you will reach intermediate results high enough that many of the following input numbers will be too small to have any effect anymore, even though their full precision can be represented while the number is normalized. So in a system with a mantissa of 23 bits plus sign, adding .001 to 9,000 will have no effect. The answer will be 9,000. In that case, 32-bit scaled-integer has a big advantage in precision (not to mention performance, if there's no floating-point coprocessor).



 Post subject: Re: FFT overflow trouble
Posted: Thu Mar 06, 2014 8:42 am

Joined: Wed Oct 22, 2003 4:07 am
Posts: 51
Location: Norway
GARTHWILSON wrote:
BTW, a similar thing can happen with floating-point too, depending on the precision used. Exponent adjustment prevents overflow; but for a given number of bits of mantissa, if you add enough input numbers together, you will reach intermediate results high enough that many of the following input numbers will be too small to have any effect anymore, even though their full precision can be represented while the number is normalized. So in a system with a mantissa of 23 bits plus sign, adding .001 to 9,000 will have no effect. The answer will be 9,000. In that case, 32-bit scaled-integer has a big advantage in precision (not to mention performance, if there's no floating-point coprocessor).

Maybe a bit off topic, but adding .001 to 9000 will actually have an effect. Floating point numbers are almost always normalized, giving 1 extra bit of precision, so the result would be 9000.0009765625, or 9000 + 1/1024

Even if you used an unnormalized format, the addition would have an effect: with standard rounding the result would be 9000.001953125, since that is closer to the true answer than 9000 is.

Even in articles that are supposed to explain floating point numbers the author usually forgets about this extra bit of precision from normalization, even after having mentioned it earlier in their text!


Posted: Thu Mar 06, 2014 9:03 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
Yes, I forgot about the times they imply a high "1" bit that is not expressed. (I always wondered how they represent a 0 when they do that.) To do an addition though, the smaller number has to be right-shifted, ie, non-normalized, to line up bits properly for the addition; and to line them up, the smaller number has to be shifted so far to the right that it goes all the way to the bit bucket. So if there's an implied high bit, I should have said 18,000, not 9,000, and then you have a situation where adding the .001 will have no effect, no matter how many times you do it, since there aren't enough bits to resolve the difference between 18,000.000 and 18,000.001.

For my first commercial 6502 project, I wrote the "4-banger" floating-point routines for 7 decimal digits in assembly. That was before I was introduced to the possibilities and the greater efficiency of scaled-integer which I now use for everything except of course my calculator.



Posted: Thu Mar 06, 2014 9:21 am

Joined: Wed Oct 22, 2003 4:07 am
Posts: 51
Location: Norway
GARTHWILSON wrote:
Yes, I forgot about the times they imply a high "1" bit that is not expressed. To do an addition though, the smaller number has to be right-shifted, ie, non-normalized, to line up bits properly for the addition; and to line them up, the smaller number has to be shifted all the way to the bit bucket. So if there's an implied high bit, I should have said 18,000, not 9,000, and then you have a situation where adding the .001 will have no effect, no matter how many times you do it.

But if you follow the IEEE standard, then you are not supposed to throw away any bits before the addition; you should use as many bits as needed to get accurate rounding, and then do the rounding. So adding 18000 and .001 would give you 18000.001953125 after rounding. If you added .001 to that number one more time, it would round up again by the same step, because .001 is slightly more than half the spacing between adjacent values in that range (1/512).

Adding .001 to 36000 would however have no effect at all.
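
The single-addition figures quoted in this exchange can be checked with exact rational arithmetic. The sketch below is not a full IEEE 754 model (it has no exponent limits, signs, or special values); it only rounds positive values to a given number of significant bits with round-to-nearest, ties-to-even. The function names are made up for the example; 24 bits corresponds to a 23-bit stored mantissa plus the implied leading 1, and 23 bits to the same mantissa without it.

```python
from fractions import Fraction

def floor_log2(x):
    """Return e such that 2**e <= x < 2**(e + 1), for a positive Fraction x."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    return e

def round_to_bits(x, bits):
    """Round a positive value to `bits` significant binary digits (ties to even)."""
    x = Fraction(x)
    step = Fraction(2) ** (floor_log2(x) - bits + 1)  # spacing of representable values near x
    return round(x / step) * step                     # round() on a Fraction rounds ties to even

def add_rounded(a, b, bits):
    """Add two values the way a binary FPU with `bits` of significand would."""
    a = round_to_bits(a, bits)
    b = round_to_bits(b, bits)
    return round_to_bits(a + b, bits)   # exact sum first, then a single rounding

small = Fraction(1, 1000)
for big, bits in ((9000, 24), (9000, 23), (18000, 24), (36000, 24)):
    result = add_rounded(big, small, bits)
    print(f"{big} + .001 with {bits} significant bits -> {float(result)!r}")
```

Changing `bits` or the operands makes it easy to find where a small addend stops having any effect for a given format.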

Really sorry about the nitpicking, but there is so much inaccurate information about floating point numbers that I have problems holding my tongue even when the inaccuracy doesn't really matter at all.


Posted: Thu Mar 06, 2014 9:36 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8543
Location: Southern California
I don't mind the nitpicking. You also brought more attention to the binary-to-decimal conversion which I did not do in the floating-point routines I did, since in that case I did them directly in decimal, something I also don't do anymore. (That was approximately 1987.) In fact, I don't have any real interest in implementing the decimal mode in an all-32-bit 65816.

My point is that a 32-bit integer has more precision than a 32-bit floating-point number which has to devote some of those 32 bits to the exponent. In floating-point, the scale factor (the exponent) is handled at run time, whereas in scaled-integer, it is handled by the programmer. It's a little more work to program, but relieves the computer of some overhead so it can perform better especially if there's no hardware floating-point unit.

Edit: I should also add that in scaled-integer, the scale factor is not limited to powers of 2 (or 10 or anything else) like it is in floating-point. (It's not just fixed-point, as fixed-point is a special, limited case of the broader scaled-integer.) The scale factor can be anything, and that may further increase the precision advantage of scaled-integer by a small amount, depending on the required range. In trig functions for example, the 360° circle is typically represented by however many bits you have in a standard cell, automatically accommodating negative numbers and preventing overflows. With 16 bits, each increment is .00549316° or 19.7754 arc-seconds, or .000095874 radians. For 32 bits, it would be 1.4629180793 nanoradians or 83.819031715 nanodegrees. Conversion to human output (if it is needed) is not done until all the calculations are done.
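
A small Python sketch of the 16-bit angle scaling described above, where the full circle maps onto the 65,536 values of a cell. The function names are hypothetical, and a real implementation would do this with 16-bit integer operations rather than Python floats; the sketch only shows the scale factor and the free wrap-around.

```python
CELL_BITS = 16
FULL_CIRCLE = 1 << CELL_BITS        # 65,536 counts per 360 degrees
MASK = FULL_CIRCLE - 1

def degrees_to_cell(deg):
    """Scale an angle in degrees to a 16-bit cell; wrap-around is automatic."""
    return round(deg * FULL_CIRCLE / 360) & MASK

def cell_to_degrees(cell):
    """Convert back for human output only, after the calculations are done."""
    return cell * 360 / FULL_CIRCLE

def add_angles(a, b):
    """Add two scaled angles; masking to 16 bits is the modulo-360 step."""
    return (a + b) & MASK

print(hex(degrees_to_cell(90)), hex(degrees_to_cell(-90)))
print(cell_to_degrees(add_angles(degrees_to_cell(270), degrees_to_cell(180))))
print(cell_to_degrees(1), "degrees per count")
```

Negative angles and sums past a full turn need no special handling: -90° and 270° land on the same cell value, and 270° + 180° comes back as 90°.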


