Floating point trouble
Lee,
I discovered this yesterday on v2.09. I tried it on my SBC-2, SBC-3, and Simulator. I also downloaded v2.10 and tried it on my simulator and the 6502 simulator.
If you enter PRINT 41^3<enter>, you get 68921.1 instead of 68921.
There are other numbers that give this error.
Try FOR X=40 TO 50: PRINT X^3:NEXT
I also found it on some X^4 values.
I have not dug into this at all, or tried more number ranges.
Any thoughts?
Daryl
Note that PRINT 41^3 is not the same as PRINT 41*41*41. EhBASIC calculates X^Y using the formula EXP(Y*LOG(X)). Since LOG is base e (~2.718), for integer X, mathematically, Ln X will be irrational, so LOG(X) (as calculated by EhBASIC) will be an approximation, not an exact value, and there are the usual FP caveats (e.g. round off error) that go along with that.
So if L=LOG(X), Y*L is also an approximation, and if P=Y*L, then EXP(P) is an approximation as well. So X^Y will be an approximation, that may or may not be identical to the mathematically exact value after rounding. You can check each step of the calculation, i.e. is PRINT LOG(41) correct to however many digits it displays, is 3 * the previous result correct to the appropriate number of decimal places, etc. If so, the FP routines are probably ok, you're just running into the limitations of FP.
It's possible to modify the X^Y routine to handle integer Y specially. It's not a major change, though it's not a one- or two-liner either. The EXP(Y*LOG(X)) formula will still be necessary for things like PRINT 3^2.1
(For the same reason, 9^.5 is not the same as SQR(9), even though .5 can be represented exactly in EhBASIC. The SQR routine in EhBASIC will generate an exact value in that case. It all depends on how many special cases you want X^Y to handle.)
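The effect dclxvi describes is easy to reproduce outside EhBASIC. Below is a minimal Python sketch that mimics a 4-byte-float BASIC by rounding every intermediate result to IEEE-754 single precision (the `f32` and `power_via_exp_log` names are illustrative, not EhBASIC routines, and EhBASIC's own excess-128 format differs slightly from IEEE):

```python
import math
import struct

def f32(x):
    # round a double to the nearest IEEE-754 single-precision value,
    # standing in for a BASIC's 4-byte floats
    return struct.unpack('f', struct.pack('f', x))[0]

def power_via_exp_log(x, y):
    # X^Y computed as EXP(Y*LOG(X)), rounding every intermediate step
    l = f32(math.log(f32(x)))   # LOG(X): approximation of an irrational value
    p = f32(f32(y) * l)         # Y*LOG(X): rounds again
    return f32(math.exp(p))     # EXP(...): rounds once more

print(power_via_exp_log(41.0, 3.0))  # close to, but generally not exactly, 68921
print(41.0 * 41.0 * 41.0)            # 68921.0: every product fits in 24 bits
```

Each rounding step contributes error, and EXP amplifies any error in its argument in proportion to the size of the result, which is why cubes of two-digit numbers already show a visible discrepancy in the last printed digit.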
leeeeee (In Memoriam):
Yes, I do know about this. It is, as dclxvi says, a feature of the way some functions are calculated; for some values with some functions it gives a result accurate to as few as 14.5 bits.
I did try rewriting a lot of the math functions to use CORDIC methods, but the size of the code with tables was bigger, it was as slow as or slower than the current routines, and there wasn't a huge gain in accuracy.
I would stick to the basic math operators as much as you can; this will give more accurate results and may even be quicker. I think up to the 4th or 5th power it is faster doing N*N*N... than using N^x.
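Lee's point about N*N*N... is easy to check: as long as every intermediate product fits in the float's 24-bit mantissa, repeated multiplication is exact, while the EXP/LOG route is not. A sketch in Python, rounding each product to single precision to mimic 4-byte floats (`ipow` is an illustrative name):

```python
import struct

def f32(x):
    # round to IEEE-754 single precision (4-byte float)
    return struct.unpack('f', struct.pack('f', x))[0]

def ipow(x, n):
    # N*N*N...: repeated multiplication, rounding each product to 32 bits
    r = f32(x)
    for _ in range(n - 1):
        r = f32(r * x)
    return r

print(ipow(41.0, 3))  # 68921.0 exactly: 1681 and 68921 both fit in 24 bits
```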
One routine that I did make shorter, faster, and more accurate is SQR(), which is good for 23+ bits of accuracy and, if it is out, always returns a result that is less than the true result by less than 1 bit. I could have made it good for 23.5 bits with a +/-0.5 bit maximum error, but as it stands it is shorter and faster, and I prefer to know that any error will always give a result smaller than the true answer.
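The contrast between a dedicated square-root routine and the generic power mechanism can be sketched in Python (this is not Lee's algorithm, just the two contrasting approaches, with intermediates rounded to single precision):

```python
import math
import struct

def f32(x):
    # round to IEEE-754 single precision (4-byte float)
    return struct.unpack('f', struct.pack('f', x))[0]

def sqr_via_exp_log(x):
    # SQR(X) as X^.5 = EXP(.5*LOG(X)), the generic power-operator route
    return f32(math.exp(f32(0.5 * f32(math.log(f32(x))))))

a = 7.0
print(sqr_via_exp_log(a * a))   # near 7, but can be off by a few bits
print(f32(math.sqrt(a * a)))    # 7.0: a dedicated square root is exact here
```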
EhBASIC 68K is a little better as I wrote that from scratch and used CORDIC methods but it still returns some results with as little as 18 bits of accuracy.
It's not a bug, it's an undocumented feature.
Lee.
Purely for interest:
I tracked down a uk101 emulator, and the BASIC there gives the same result (it's a 1977 Microsoft BASIC, with four-byte floats).
(Looks like BBC BASIC and Spectrum BASIC both use 5 byte floats, hopefully with transcendental algorithms to match.)
There's some speculation about Microsoft's effort here:
http://www.amigau.com/68K/dg/dg25.htm
(and an accuracy benchmark too)
Also there's some material on a double precision library here:
http://www.amigau.com/68K/dg/dg16.htm
- interesting to me but probably old news to Lee.
leeeeee:
Yes I remember that but I had forgotten just how bad some of EhBASIC's functions are.
Running the accuracy benchmark with each function pair gives the following result ..
    TAN(ATN(a)) / a - 1    6.17789E-5
    EXP(LOG(a)) / a - 1    2.81708E-7
    (a * a)^.5  / a - 1    2.22382E-7   (how the Microsoft code does SQR())
    SQR(a * a)  / a - 1    0            (my SQR() routine)

Working out the numbers into bits of accuracy gives about 14 bits for TAN(ATN(a)), which is pretty poor, and about 21 bits for EXP(LOG(a)), which is better than I remembered.

x^y is good for about 22 bits but only if x:y is close to 1. I seem to remember that it gets worse by LOG2(x:y), so 41^3 will only be good to about 18 bits, which is what we see.

At least I improved the SQR() function by all of 2 bits. 8^)=

Lee.
There's a nice article reversing the various versions of Microsoft's BASIC here: http://www.pagetable.com/?p=46
Lots of other interesting stuff there too - including sources for 7 versions, both 4byte and 5byte floats.
I haven't (quite) got an environment in which to run EhBASIC - I'm nearly there.
But for interest I ran the TAN(ATN(A)) benchmark on 3 environments in emulation:

    Z88 BBC BASIC V3: 3.84825359E-7
    C64 BASIC V2:     2.38180225E-07
    BBC 1982 BASIC:   2.08041984E-7

(I see RTRussell's re-implementation of BBC BASIC differs from the 6502 original.)

These all have 5-byte floats, according to

    PRINT 2^33+11-2^33
    12

The benchmark was

    10 I=2500
    20 FORA=1TOI
    30 B= TAN(ATN(A))/A -1
    40 Z=Z+B*B
    50 NEXT
    60 PRINT "R.M.S. error = "; SQR(Z/I)
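For comparison, the same benchmark can be ported to Python with every TAN/ATN result rounded to 4-byte floats. This is only a sketch of the rounding effect; the numbers won't match any particular BASIC's internal algorithms:

```python
import math
import struct

def f32(x):
    # round to IEEE-754 single precision, standing in for 4-byte BASIC floats
    return struct.unpack('f', struct.pack('f', x))[0]

n = 2500
z = 0.0
for a in range(1, n + 1):
    b = f32(math.tan(f32(math.atan(float(a))))) / a - 1.0
    z += b * b
rms = math.sqrt(z / n)
print("R.M.S. error =", rms)
print("~bits of accuracy =", -math.log2(rms))
```

TAN amplifies any rounding in ATN's result by roughly 1+A^2, so the relative error grows with A; converting the RMS figure with -LOG2 gives the "bits of accuracy" measure Lee quotes.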
BigDumbDinosaur:
I wonder if EhBASIC's math would be better served by using packed BCD. It wouldn't be as fast as excess-128 FP (which is what appears to be implemented in EhBASIC), but it would be dead accurate. The absolute range isn't as great as FP, but unless you're an astronomer or Bill Gates calculating his net worth, the range probably wouldn't be critical.
Actually, floating point errors are a serious problem even for more mundane applications. For starters, floating point addition is not associative, and adding a very small number to a very large number yields round-off error up to and including the entire magnitude of the smaller number! Consider adding 3.14159 to 10e+30: the result will be 10e+30 in most floating point implementations. Ouch!
This is a rather large problem because of some equations which have terms which don't get simplified for some reason (usually programmer oversight, but other reasons too). E.g., a+b-a will be horrifyingly in error if a=10e30 and b=pi.
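The absorption effect is easy to demonstrate, even with 8-byte doubles:

```python
big = 1.0e30
small = 3.14159

# the gap between adjacent doubles near 1e30 is about 1.4e14,
# so the small addend vanishes entirely
print(big + small == big)    # True
print((big + small) - big)   # 0.0, not 3.14159
```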
Fixed-point math is the only general purpose solution to this problem, and that requires a fair bit of memory to express some of these larger figures. You can sometimes work around this by storing numbers logarithmically, or choosing more appropriate units of measurement (for a practical example, see http://www.colorforth.com/simulation.html . For Chuck Moore's philosophical basis for this, see http://www.colorforth.com/binding.html)
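As a toy illustration of the fixed-point/units idea, choosing cents instead of dollars as the unit of measurement turns an inexact binary-fraction problem into exact integer arithmetic:

```python
# binary floats cannot represent 0.1 or 0.2 exactly
print(0.1 + 0.2 == 0.3)   # False

# fixed point: keep amounts as integer cents (scale factor 100)
cents = 10 + 20           # 0.10 + 0.20 dollars
print(cents == 30)        # True, exactly
print(cents / 100)        # convert back to dollars for display only
```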
To express figures such as 1/3 precisely, you need to implement rational numbers, where each "number" is internally stored as a numerator/denominator pair. Each component of the pair could be an integer, fixed-, or floating-point representation as appropriate for your needs, though most use integer. The PC/GEOS operating system, for example, utilized rationals extensively in its GUI, on an 8088 no less, to implement scaling, skewing, rotations, and other kinds of transforms.
Most impressive.
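Python's `fractions` module implements exactly this numerator/denominator representation, so the idea is easy to try. The little transform below is only in the spirit of what PC/GEOS did; the names and values are illustrative, not GEOS APIs:

```python
from fractions import Fraction

third = Fraction(1, 3)
print(third + third + third)   # 1: exact, no accumulated error

# a scale-and-skew step with exact rational coefficients
scale = Fraction(3, 2)
skew = Fraction(1, 3)
x, y = Fraction(4), Fraction(6)
print(scale * x + skew * y)    # 8: (3/2)*4 + (1/3)*6, exactly
```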
BigDumbDinosaur:
paulrsm wrote:
BCD does not eliminate floating point errors; it merely changes them to different ones. Both binary and BCD can represent 1/2 exactly, but neither can represent 1/3 exactly.
More to the point are operations whose results cannot be represented in a finite number of digits, such as 1 divided by 3 (0.333333...). The operands themselves may have made it through ASCII-to-internal-format conversion intact, but when subjected to division or various transcendental operations they produce a non-terminating result. That limitation applies to any computing device, regardless of the internal number format in use.
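Python's `decimal` module behaves like the packed-BCD arithmetic discussed above, and shows both halves of the point at once:

```python
from decimal import Decimal, getcontext

getcontext().prec = 10   # work to 10 significant decimal digits

# 1/2 is exact in both binary FP and decimal (BCD-style) arithmetic
print(Decimal(1) / Decimal(2))   # 0.5

# 1/3 does not terminate in either base, so BCD must round it too
third = Decimal(1) / Decimal(3)
print(third)                     # 0.3333333333
print(third * 3)                 # 0.9999999999, not 1
```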
x86? We ain't got no x86. We don't NEED no stinking x86!