BBC floating point formats

Posted: Tue Sep 24, 2024 4:30 pm
by teamtempest
So I'm aware of this page, previously pointed out by Ed: https://beebwiki.mdfs.net/Floating_Point_number_format

But it's not completely clear to me. This portion:
  • So, for example on Acorn BASICs:

    4 is exponent &83, sign 0, mantissa &80000000, stored as &83,&00000000
    -8 is exponent &84, sign 1, mantissa &80000000, stored as &84,&00000000
    12 is exponent &84, sign 0, mantissa &C0000000, stored as &84,&40000000
    -0.5 is exponent &80, sign 1, mantissa &80000000, stored as &80,&80000000

    Zero is a special case and is stored as five zero bytes. Some versions of BBC BASIC extend this and use a zero exponent to indicate that the real actually holds an integer value. For example:

    &00, &00000000 is 0
    &00, &00000080 is 128 (where implemented)
    &00, &FFFFFFFE is -2 (where implemented)
Seems to be saying that Acorn BASICs store floating point numbers like Microsoft's 40-bit floating point format used in Commodore and Apple II computers (namely an excess-128 exponent at the lowest memory address, followed by four byte signed mantissa with its most significant byte first).

It also appears to say that some versions have representations that allow the same integer to be represented in floating point in two different ways (which if true must lead to some interesting implementation details in the manipulation routines).

But it also says this:
  • 6502

    6502 BASIC stores reals in memory high-to-low as:

    address+0 address+4
    mantissa.hi, mantissa.middle, mantissa.middle, mantissa.lo, exponent

    The only 32-bit integer that 6502 BASIC allows to be stored in a real is zero. 6502 BASIC uses excess-&80 for the exponent, so &80 represents an exponent of zero.
    Non-6502 BASICs

    All other versions of BBC BASIC store reals in memory low-to-high as:

    address+0 address+4
    mantissa.lo, mantissa.middle, mantissa.middle, mantissa.hi, exponent
Which says that no BBC BASIC uses the Microsoft 40-bit floating point format at all, although they are both (or all) 40-bit formats.

Also that the 6502 BASICs represent zero not with an exponent of zero (again what Microsoft BASICs do) but with an exponent of $80 (which I suppose can work if zero is always checked for before doing any work on a floating point number - it's rather difficult to normalize a number which has no one bits, so that would have to come first).

My questions basically come down to two things: first, what is the real memory layout of floating point numbers in BBC BASICs, and second, are there any particular names for these layouts?

Re: BBC floating point formats

Posted: Thu Sep 26, 2024 5:28 pm
by jgharston
teamtempest wrote:
So I'm aware of this page, previously pointed out by Ed: https://beebwiki.mdfs.net/Floating_Point_number_format
What is it that you don't understand? I wrote that page, mostly based on (what I thought was) the very clear explanation in the ZX Spectrum Basic manual. It's bog standard 5-byte floating point. Just... normal. Nothing special. Used in loads and loads of implementations.

exponent != 0 -> value is float: float = mantissa * 2^(exponent-bias)
exponent = 0 -> value is integer: integer = mantissa, with 6502 only supporting mantissa=0
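
A minimal Python sketch of that two-case rule, assuming the exponent byte and a 32-bit stored mantissa as written in the Beebwiki examples (sign in the top bit, implied leading 1, excess-&80 exponent, mantissa read as a fraction between 0.5 and 1). The function name and the `integer_when_exp_zero` switch are made up for illustration and are not taken from any BASIC source:

```python
def decode_real(exp_byte, stored, integer_when_exp_zero=False):
    """Decode an exponent byte plus a 32-bit stored mantissa into a Python float."""
    if exp_byte == 0:
        if not integer_when_exp_zero:
            return 0.0  # convention where a zero exponent always means zero
        # Convention where a zero exponent flags a 32-bit two's-complement integer
        return float(stored - (1 << 32) if stored & 0x80000000 else stored)
    sign = -1.0 if stored & 0x80000000 else 1.0
    # Put back the leading 1 that the sign bit replaced, then scale to [0.5, 1)
    fraction = (stored | 0x80000000) / 2**32
    return sign * fraction * 2.0 ** (exp_byte - 0x80)


print(decode_real(0x84, 0x40000000))                                # 12
print(decode_real(0x00, 0xFFFFFFFE, integer_when_exp_zero=True))   # -2
```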

Re: BBC floating point formats

Posted: Fri Sep 27, 2024 4:13 pm
by teamtempest
A reply I wasn't expecting, but welcome nonetheless.

I wouldn't say "bog-standard". Prior to the adoption of IEEE-754, I'm not aware of any "standard" floating point formats. Commonly used formats, perhaps, but not standard ones.

There appear to be three different formats displayed in the bits quoted above:

Code: Select all

   Addr+0   Addr+1  Addr+2  Addr+3  Addr+4

   Exp      Man(M)  Man     Man     Man(L)         Microsoft BASICs

   Man(M)   Man     Man     Man(L)  Exp            BBC 6502 BASICs

   Man(L)   Man     Man     Man(M)  Exp            BBC non-6502 BASICs
 
...where Man(M) stands for Most Significant Byte of the mantissa and Man(L) for Least Significant Byte.
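
To make the three byte orders concrete, here is a small Python sketch that lays out one value in each of them. The dictionary keys and the function name are only labels for this illustration, and the "BBC 6502" row follows what the Beebwiki page says rather than anything measured:

```python
def layouts(exp_byte, stored_mantissa):
    """Return the five bytes of one value in each of the three layouts from the table."""
    msb_first = stored_mantissa.to_bytes(4, "big")
    exp = bytes([exp_byte])
    return {
        "Microsoft": exp + msb_first,           # Exp, Man(M) .. Man(L)
        "BBC 6502": msb_first + exp,            # Man(M) .. Man(L), Exp
        "BBC non-6502": msb_first[::-1] + exp,  # Man(L) .. Man(M), Exp
    }


# 12 is exponent &84 with stored mantissa &40000000
for name, raw in layouts(0x84, 0x40000000).items():
    print(name, raw.hex())
```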

In the above section, when written as
Quote:
4 is exponent &83, sign 0, mantissa &80000000, stored as &83,&00000000
this appears to match the Microsoft BASIC format. Although that can't actually be, as the "mantissas" differ from "stored" in three out of four cases. In Microsoft BASICs:
  • 1000 83 00 00 00 .float 4, -8, 12, -0.5
    1004 00
    1005 84 80 00 00
    1009 00
    100A 84 40 00 00
    100E 00
    100F 80 80 00 00
    1013 00
...which, as they appear on the page mentioned, agrees with "stored" in three out of four cases, "-8" being the exception. "Mantissa" agrees in two out of four cases, as "4" and "12" are exceptions.

Regarding the point about two different formats for the same number, again in Microsoft BASICs:
  • 1014 00 00 00 00 .float 0, 128, -2
    1018 00
    1019 88 00 00 00
    101D 00
    101E 82 80 00 00
    1022 00
Zero certainly agrees, but "$88,00,00,00,00" and "$00,00,00,00,80" both appear to be legal representations of "128" (wherever the exponent appears - on the page it appears at the lowest address, but it doesn't really matter much) and "$82,80,00,00,00" and "$00,FF,FF,FF,FE" seem to be two legal representations of "-2". I have no real objections to that, just that it must be accounted for somewhere, bringing them to one common format before trying to manipulate them.

Unless there's some advantage to this that I can't immediately see?
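
As a sketch of what "bringing them to one common format" could look like, the following Python converts the zero-exponent integer form into the normalized float form by shifting the magnitude up until its top bit is set. This is only an illustration under the Beebwiki description of the format, not code from any BASIC, and the function name is invented:

```python
def integer_form_to_float_form(stored):
    """Convert a 32-bit two's-complement integer (the exponent-zero form)
    into (exponent byte, stored mantissa) in the normalized float form."""
    value = stored - (1 << 32) if stored & 0x80000000 else stored
    if value == 0:
        return 0x00, 0x00000000
    magnitude = abs(value)
    bits = magnitude.bit_length()
    mantissa = magnitude << (32 - bits)   # normalize: top bit is now 1
    exp_byte = 0x80 + bits                # excess-&80 exponent
    sign_bit = 0x80000000 if value < 0 else 0
    # The always-1 top mantissa bit is replaced by the sign bit when stored
    return exp_byte, (mantissa & 0x7FFFFFFF) | sign_bit


print([hex(x) for x in integer_form_to_float_form(0x00000080)])   # 128
print([hex(x) for x in integer_form_to_float_form(0xFFFFFFFE)])   # -2
```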

Re: BBC floating point formats

Posted: Sat Sep 28, 2024 1:00 am
by pjdennis
teamtempest wrote:
Quote:
4 is exponent &83, sign 0, mantissa &80000000, stored as &83,&00000000
this appears to match the Microsoft BASIC format. Although that can't actually be, as the "mantissas" differ from "stored" in three out of four cases. In Microsoft BASICs:
  • 1000 83 00 00 00 .float 4, -8, 12, -0.5
    1004 00
    1005 84 80 00 00
    1009 00
    100A 84 40 00 00
    100E 00
    100F 80 80 00 00
    1013 00
...which, as they appear on the page mentioned, agrees with "stored" in three out of four cases, "-8" being the exception. "Mantissa" agrees in two out of four cases, as "4" and "12" are exceptions.
The '-8' example on the beebwiki page was missing the '1' in the top bit of the stored mantissa value to indicate a negative value. After that correction, all the stored values from the examples on the wiki page match what you posted in your listing.

Code: Select all

   4 is exponent &83, sign 0, mantissa &80000000, stored as &83,&00000000
  -8 is exponent &84, sign 1, mantissa &80000000, stored as &84,&80000000
  12 is exponent &84, sign 0, mantissa &C0000000, stored as &84,&40000000
-0.5 is exponent &80, sign 1, mantissa &80000000, stored as &80,&80000000
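
The corrected table can be reproduced with a short Python sketch. It leans on `math.frexp`, which happens to return a fraction in the same 0.5 to 1 range this format uses. The function name is hypothetical, and truncating rather than rounding the mantissa is a simplification; a real BASIC may round:

```python
import math


def encode_real(value):
    """Return (exponent byte, 32-bit stored mantissa) for a Python float."""
    if value == 0:
        return 0x00, 0x00000000
    fraction, exponent = math.frexp(abs(value))   # 0.5 <= fraction < 1
    exp_byte = exponent + 0x80                    # excess-&80
    if not 1 <= exp_byte <= 0xFF:
        raise OverflowError("value does not fit a one-byte exponent")
    mantissa = int(fraction * 2**32)              # top bit is always 1 here
    stored = mantissa & 0x7FFFFFFF                # drop the implied top bit
    if value < 0:
        stored |= 0x80000000                      # sign bit takes its place
    return exp_byte, stored


for v in (4, -8, 12, -0.5):
    exp_byte, stored = encode_real(v)
    print(v, "stored as", f"&{exp_byte:02X},&{stored:08X}")
```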
teamtempest wrote:
[...] "$88,00,00,00,00" and "$00,00,00,00,80" both appear to be legal representations of "128" (wherever the exponent appears - on the page it appears at the lowest address, but it doesn't really matter much) and "$82,80,00,00,00" and "$00,FF,FF,FF,FE" seem to be two legal representations of "-2". I have no real objections to that, just that it must be accounted for somewhere, bringing them to one common format before trying to manipulate them.

Unless there's some advantage to this that I can't immediately see?
We can guess at the motivation for the alternate format: probably an optimization for working with small integer values, since operations would be much faster when all the values involved are already in that format.

Re: BBC floating point formats

Posted: Sun Sep 29, 2024 6:51 pm
by teamtempest
Quote:
4 is exponent &83, sign 0, mantissa &80000000, stored as &83,&00000000
-8 is exponent &84, sign 1, mantissa &80000000, stored as &84,&00000000
12 is exponent &84, sign 0, mantissa &C0000000, stored as &84,&40000000
-0.5 is exponent &80, sign 1, mantissa &80000000, stored as &80,&80000000
I see I've allowed myself to be led astray. Perhaps I might have caught on quicker if this had been phrased as something like this:

Code: Select all

 Value      Exponent     Sign     Normalized Mantissa      Stored with Sign Replacement As

 4            &83            0            &80000000                     &83, &00000000
-8            &84            1            &80000000                     &84, &80000000
12            &84            0            &C0000000                     &84, &40000000
-0.5          &80            1            &80000000                     &80, &80000000
Even though the text explicitly says this is an Acorn BASIC format, it does not match either of the BBC 6502 or BBC non-6502 formats mentioned further down. It still looks to me more like the storage format of Microsoft BASICs.

It seems to me there are four possible formats for a one byte excess-128 exponent and a four byte mantissa:

Code: Select all

   Addr+0   Addr+1  Addr+2  Addr+3  Addr+4

   Exp      Man(L)  Man     Man     Man(M)         not known to be used
 
   Exp      Man(M)  Man     Man     Man(L)         Microsoft BASICs (and Acorn BASICs ?)

   Man(M)   Man     Man     Man(L)  Exp            BBC 6502 BASICs

   Man(L)   Man     Man     Man(M)  Exp            BBC non-6502 BASICs
If they don't have common names, it seems to me they could be systematized by placement of parts and mantissa ordering. Something like "EM40-le" might mean "exponent followed by mantissa, 40 total bits, least significant byte of mantissa first". So Microsoft BASICs would be "EM40-be", BBC 6502 as "ME40-be", and BBC non-6502 as "ME40-le". Not really obvious at first glance, but possible.

Less systematic but possibly more understandable would be "MS40", "BBC40-be" and "BBC40-le" for the three formats actually known to be used.

Anyway, it does seem to me that interpreting the mantissa as an integer when the exponent is zero might be kind of handy in any FOR loop that counts by integers (most of them) but still be flexible enough to handle non-integer values as well. Do FOR loops run faster with integer increments than non-integer increments?

Re: BBC floating point formats

Posted: Fri Oct 04, 2024 4:39 pm
by teamtempest
There is this http://8bs.com/basic/basic4-bf24.htm, which seems to be further evidence that BBC BASIC's floating point numbers use the same format as Microsoft BASICs. The prompts I've given to Google haven't yet turned up any concrete evidence of any other format.

I did find that whether a value is stored in floating point format or as a four-byte integer with a zero exponent depends on whether or not there is a decimal point in the number as it appears in the program text. If there is, it is stored as floating point. If not, it is stored as an integer, although I suppose it's still easy enough to write an integer value that won't fit into 32 bits. The overall speedup to, yes, FOR loops seems to make the effort to distinguish between them worth doing.

Re: BBC floating point formats

Posted: Fri Oct 04, 2024 5:47 pm
by BigEd
Should be fairly easy to test with programs...

BBC Basic here
PET Basic here

In both cases they print out this (except BBC Basic has 163 as the final byte)

Code: Select all

  130   73   15   218   162
  130   9   15   218   162
  131   9   15   218   162
BBC Basic

Code: Select all

100 A=7
110 DEF FNP(X)=?X
115 T=TOP
120 A=4*ATN(1)
130 GOSUB 190
140 A=A-1
150 GOSUB 190
160 A=A+A
170 GOSUB 190
180 END
190 PRINT " ";FNP(T+3);
200 PRINT " ";FNP(T+4);
210 PRINT " ";FNP(T+5);
220 PRINT " ";FNP(T+6);
230 PRINT " ";FNP(T+7)
240 RETURN
PET Basic (editor here)

Code: Select all

120 A=4*ATN(1)
130 GOSUB 190
140 A=A-1
150 GOSUB 190
160 A=A+A
170 GOSUB 190
180 END
190 PRINT " ";PEEK(PEEK(42)+256*PEEK(43)+2);
200 PRINT " ";PEEK(PEEK(42)+256*PEEK(43)+3);
210 PRINT " ";PEEK(PEEK(42)+256*PEEK(43)+4);
220 PRINT " ";PEEK(PEEK(42)+256*PEEK(43)+5);
230 PRINT " ";PEEK(PEEK(42)+256*PEEK(43)+6)
240 RETURN
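
The three printed rows can be checked by hand with a few lines of Python, assuming the bytes come out exponent first, followed by the mantissa most significant byte first with the sign in its top bit. The function name is invented for this sketch. The rows should come out close to pi, pi-1 and 2*(pi-1), matching lines 120, 140 and 160 of the programs:

```python
def decode_five_bytes(row):
    """Decode (exponent, m0, m1, m2, m3) as printed by the test programs."""
    exp_byte, m0, m1, m2, m3 = row
    if exp_byte == 0:
        return 0.0
    sign = -1.0 if m0 & 0x80 else 1.0
    # Restore the implied leading 1 that the sign bit overwrote
    mantissa = ((m0 | 0x80) << 24) | (m1 << 16) | (m2 << 8) | m3
    return sign * (mantissa / 2**32) * 2.0 ** (exp_byte - 128)


rows = [
    (130, 73, 15, 218, 162),
    (130, 9, 15, 218, 162),
    (131, 9, 15, 218, 162),
]
for row in rows:
    print(row, "->", decode_five_bytes(row))
```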

Re: BBC floating point formats

Posted: Sat Oct 05, 2024 2:27 pm
by teamtempest
Thank you, Ed.

The slight difference between the two approximations is probably due to the way arctan() is calculated. Both machines use a polynomial series, but not the same series and not the same way.

Since both formats are really the same format, are there any other differences that need to be taken into account?

My first instinct would be no, but this https://www.c64-wiki.com/wiki/Floating_point_arithmetic and this http://www.riscos.com/support/developer ... icrep.html disagree with each other regarding the limits of what can be represented.

For the C64: min about 2.9E-39, max about 1.7E+38

For the BBC: min about 1.5E-39, max about 1.7 E+38

Then there are my own thoughts on this representation, which disagree with both:

Code: Select all

min  = 2**-127,               about 5.877471754111e-39
max = (2 - 2**-31) * 2**127,  about 3.402823668417e+38
I'll have to play around with all of these numbers and see what they look like in floating point.

Re: BBC floating point formats

Posted: Sat Oct 05, 2024 3:15 pm
by BigEd
The 1.7E38 checks out, according to

Code: Select all

a=65535
REPEAT
PRINTa
a=a+a
UNTIL0
Your idea is that -127 to 127 is the binary exponent range, but evidently not. That's only 255 values, which means one would be missing.

As for the smallest value, this program in jsbeeb (BBC Basic) suggests the 2.9E-39 is the right figure, which is different from the wiki you link.

Code: Select all

a=1
REPEAT
PRINTa
a=a/2
UNTILa=0
Then again, the wiki isn't talking about the 6502 versions from 2 to 4, but versions 5 and 6, which are I think ARM versions.

Re: BBC floating point formats

Posted: Sat Oct 05, 2024 3:38 pm
by teamtempest
I think you're right. My original limits are off by one exponent. They should be:

Code: Select all

max = (2 - 2**-31) * 2**126    # about 1.701411834e+38
min = 2**-128                  # about 2.938735877e-39
These input values produce these floating point values:
  • max = FF 7F FF FF FF
    min = 01 00 00 00 00
which would be the full and correct positive range for this format. Well, except for zero, anyway.
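
For anyone who wants to re-derive those limits, here is a short Python check that works directly from the field limits: largest exponent byte with an all-ones mantissa, and smallest non-zero exponent byte with only the implied top bit set. The variable names are just for this sketch, and it assumes the excess-128, fraction-between-0.5-and-1 reading of the format:

```python
BIAS = 0x80

# Largest: exponent &FF, mantissa &FFFFFFFF once the implied top bit is restored
largest = (0xFFFFFFFF / 2**32) * 2.0 ** (0xFF - BIAS)

# Smallest positive: exponent &01, mantissa &80000000 (only the implied bit set)
smallest = (0x80000000 / 2**32) * 2.0 ** (0x01 - BIAS)

print(f"{largest:.9e}")
print(f"{smallest:.9e}")
```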

Re: BBC floating point formats

Posted: Tue Oct 08, 2024 3:19 pm
by teamtempest
...and of course, if I give it any thought, a zero exponent and a four-byte integer literal is not some special representation of a BBC BASIC integer, it is in fact the native representation. I.e., all integers are represented that way, and distinguishing between floating point and integer representations of a number is no more difficult or cumbersome than in any system that uses both. Plus both are the same size internally, making them simpler to store and scan, and have a range of about minus 2 billion to plus 2 billion, making exact integer calculations in this range much easier than in systems which use only two-byte integers.

Re: BBC floating point formats

Posted: Tue Oct 08, 2024 3:21 pm
by BigEd
(I think I'm right in saying that BBC Basic doesn't do this, though - the versions I'm familiar with will use 4 byte integers, and will always normalise floats.)

Re: BBC floating point formats

Posted: Wed Oct 09, 2024 2:45 pm
by teamtempest
You're far more familiar with this than I am, but doesn't the page I linked to at the very top of this discussion say that "some" BBC BASICs use a zero exponent byte to flag that the remaining four bytes hold a literal integer value? In Microsoft BASICs I'm familiar with, a zero value in the exponent byte flags that the whole number should be considered zero, regardless of the value of any other byte. The BBC convention (if it exists) strikes me as a clever way to get a lot of use out of what Microsoft would throw away or otherwise ignore, and on something that has to be accounted for somewhere in any case. The only advantage of the Microsoft method that I can see off-hand is that it's faster than the BBC method to tell whether the value represented is zero (Microsoft checks only one byte, BBC would have to check all five to be sure).
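
The difference between the two zero tests is small enough to show in a few lines of Python. Both functions are illustrative only and assume the exponent is the first of the five bytes, as in the Beebwiki examples:

```python
def is_zero_microsoft(five_bytes):
    """Zero exponent alone means zero; the mantissa bytes are ignored."""
    return five_bytes[0] == 0


def is_zero_integer_convention(five_bytes):
    """Zero exponent flags an integer, so every byte must be zero for the value to be zero."""
    return all(b == 0 for b in five_bytes)


value_128 = bytes([0x00, 0x00, 0x00, 0x00, 0x80])   # 128 in the integer form
print(is_zero_microsoft(value_128))
print(is_zero_integer_convention(value_128))
```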

The author of that page has contributed here before. Perhaps he can shed some light on this.

Re: BBC floating point formats

Posted: Wed Oct 09, 2024 3:35 pm
by BigEd
You're quite right, the Beebwiki page does say that. I would wonder if that's an empirical thing - an observation - or a specified behaviour. And is it true only for RTR's Z80 Basics, I wonder. How about x86, or ns32k, or ARM, or the more recent portable version...

And if we think it works, how would we test it? Would need to test all arithmetic and logical functions. Or, perhaps, read the code!

It is a question though, and I should have qualified what I said, which really only applies to the 6502 versions.

Re: BBC floating point formats

Posted: Fri Oct 11, 2024 4:29 pm
by teamtempest
Well there's this: https://www.cl.cam.ac.uk/~jrh13/devnote ... tml#sec211, which discusses (section 2.11, The Floating Point Package) some of the finer points of using floating point on a Z88. The format is the one described as "BBC non-6502 BASICs" by that Beebwiki page. So that's one concrete example.

There's also this: https://github.com/jblang/bbcbasic-z80/ ... er/fpp.asm, the source code on GitHub of RT Russell's Z80 floating point package. Again using that "BBC non-6502 BASICs" format. Plus showing how to account for a zero value exponent flagging a four-byte integer value in the mantissa, in routines that can deal with both.