6502.org • Convert numbers to C64 floating point compact form? - Page 2

Posted: Wed Jan 26, 2011 11:24 am

by paul_nicholls

@BigEd and dclxvi: thanks guys, I will have a look at the links

@everyone else: thanks all, I will see if I can come up with something that works for a number in string format to the c64 5 byte fp format...

cheers,
Paul

Posted: Wed Jan 26, 2011 6:53 pm

by BigEd

I had a thought about this. You're writing in a high level language, so you have floats and doubles already available, and these will already be in a binary form. So no accuracy is lost if we operate on numbers by doubling and halving, and so, without making any other assumptions about your programming environment, we can pick out the information we need.

Here's some awk, which you can read as pseudocode - it gives the right answer for BBC Basic and can surely be adjusted for C64. It doesn't attempt zero or negative numbers but that should be a simple adjustment.

Code:

awk 'BEGIN {
  x=4*atan2(1,1);  # pi
  m=0;     # we'll shift the mantissa in, eventually
  e=129;  # the exponent offset
  while(x>1){  # repeatedly halve if x is large
    e++;   # account for the exponent
    x/=2
  }
  while(x<1){  # repeatedly double if x is small
    e--;   # account for the exponent
    x*=2
  }
  x-=1;  # remove the leading 1 bit
  for(i=0;i<31;i++){ #extract 31 bits of mantissa by repeated doubling
    if(x>=1){
      m++;  # adjust the mantissa
      x-=1
    };
    x*=2;
    m*=2  # shift the mantissa to the left
  }
  # we're finished - x might have a small residue - we didn't try to round
  printf "%s,%02x,%04x\n",x,e,m
}'

Cheers, Ed

edit: initialise e to 129. BBC Basic's representation of pi is 82490fdaa2 and this awk prints out 0.130505,82,490fdaa2
edit: hmm, will always finish by doubling, so LSB will always be zero. oops.
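
For anyone following along without awk, here is a minimal Python sketch of the same halve/double idea. It is not Ed's code: the name `split_binary` is made up for illustration, it assumes a positive, finite input, and it does no exponent range check. It doubles before testing each bit, so all 31 mantissa bits are collected and the last one is not lost.

```python
import math

def split_binary(x):
    """Split a positive float into (exponent byte, 31-bit mantissa, residue).

    Hypothetical helper: uses the excess-129 exponent from the awk above and
    leaves the top (sign) bit of the 32-bit mantissa field clear.
    """
    if not (x > 0 and math.isfinite(x)):
        raise ValueError("sketch only handles positive, finite numbers")
    e = 129
    while x >= 2:      # repeatedly halve if x is large
        e += 1
        x /= 2
    while x < 1:       # repeatedly double if x is small
        e -= 1
        x *= 2
    x -= 1             # remove the leading 1 bit; now 0 <= x < 1
    m = 0
    for _ in range(31):
        x *= 2         # bring the next fraction bit up to the units place
        m *= 2         # make room for it in the mantissa
        if x >= 1:
            m += 1
            x -= 1
    return e, m, x     # x is the unrounded residue, 0 <= x < 1

e, m, residue = split_binary(math.pi)
print(f"{residue:.6f},{e:02x},{m:08x}")   # 0.130505,82,490fdaa2
```

Halving, doubling and subtracting 1 are all exact on binary doubles, which is why no accuracy is lost along the way.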

Posted: Wed Jan 26, 2011 7:42 pm

by paul_nicholls

Wow! Thanks so much Ed

This looks easy for me to convert to Pascal to use in my compiler

cheers,
Paul

Posted: Wed Jan 26, 2011 7:47 pm

by BigEd

I think the final missing LSB (and the rounding) can be fixed up after the for loop with something like

Code:

if(x>1)m++

for truncation or

Code:

if(2*x>1)m++

for rounding.

I don't think my method would work well if you had less precision in your HLL than you want in your output format: using doubles should be OK though.

Finally, I note that the comparisons against '1' might need to be strict, or loose: a detail which I didn't think hard about!

Cheers
Ed
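
The truncation versus rounding choice, including the strict-or-loose comparison question, is easier to see with the mantissa scaled to an integer. The following is only a sketch under stated assumptions: `cbm_exponent_mantissa` is a hypothetical name, the input must be positive and finite, ties are rounded upward, and there is no exponent range check. It relies on `math.frexp`, which returns a fraction in [0.5, 1), so the bias is 128 here rather than the 129 used with a [1, 2) mantissa.

```python
import math

def cbm_exponent_mantissa(value, rounding=True):
    """Return (exponent byte, 32-bit mantissa with its leading 1 kept).

    Hypothetical helper for positive, finite values only.
    """
    frac, exp = math.frexp(value)   # value == frac * 2**exp, 0.5 <= frac < 1
    scaled = frac * 2.0**32         # exact: only the binary point moves
    m = int(scaled)                 # truncation keeps the top 32 bits
    if rounding and scaled - m >= 0.5:
        m += 1                      # round to nearest, ties upward
        if m == 1 << 32:            # carry out of the mantissa...
            m >>= 1                 # ...renormalise to 0x80000000
            exp += 1                # ...and bump the exponent
    return exp + 128, m

for mode in (False, True):
    e, m = cbm_exponent_mantissa(1 / 3, rounding=mode)
    print(f"rounding={mode}: {e:02x} {m:08x}")
# rounding=False: 7f aaaaaaaa
# rounding=True: 7f aaaaaaab
```

The carry branch matters for values such as 2 - 2**-40, where rounding up overflows the mantissa and the result has to become exactly 2.0.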

Posted: Wed Jan 26, 2011 8:05 pm

by BigEd

I'm going to indulge myself a bit, because the same points often come up when floating point is discussed: here's a straightforward guide, here's Goldberg's in-depth classic article ("What Every Computer Scientist Should Know About Floating-Point Arithmetic") and here's something straightforward from the Python docs.

(I see Oracle have rearranged all of Sun's content, so that classic article has become difficult to find. Boo.)

Posted: Wed Jan 26, 2011 9:53 pm

by paul_nicholls

Well, in my HLL I have access to single/double/extended so I should be ok when doing the conversions

Thanks again mate, much appreciated!

cheers,
Paul

Posted: Wed Jan 26, 2011 11:17 pm

by paul_nicholls

@BigEd:

Well, I have converted your conversion code over to Pascal

Code:

program C64_Math_Test;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TFloatFunc = (
    ffNone,
    ffRound,
    ffTrunc
  );

function atan2(y : extended ; x : extended) : extended ; assembler;
asm
  fld [y];
  fld [x];
  fpatan
end;

procedure FloatToC64Float(var x: Extended; var e,m: Integer; FloatFunc: TFloatFunc);
var
  i: Integer;
begin
  m := 0;             // we'll shift the mantissa in, eventually
  e := 129;           // the exponent offset
  while (x > 1) do
  begin               // repeatedly halve if x is large
    e := e + 1;              // account for the exponent
    x := x / 2;
  end;

  while (x < 1) do
  begin               // repeatedly double if x is small
    e := e - 1;       // account for the exponent
    x := x * 2;
  end;

  x := x - 1;         // remove the leading 1 bit
  for i := 0 to 30 do
  begin               //extract 31 bits of mantissa by repeated doubling
    if (x >= 1) then
    begin
      m := m + 1;     // adjust the mantissa
      x := x - 1;
    end;
    x := x * 2;
    m := m * 2;       // shift the mantissa to the left
  end;
  // we're finished - x might have a small residue - we didn't try to round

  case FloatFunc of
    ffTrunc : if (x > 1)   then m := m + 1;  // truncation, as Ed suggested
    ffRound : if (2*x > 1) then m := m + 1;  // rounding
  else
    // ffNone: leave the mantissa as extracted
  end;
end;

var
  x: Extended;
  e,m: Integer;
begin
  x := 4*atan2(1,1);  // pi

  FloatToC64Float(x,e,m,ffNone);
  WriteLn(LowerCase(Format('%f,%02x,%04x',[x,e,m])));
  ReadLn;
end.

And I get almost the same answer as you, which is very good I guess

Code:

0.13,82,490fdaa2

The only difference is in how much is left in x...

I should now try outputting some number converted using this routine as hex bytes to insert into some 6510 assembly code to see if it works using the c64 floating point routines

Wish me luck!

PS. if/when I get something going I might post my little compiler so you all can have a play

cheers,
Paul

Re: Convert numbers to C64 floating point compact form?

Posted: Wed Jan 26, 2011 11:57 pm

by BigDumbDinosaur

Thowllly wrote:

9 significant digits, not 10 or 7, BDD might be thinking of single precision floating point (32bit fp) which has 7 (decimal) digit precision.

Yeah, I was dozing off at the wheel when I posted that. Damned chemo...

Posted: Thu Jan 27, 2011 12:00 am

by paul_nicholls

No need to apologise

Chemo? bummer...I hope you get well soon

cheers,
Paul

Posted: Wed Jun 08, 2011 2:09 am

by paul_nicholls

Hi all,
after a long hiatus, I am going to work on my C64 compiler again, and I think I have now come up with an answer to my problem...

I was googling again, and found this page:

http://www.softwolves.com/arkiv/cbm-hackers/7/7617.html

Somehow I had missed it previously (D'OH!)

I have translated the C code to Pascal, and I am going to try it out in my compiler:

Code:

procedure FloatToC64Float(num: Double);
// converted from original code found here:
// http://www.softwolves.com/arkiv/cbm-hackers/7/7617.html
//
var
  i,sign,bit,byte_val,flag,count,ei: Integer;
  digit: array[0..4] of Byte;
  a: Double;
begin
  if Abs(num) < 0.000001 then
  begin
    WriteLn('Oh come on.  Zero! 00 00 00 00 00');
    Exit;  // nothing more to convert
  end;

  WriteLn(Format('Original number = %.10f',[num]));
  if (num < 0) then
  begin
    sign := 128;
    num  := -num;
  end
  else
    sign := 0;

  a := 1;
  for i := 0 to 126 - 1 do		// a = 2^126
  begin
      a := a * 2;
  end;

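  byte_val := 0;
  flag     := 0;
  count    := 0;
  ei       := 0;
  for i := 0 to 4 do              // the mantissa bytes are shifted in below,
    digit[i] := 0;                // so they must start out as zero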

  i := 126;
  while  (i >= -128) and (byte_val < 4) do
  begin
    if num >= a then
    begin
      bit := 1;
      if flag = 0 then
        ei := i;

      flag := 1;
    end
    else
      bit := 0;

    if flag = 1 then
    begin
      digit[byte_val] := digit[byte_val] * 2 + bit;
      Inc(count);
      if count > 7 then
      begin
        count := 0;
        Inc(byte_val);
      end;
      num := num - a * bit;
    end;
    a := a / 2;
    Dec(i);
  end;

  digit[0] := (digit[0] - 128) or sign;

  Inc(ei,129);

  WriteLn(Format('C64 float = $%.2x $%.2x $%.2x $%.2x $%.2x',
                 [ei,digit[0],digit[1],digit[2],digit[3]]));
end;

When I passed in the value for PI calculated into the routine like I did in the other code I tried back in January:

Code:

FloatToC64Float(4*ArcTan2(1,1))

I get the same output, so I have a good feeling about it

cheers,
Paul
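
A quick way to gain more than a good feeling about such output is to decode the five bytes again and compare with the original number. Below is a minimal Python sketch of a decoder; the name `cbm_mem_to_float` is hypothetical, and it assumes the layout used in this thread (exponent byte in excess-128 form for a 0.1mmm mantissa, sign in bit 7 of the first mantissa byte, exponent 0 meaning zero).

```python
def cbm_mem_to_float(data):
    """Decode a 5-byte memory-format float (exponent, then 4 mantissa bytes)."""
    exponent, m0, m1, m2, m3 = data
    if exponent == 0:                  # assumed convention: exponent 0 is zero
        return 0.0
    sign = -1.0 if m0 & 0x80 else 1.0
    # Put the implied leading 1 back where the sign bit was stored.
    mantissa = ((m0 | 0x80) << 24) | (m1 << 16) | (m2 << 8) | m3
    # The mantissa is a 32-bit fraction, hence the extra -32.
    return sign * mantissa * 2.0 ** (exponent - 128 - 32)

print(cbm_mem_to_float(bytes.fromhex("82490fdaa2")))   # close to pi
print(cbm_mem_to_float(bytes.fromhex("8420000000")))   # 10.0
```

The decoded value for pi differs from the double by roughly 1e-10, which is about what a 32-bit mantissa allows.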

Posted: Thu Jun 09, 2011 11:05 pm

by paul_nicholls

Just so you know, I have now created two routines to translate floating point numbers to C64 5-byte format (memory), and 6-byte format (FP accumulator registers) after reading this page:

ftp://n2dvm.com/Commodore/Commie-CDs/Ka ... se/197.htm

If anyone is interested, here is my code - feel free to use it if you wish

Code:

type  
  PC64MemFloat = ^TC64MemFloat;
  TC64MemFloat = packed record
    Exponent: Byte;
    Mantissa: array[0..3] of Byte;
  end;

  PC64RegFloat = ^TC64RegFloat;
  TC64RegFloat = packed record
    Exponent: Byte;
    Mantissa: array[0..3] of Byte;
    Sign: Byte;
  end;

Code:

procedure FloatToC64Float(num: Double; out aC64Float: TC64MemFloat); overload;
// converts a floating point number to 5-byte memory FP representation: exponent (1), mantissa (4)
// ftp://n2dvm.com/Commodore/Commie-CDs/Kazez%20FREE-CD/c64-knowledge-base/197.htm
// note: the matching C64FloatToStr overload must be declared before this procedure
var
  ExpCount: Integer;
  SignBit: Integer;
  Index: Integer;
begin
  Write(Format('%.10f = ',[num]));

  // save sign bit
  SignBit := 0;
  if num < 0 then
  begin
    SignBit := 128;
    num := -num;
  end;

  if Abs(num) < 0.000001 then
  begin
    aC64Float.Exponent    := 0;
    aC64Float.Mantissa[0] := 0;
    aC64Float.Mantissa[1] := 0;
    aC64Float.Mantissa[2] := 0;
    aC64Float.Mantissa[3] := 0;

    WriteLn(C64FloatToStr(aC64Float));
    Exit;
  end;

  // calculate exponent byte
  ExpCount := 0;
  if num < 1 then
    while num < 1 do
    begin
      Dec(ExpCount);
      num := num * 2;
    end
  else
  if num >= 2 then
    while num >= 2 do
    begin
      Inc(ExpCount);
      num := num / 2;
    end;
  aC64Float.Exponent := 129 + ExpCount;

  num := num / 2; // 'un-normalize' it for further processing (immediate mantissa)

  // calculate mantissa digits
  for Index := 0 to 3 do
  begin
    num := num * 256;
    aC64Float.Mantissa[Index] := Trunc(num);
    num := Frac(num);
  end;

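  // round last mantissa digit when required, carrying into the higher digits
  if num > 0.5 then
  begin
    Index := 3;
    while (Index >= 0) and (aC64Float.Mantissa[Index] = $FF) do
    begin
      aC64Float.Mantissa[Index] := 0;
      Dec(Index);
    end;
    if Index >= 0 then
      Inc(aC64Float.Mantissa[Index])
    else
    begin
      // every digit was $FF: the mantissa rolls over to 0.5 with exponent + 1
      aC64Float.Mantissa[0] := $80;
      Inc(aC64Float.Exponent);
    end;
  end;

  // include sign bit in first mantissa digit
  aC64Float.Mantissa[0] := (aC64Float.Mantissa[0] and $7F) or SignBit;

  WriteLn(C64FloatToStr(aC64Float));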
end;

Code:

procedure FloatToC64Float(num: Double; out aC64Float: TC64RegFloat); overload;
// converts a floating point number to 6-byte register FP representation: exponent (1), mantissa (4), separate sign (1)
// ftp://n2dvm.com/Commodore/Commie-CDs/Kazez%20FREE-CD/c64-knowledge-base/197.htm
// note: the matching C64FloatToStr overload must be declared before this procedure
var
  ExpCount: Integer;
  SignBit: Integer;
  Index: Integer;
begin
  Write(Format('%.10f = ',[num]));

  // save sign bit
  SignBit := 0;
  if num < 0 then
  begin
    SignBit := 128;
    num := -num;
  end;

  if Abs(num) < 0.000001 then
  begin
    aC64Float.Exponent    := 0;
    aC64Float.Mantissa[0] := 0;
    aC64Float.Mantissa[1] := 0;
    aC64Float.Mantissa[2] := 0;
    aC64Float.Mantissa[3] := 0;
    aC64Float.Sign        := 0;

    WriteLn(C64FloatToStr(aC64Float));
    Exit;
  end;

  // calculate exponent byte
  ExpCount := 0;
  if num < 1 then
    while num < 1 do
    begin
      Dec(ExpCount);
      num := num * 2;
    end
  else
  if num >= 2 then
    while num >= 2 do
    begin
      Inc(ExpCount);
      num := num / 2;
    end;
  aC64Float.Exponent := 129 + ExpCount;

  num := num / 2; // 'un-normalize' it for further processing (immediate mantissa)

  // calculate mantissa digits
  for Index := 0 to 3 do
  begin
    num := num * 256;
    aC64Float.Mantissa[Index] := Trunc(num);
    num := Frac(num);
  end;

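  // round last mantissa digit when required, carrying into the higher digits
  if num > 0.5 then
  begin
    Index := 3;
    while (Index >= 0) and (aC64Float.Mantissa[Index] = $FF) do
    begin
      aC64Float.Mantissa[Index] := 0;
      Dec(Index);
    end;
    if Index >= 0 then
      Inc(aC64Float.Mantissa[Index])
    else
    begin
      // every digit was $FF: the mantissa rolls over to 0.5 with exponent + 1
      aC64Float.Mantissa[0] := $80;
      Inc(aC64Float.Exponent);
    end;
  end;

  // include sign bit in sign part (Mantissa only has indices 0..3)
  aC64Float.Sign := SignBit;

  WriteLn(C64FloatToStr(aC64Float));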
end;

Code:

function C64FloatToStr(const aC64Float: TC64MemFloat): String; overload;
begin
  //output C64 mem floating point as hex (Exponent, Mantissa)
  Result := Format('$%.2x $%.2x $%.2x $%.2x $%.2x (mem FP)',
                 [aC64Float.Exponent,
                  aC64Float.Mantissa[0],
                  aC64Float.Mantissa[1],
                  aC64Float.Mantissa[2],
                  aC64Float.Mantissa[3]]);
end;

Code:

function C64FloatToStr(const aC64Float: TC64RegFloat): String; overload;
begin
  //output C64 reg floating point as hex (Exponent, Mantissa, Sign)
  Result := Format('$%.2x $%.2x $%.2x $%.2x $%.2x $%.2x (reg FP)',
                 [aC64Float.Exponent,
                  aC64Float.Mantissa[0],
                  aC64Float.Mantissa[1],
                  aC64Float.Mantissa[2],
                  aC64Float.Mantissa[3],
                  aC64Float.Sign]);
end;

cheers,
Paul
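
The only difference between the two layouts is where the sign lives: the 5-byte memory form overwrites the mantissa's leading 1 with the sign bit, while the 6-byte register form keeps the leading 1 and stores the sign in a byte of its own. A minimal Python sketch of that distinction follows. It is not a port of the Pascal above; `to_c64_mem`, `to_c64_reg` and `_split` are hypothetical names, rounding is to nearest with ties upward, and values too small for the format are assumed to flush to zero.

```python
import math

def _split(value):
    """Return (exponent byte, 32-bit mantissa); (0, 0) for zero or underflow."""
    if value == 0:
        return 0, 0
    frac, exp = math.frexp(abs(value))   # 0.5 <= frac < 1
    m = int(frac * 2.0**32 + 0.5)        # round to 32 bits
    if m == 1 << 32:                     # rounding carried out of the top
        m, exp = 1 << 31, exp + 1
    e = exp + 128
    if e > 255:
        raise OverflowError("too large for the C64 format")
    return (e, m) if e >= 1 else (0, 0)

def to_c64_mem(value):
    """5 bytes: exponent, then mantissa with the sign replacing its top bit."""
    e, m = _split(value)
    if e == 0:
        return bytes(5)
    m = (m & 0x7FFFFFFF) | (0x80000000 if value < 0 else 0)
    return bytes([e]) + m.to_bytes(4, "big")

def to_c64_reg(value):
    """6 bytes: exponent, full mantissa (leading 1 kept), separate sign byte."""
    e, m = _split(value)
    if e == 0:
        return bytes(6)
    return bytes([e]) + m.to_bytes(4, "big") + bytes([0x80 if value < 0 else 0])

print(to_c64_mem(math.pi).hex())   # 82490fdaa2
print(to_c64_reg(-1.0).hex())      # 818000000080
```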

Posted: Sun Nov 27, 2011 10:21 am

by jgharston

6502 code to do this also available here: http://mdfs.net/Info/Comp/6502/ProgTips/Denormal

Posted: Sun Nov 27, 2011 7:39 pm

by paul_nicholls

Thanks for the link jgharston

It might come in handy...not sure if I will use it yet though

cheers,
Paul

Posted: Fri Jun 12, 2015 3:48 pm

by Hobbit1972

Sorry for necro-bumping, but there is a different approach using the standard IEEE floating-point numbers available on most systems and high-level language libraries:

Code:

Single:	4 bytes	32: seee eeee  24: emmm mmmm  16: mmmm mmmm  08: mmmm mmmm
Double:	8 bytes	64: seee eeee  56: eeee mmmm  48: mmmm mmmm  40: mmmm mmmm  32: mmmm mmmm  24: mmmm mmmm  16: mmmm mmmm  08: mmmm mmmm
C64Flp:	5 bytes	16: mmmm mmmm  24: mmmm mmmm  32: mmmm mmmm  48: smmm mmmm  08: eeee eeee

s: sign bit
m: mantissa
e: exponent

Caveat: the C64 byte order is reversed compared to IEEE!

If you have a string containing the floating-point number, you can convert it to CBM-floats using standard library functions, here in freepascal:

Code:

function Str2C64Flp(floatstr : string) : string; {result will contain 5-byte C64-float}
var	sr : record {those records for juggling the bytes}
	 case integer of
		1 : (s : single);
		2 : (b : array[0..3] of byte);
		3 : (l : longint);
		4 : (w1: word; w2: word);
	end;
	dr : record
	 case integer of
		1 : (d : double);
		2 : (b : array[0..7] of byte);
		3 : (c : comp);
		4 : (l1,l2 : longint);
	end;
	w	: word;
	vorzeichen : byte; {sign}
	exponent : byte;
	mantisse : longint;

begin
 dr.d := fval(floatstr,w); {one might want to catch some conversion errors here}
 sr.s := dr.d;
 vorzeichen := sr.b[3] shr 7;
 sr.l := sr.l shl 1;
 exponent := sr.b[3]+2;
 mantisse := (((dr.l2 shl 12) or (dr.l1 shr 20)) shr 1) + (vorzeichen *$80000000);
 sr.l := mantisse;
 Str2C64Flp := chr(exponent)+chr(sr.b[3])+chr(sr.b[2])+chr(sr.b[1])+chr(sr.b[0]);
end;

(Stumbled over this old thread to see if someone has already done a 65xx Pascal compiler; mine has been a stub for years now.)
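
The same bit-repacking idea can be tried in Python with the `struct` module. This is a sketch under stated assumptions, not a translation of the Pascal above: the name `ieee_double_to_c64` is hypothetical, the exponent is taken from the double itself rather than from an intermediate single, the mantissa is truncated rather than rounded, and zero, denormals and underflow are all assumed to map to five zero bytes.

```python
import struct

def ieee_double_to_c64(value):
    """Repack the bits of an IEEE 754 double as a 5-byte C64 float."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    sign = bits >> 63
    ieee_exp = (bits >> 52) & 0x7FF          # biased by 1023
    fraction = bits & ((1 << 52) - 1)        # 52 bits, leading 1 is implied
    if ieee_exp == 0x7FF:
        raise ValueError("infinities and NaNs have no C64 equivalent")
    # IEEE stores 1.f * 2**k, the C64 stores 0.1f * 2**(k+1) with bias 128.
    c64_exp = ieee_exp - 1023 + 129
    if ieee_exp == 0 or c64_exp < 1:
        return bytes(5)                      # zero, denormal or underflow
    if c64_exp > 255:
        raise OverflowError("too large for the C64 format")
    # Keep the top 31 fraction bits; the sign takes the place of the leading 1.
    mantissa = (sign << 31) | (fraction >> 21)
    return bytes([c64_exp]) + mantissa.to_bytes(4, "big")

print(ieee_double_to_c64(3.141592653589793).hex())   # 82490fdaa2
print(ieee_double_to_c64(-1.0).hex())                # 8180000000
```

Because the low 21 fraction bits are simply dropped, a value such as 1/3 comes out as 7f 2a aa aa aa rather than the rounded 7f 2a aa aa ab.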

Posted: Mon Jun 15, 2015 10:38 am

by paul_nicholls

@Hobbit1972 thanks for the info

I haven't worked on my compiler for ages now, but if I do this could come in handy...not that it is the memory compact form (4 bytes) I was really after though.

cheers,
Paul