6502.org
http://forum.6502.org/

Notes on Weitek's memory-mapped FPUs
http://forum.6502.org/viewtopic.php?f=10&t=4413

Author:  BigEd [ Tue Feb 14, 2017 12:31 pm ]
Post subject:  Notes on Weitek's memory-mapped FPUs

Here are some musings which could relate to an FPU for a 6502 system. (We've discussed some FPU products here, but I'm thinking of projects rather than products.)

Back when x86 first became popular, there were two approaches to floating point hardware. Intel made the x87 devices, which were initially like a coprocessor, taking instructions from the instruction stream and having a small register stack inside. Weitek made their own devices, initially a chip set and later integrated, which connected as memory-mapped devices: a much more time-efficient interface. They also had a large internal register file and a command queue, so several stages of a computation could be in flight while the CPU did something else. They lacked some things which Intel offered: an 80-bit internal format for extra accuracy, support for denormalised numbers, and transcendental functions.

The memory-mapping was a bit large for our purposes: a 64k block. That gives 16 address lines, together with (at least) 8 data lines, for each action. Apparently 6 of the address lines were used as a command, the rest as register specifiers.

In a 6502 system, it would be reasonable for an I/O device to occupy as much as 256 addresses but perhaps not more than that. So we'd communicate 16 bits for each write - 8 bits from the address bus and 8 from the databus.

Any ideas for a nice architecture?

Ref:
Weitek 3167 datasheet (pdf).
Overview of FPUs in x86 land: coproc.txt.
Wikipedia on Weitek.

Attachment:
File comment: From 1988 datasheet at bitsavers
WTL 3167 FPU.png

Author:  Rob Finch [ Wed Feb 15, 2017 3:57 am ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

Are you suggesting the Weitek FPU could be interfaced to a 6502? There would have to be several latches for the 16-bit addressing and controls. Otherwise some other sort of chip (an FPGA, or a micro-controller with an FPU) would be interfaced.
256 bytes is only enough room for 32 x 64 bit registers to be memory mapped, not including control and operation registers. So the chip would have to have about 30 or fewer registers or a window into the register set. If using block rams in an FPGA it could have many more registers which would be good for partitioning them according to task.

Even a simple FP unit is likely to require several times more logic resources than the 6502.

Author:  BigEd [ Wed Feb 15, 2017 9:37 am ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

I was thinking of borrowing the architectural ideas, for an FPU suitable for memory-mapping into a 16-bit address space like the 6502's. Agreed, the FPU is likely to be bigger than a 6502, if it were sharing the FPGA, but that's not terribly important. It's just as interesting if the FPGA contains only the FPU and connects to the bus of a physical 6502 as a peripheral.

I don't think a huge register file is so important. Indeed, a stack could be interesting. The point of the memory-mapping is not only to give a window into the register file, but also to encode commands. Reading and writing the register file are just two commands - there must be 8 or 16 operations that an FPU can usefully do, and of course you need some bits to specify registers.

A simple sketch would be 4 bits for operation, 4 bits for destination, and 4 bits for each of the two sources. So, a 16-element register file would fit there. As you say, you could have multiple banks - in one of my refs you'll see an FPU which had 4 banks, specifically so it could operate on 4x4 matrices and 4x1 vectors, which is very handy for 3D computations.
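
To make that 4+4+4+4 split concrete, here is a minimal sketch in Python of how the 16 bits might be divided between the address offset and the data byte of a single 6502 store. The field order, the opcode numbers and the function names are all assumptions for illustration; none of this is Weitek's actual encoding.

```python
# Hypothetical 16-bit command layout (an assumption, not Weitek's encoding):
#   low address byte (offset in a 256-byte window): [op:4][dst:4]
#   data byte written to that address:              [src1:4][src2:4]

OPS = {"add": 0x0, "sub": 0x1, "mul": 0x2, "div": 0x3}  # made-up opcode numbers


def encode(op, dst, src1, src2):
    """Return (offset, data): where to store, and which byte to store."""
    for field in (op, dst, src1, src2):
        if not 0 <= field <= 0xF:
            raise ValueError("each field must fit in 4 bits")
    return (op << 4) | dst, (src1 << 4) | src2


def decode(offset, data):
    """Recover (op, dst, src1, src2) as the FPU side would see them."""
    return offset >> 4, offset & 0xF, data >> 4, data & 0xF


# r3 = r1 * r2 becomes one store instruction on the 6502 side.
offset, data = encode(OPS["mul"], 3, 1, 2)
print(f"store ${data:02X} at base+${offset:02X}")
```

With this layout every operation costs one store, and the 6502's own address decoding does half of the instruction fetch for the FPU.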

Author:  kakemoms [ Tue Feb 21, 2017 1:51 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

This sounds very interesting. I have been thinking of a cut-down version of a 6502 core for simple math coprocessing, but maybe a simple FPU is the way to go. The question that usually arises when I try to draft something is how complex the math needs to be. I mean, even 3D graphics can usually cope with 16-bit integer math as long as the speed is there. But maybe a scalable FPU that can be compiled for 16-64 bits (according to what is needed) is the way to go? Even 16-bit FP numbers are a great tool for 3D, but sometimes one needs more accuracy.

The question then becomes: is such a thing possible to scale at all? Or would we end up with 3-4 different cores?

Author:  BigEd [ Tue Feb 21, 2017 2:10 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

I would think a parameterised design should be OK up to some width. Maybe up to 53 bits? See
https://www.xilinx.com/support/document ... _ds255.pdf
Notice how much slower a larger multiplier is.

The nice thing about 6502 is that it has very pedestrian floating point performance, so any acceleration is going to be a big improvement. It's probably not a very sensible starting point for number crunching in double precision, but everything depends on your application, or which itch you're scratching.

(Edit: I reckon 53 bits is not so much a limit, as a natural stopping off point, for double precision)
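
For anyone wondering where 53 comes from: an IEEE 754 double stores 52 fraction bits plus one implicit leading 1, so the significand multiplier for double precision is 53 x 53 bits. A quick check in Python, whose floats are doubles on the usual platforms:

```python
import sys

# Significand width of a Python float (an IEEE 754 double on common platforms):
# 52 stored fraction bits plus the implicit leading 1.
print(sys.float_info.mant_dig)

# Every integer up to 2**53 is exactly representable...
exact = int(float(2**53 - 1)) == 2**53 - 1

# ...but 2**53 + 1 needs a 54th bit, so it rounds back to 2**53.
rounds = float(2**53 + 1) == float(2**53)

print(exact, rounds)
```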

Author:  GARTHWILSON [ Tue Feb 21, 2017 8:23 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

kakemoms wrote:
The question that usually arises when I try to draft something is how complex the math needs to be. I mean, even 3D graphics can usually cope with 16-bit integer math as long as the speed is there.

Like a 16-bit scaled-integer fast math coprocessor: Large downloadable tables (in Intel Hex files) with every single answer already pre-calculated, requiring no interpolation:
http://wilsonminesco.com/16bitMathTables/
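
As a rough illustration of the idea (not the actual layout or scaling of the tables at that link, which this sketch does not try to reproduce), a sine table with one entry for every possible 16-bit input could be built and used like this in Python. The angle unit of 65536 steps per circle and the output scale of 32767 are assumptions.

```python
import math

# Hypothetical table: index is an angle with 65536 units per full circle,
# value is sin(angle) scaled to a signed 16-bit integer (x 32767).
SINE = [round(math.sin(2 * math.pi * i / 65536) * 32767) for i in range(65536)]


def sin16(angle):
    """Look up the sine of a 16-bit angle; no arithmetic, no interpolation."""
    return SINE[angle & 0xFFFF]


print(sin16(0x0000), sin16(0x4000), sin16(0xC000))
```

On real hardware the table would sit in ROM or RAM and the lookup is a single indexed read, which is where the speed comes from.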

Author:  BigDumbDinosaur [ Tue Feb 21, 2017 9:06 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

kakemoms wrote:
The question that usually arises when I try to draft something is how complex the math needs to be.

From my own experience in writing assembly language software (which now spans some 47 years), the number of times in which I needed complex, floating point math was relatively small. Most of the math I've done has been 16- or 32-bit integer stuff, almost always four-function math.

Author:  BigEd [ Tue Feb 21, 2017 9:13 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

Part of me is attracted to scaled integers - perhaps it's the slide rule enthusiast in me. But if you look at the history of computing, fixed point was a hindrance to getting things done, and a source of bugs, and floating point was quite a step forward - even though it has many wrinkles and surprises in it.

Of course, I agree with BDD, that much of the time integers are all you need. So much depends on what you're doing with the machine.

Author:  GARTHWILSON [ Tue Feb 21, 2017 11:54 pm ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

BigEd wrote:
But if you look at the history of computing, fixed point was a hindrance to getting things done, and a source of bugs, and floating point was quite a step forward

I assume you did mean "scaled-integer," but I try to be careful to avoid saying "fixed-point," since that's a limited subset of scaled-integer.

Scaled-integer shifts some of the workload off of the computer and onto the programmer, improving the computer's performance but making programming slightly more challenging. Of course some situations really do require floating-point, but not as many as people think. I myself had to be convinced.
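
A minimal sketch of what that trade looks like, in Python for readability: the programmer decides ahead of time that a fractional factor is stored as an integer multiplied by 65536, and the machine then only does an integer multiply and keeps the high half of the product. The scale choice and the function names here are assumptions for illustration.

```python
SCALE = 65536  # programmer's choice: a factor f is stored as round(f * 65536)


def to_scaled(x):
    """Convert a fraction in [0, 1) to its 16-bit scaled-integer form."""
    return round(x * SCALE)


def scaled_mul(n, factor):
    """Multiply integer n by a scaled factor: 16x16 -> 32-bit product,
    then drop the low 16 bits to undo the scale."""
    return (n * factor) >> 16


factor = to_scaled(0.6)          # done once, by the programmer or assembler
print(factor, scaled_mul(1000, factor))
```

The computer never sees a decimal point; keeping track of what each integer means is the extra work that lands on the programmer.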

Author:  Rob Finch [ Wed Feb 22, 2017 1:05 am ]
Post subject:  Re: Notes on Weitek's memory-mapped FPUs

Quote:
The question then becomes: is such a thing possible to scale at all? Or would we end up with 3-4 different cores?

Quote:
I would think a parameterised design should be OK up to some width.

I think the easiest solution when multiple precision is required at the same time is to use multiple cores. Including a bunch of case logic for different precisions would slow down the arithmetic and may use just as many transistors as having multiple cores.
A single core could be used with parameters and instanced multiple times to get the desired precisions.
There are some parameterized FP components (add, sub, div, mul) in the FT816float project at opencores.org. No square root yet; I've not been able to get square root to work accurately. The parameters support many different FP formats: 16, 24, 32, 40, 64, 80, 128 bits, and others. Of course component testing needs to be addressed. The FP components are pipelinable. In order to use the components with a 6502, some sort of FPUnit interface is required.

All times are UTC