Some 65Org16 questions

ElEctric_EyE · Post by **ElEctric_EyE** » Sun Jul 08, 2012 4:54 pm

All I've done is experiment with writing to video. It seemed to work when I was writing zeroes, but more testing is needed. I was being hampered by my design, which is right on the 100MHz limit. Any small change to any part of the code, and I had to run smartXplorer for it to pass synthesis. I'm taking a brief time off from it to regather my thoughts.

Arlet · Post by **Arlet** » Sun Jul 08, 2012 4:58 pm

I recommend switching to a lower frequency, for easy synthesis and testing, and not worrying about maximum speed until the design is stable.

ElEctric_EyE · Post by **ElEctric_EyE** » Sun Jul 08, 2012 7:14 pm

Good advice. I should've known that. The fact I overlooked it proves that I need a rest. The .b core took a lot out of me. In a few weeks I'll be ready to dive back in refreshed. Hopefully these 100+ degree days I work in will be done with as well!

BigEd · Post by **BigEd** » Sun Jul 08, 2012 7:48 pm

Thanks for the clarification! Have a good break, too - you've earnt it!

bogax · Post by **bogax** » Wed Aug 08, 2012 7:14 pm

Somehow I missed this when it was posted.
I have a couple of comments.

GARTHWILSON wrote:

Miles J. wrote:

BigEd wrote:

A simple MUL can be dropped in, I think. Division seems to be difficult, in the sense that it's a multi-cycle operation and there's no drop-in hardware for it, so that's less likely unless someone pops up. A division-step instruction might be more likely.

Okay, no problem. For now I will use a little subroutine for division when needed. MUL would be more useful anyway, I guess (e.g. for calculating the pixel address inside a window with arbitrary size).

Don't forget you can use the big math tables at http://wilsonminesco.com/16bitMathTables/index.html, if you can spare the I/O for them, or load them into RAM. One of the math tables I provide is for inverting, so to divide, you can multiply by the inverse. The input number is 16 bits, and the inverse is 32-- not that you have to use all 32, but it lets you get 16-bit resolution and accuracy across the entire range.

For fast multiplication, you can speed it up with the multiplication table which goes to 255x255, or perhaps better, the table of squares which has 16-bit input and 32-bit output, and consider that:

(a+b)² = a² + b² + 2ab

so if you solve for a*b, the multiplication becomes:

ab = ( (a+b)² - a² - b² ) / 2

meaning it is reduced to an addition, three squarings (from the table), two subtractions, and a right shift.

These two particular tables are unsigned.
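A minimal Python sketch of the table-of-squares multiply quoted above, just to check the arithmetic. The names `SQUARES` and `mul_via_squares` are made up for illustration, and the table is built in memory rather than read from the actual tables on the site. One assumption worth noting: with a 16-bit table input, a+b has to fit in 16 bits as well, so the sketch rejects operands whose sum is too large.

```python
# Stand-in for a table of squares with 16-bit input and 32-bit output
# (hypothetical in-memory version, not the real ROM tables).
SQUARES = [x * x for x in range(1 << 16)]

def mul_via_squares(a, b):
    """Return a*b using ab = ((a+b)^2 - a^2 - b^2) / 2."""
    s = a + b
    if s >= len(SQUARES):
        # Assumption: the sum must still be a valid 16-bit table index.
        raise ValueError("a + b must fit in 16 bits")
    # One addition, three table lookups, two subtractions, one right shift.
    return (SQUARES[s] - SQUARES[a] - SQUARES[b]) >> 1
```

For example, `mul_via_squares(1234, 567)` gives the same result as `1234 * 567`.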

The quarter-square multiply (usually sourced from cbmhacking, don't remember which one)
and attributed to George Taylor (who attributed the idea to someone in Australia)

uses AB=((A+B)²-(A-B)²)/4

The really clever bit is that the A-B (as an addition) and the division by 4 are built into
the tables.
The A+B and the A-B are done by indirect indexing.
Once B and -B are set up in zp it's just a 16 bit subtraction (for an 8*8 multiply)
(so if B is constant the overhead of setting up B can be amortized over several multiplications)
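Here is a rough Python model of the quarter-square idea for 8-bit operands. It is only a sketch: the table names and the exact table layout are my assumptions, not the layout of any particular 6502 routine. `QUARTER` holds floor(x²/4), and `NEG_QUARTER` is the same data shifted by 255 so that A-B can be formed as the addition A + (B XOR $FF), which is one way the subtraction can be folded into a table.

```python
# floor(x*x/4) for every possible sum of two 8-bit values (0..510).
QUARTER = [x * x // 4 for x in range(511)]

# Same values, offset so index a + (255 - b) maps to floor((a-b)^2 / 4).
# (Hypothetical layout, chosen so the difference becomes an addition.)
NEG_QUARTER = [QUARTER[abs(i - 255)] for i in range(511)]

def mul8(a, b):
    """8x8 -> 16 bit multiply via ab = ((a+b)^2 - (a-b)^2) / 4."""
    plus = QUARTER[a + b]                  # index by a, base offset b
    minus = NEG_QUARTER[a + (b ^ 0xFF)]    # index by a, base offset ~b
    return plus - minus                    # the single 16-bit subtraction
```

The floor in the table does no harm because a+b and a-b always have the same parity, so the dropped quarter (if any) is the same in both lookups and cancels in the subtraction.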

For division I've been considering a modified Goldschmidt's algorithm.

roughly:

r=n*1/n   (8 bit reciprocal from a table)
p=(r-1)²  (presumably the squaring could be sped up with any
           quarter-square multiply squares table you happen to have handy)
e=2-r
r=e

iterate on
 (
 e=e*p    (e is an error term)
 r=r+e
 )
end up by scaling r by the 8 bit reciprocal from the table

r=r*1/n = reciprocal n

n*1/n ends up being close to 1
so (r-1)² has leading zeros you don't have to multiply
(8 bits of reciprocal may not be enough though)
e is an increasingly small error term, so lots of leading zeros there

but it would involve scaling
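To see that the steps above converge, here is a floating-point Python sketch. Everything about the setup is an assumption for illustration: n is taken as already scaled into 1 <= n < 2, the "8 bit reciprocal from a table" is simulated by computing a value rounded to 8 fractional bits from the top 8 bits of n, and the function name and iteration count are made up. A real implementation would be fixed point and would have to deal with the scaling mentioned above.

```python
def reciprocal(n, iterations=3):
    """Approximate 1/n for 1 <= n < 2 following the outline above."""
    idx = int(n * 128)              # top 8 bits of n, in 128..255
    t = round(32768 / idx) / 256    # simulated 8-bit table entry, roughly 1/n
    r = n * t                       # close to 1
    p = (r - 1) ** 2                # small, so many leading zeros
    e = 2 - r
    r = e
    for _ in range(iterations):
        e = e * p                   # error term shrinks by a factor p each pass
        r = r + e
    return r * t                    # scale by the table reciprocal

def divide(a, n):
    """Divide by multiplying with the approximate reciprocal."""
    return a * reciprocal(n)
```

Writing x = 1 - n*t, the loop is summing the series 1 + x + x² + ... = 1/(n*t), two more terms per pass, which is why the final multiply by t lands on 1/n.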

GARTHWILSON · Post by **GARTHWILSON** » Wed Aug 08, 2012 8:02 pm

Thanks. I haven't looked at the details yet, but I added a note referencing your post in the section of my website on hyperfast, accurate math with 16-bit look-up tables, in the descriptions page. (The 16-bit table of squares gives a 32-bit output.) When I was writing that up, I looked at a page someone here linked to, but in checking it out, found they had one of the signs in the equation wrong.
