Fast 2D rotation for 6510

8BIT · Post by **8BIT** » Thu Sep 26, 2002 10:39 pm

For anyone interested, here's the Quick Basic version of table-based 2D rotations.

There's the BASIC source file and the compiled version which rotated a test pattern around 360 degrees in 10 degree increments. It has a 1 second delay between rotations. You can press the <Space Bar> to speed it up.

I didn't have enough time to do it in assembly, but it should be a straight forward process.

Download the Zip'd file from my web site (I'll leave it there for 30 days).

(Right Click and select "save target as"):
http://65c02.tripod.com/2drot.zip

"Undo", I hope this helps. You'll only need 4K for the tables. I don't know how much for the code. 6K may have been a little large.

Daryl

yzoer · Post by **yzoer** » Mon Nov 11, 2002 6:59 pm

Hi Everyone,

I just had to get my $0.02 in and make this my first post

Steve Judd ( I think ) posted something a long long time ago which is a pretty good trade off between speed & accuracy; a 256 byte table is all you need + some trickery

Fast multiply: don't blink or you'll miss it.

Let f(x) = x^2 / 4. Then

a*b = f(a+b) - f(a-b)

Thus with a table of squares you can do integer multiplies very quickly.
To see an implementation of this algorithm, see C=Hacking.

check it out at www.ffd2.com for more info. I've used it successfully on the GBC and some C64 stuff a long time ago..

Yvo

65xxman · Post by **65xxman** » Wed Jan 08, 2003 9:50 am

Sounds like this thing.
http://www.cfxweb.net/modules.php?name= ... le&sid=664
Btw, there is a method based on shears,
12
34
----
12
34 (+x shear)
---
31
42 (+y shear)

and there is variations that rotate by other odd degrees like 22, read a paper once..
use something like a bresenham to get weird shears..
http://www.cfxweb.net/modules.php?name= ... =0&thold=0

kc5tja · Post by **kc5tja** » Wed Jan 08, 2003 7:09 pm

yzoer wrote:

I just had to get my $0.02 in and make this my first post

Steve Judd ( I think ) posted something a long long time ago which is a pretty good trade off between speed & accuracy; a 256 byte table is all you need + some trickery

You actually need a 1024-byte table. Here's why:

If a and b are 8-bits, then a+b is 9 bits ($FF+$FF = $1FE). Also, the square of an 8-bit number is a 16-bit number. So, 512 entries at 16-bits each yields 1024 bytes devoted to table data.

That being said, however, you can still use 8-bit arithmetic to address it using some of those "tricks" you've been talking about.

For example:

Code: Select all

   ma = $F0 ; input A
   mb = $F1 ; input B
   mapb = $F2 ; temporary: ((A+B)^2)/4; also holds results
   mamb = $f4 ; temporary: ((A-B)^2)/4

mul8x8:
   clc  ; Compute ((A+B)^2)/4
   lda ma
   adc mb
   bcs 1$
   tax
   lda sqtable00,x
   sta mapb
   lda sqtable01,x
   sta mapb+1
   jmp 2$
1$:
   tax
   lda sqtable10,x
   sta mapb
   lda sqtable11,x
   sta mapb+1

2$:
   sec  ; Compute ((A-B)^2)/4
   lda ma
   sbc mb
   bcc 3$
   tax
   lda sqtable00,x
   sta mamb
   lda sqtable01,x
   sta mamb+1
   jmp 4$
3$:
   tax
   lda sqtable10,x
   sta mamb
   lda sqtable11,x
   sta mamb+1

4$:
   sec  ; Subtract them to get AB.
   lda mapb
   sbc mamb
   sta mapb
   lda mapb+1
   sbc mamb+1
   sta mamb+1
   rts

The key here is the division of the lookup table's duties:

sqtable00 is the table of square's for sums 0-255, low byte.
sqtable01 is the table of square's for sums 0-255, high byte.
sqtable10 is the table of square's for sums 256-511, low byte.
sqtable11 is the table of square's for sums 256-511, high byte.

NOTE: For this to all fit in 16-bits or less, you must store the squares pre-divided by four. That is, the second entry for sqtable00 isn't 4, as you'd expect, it's 1.

Drop any fraction, since the fractional components would get subtracted out in the end anyway.

NOTE: This is an unsigned multiply. You need a wrapper around this multiplication code to handle signed multiplications. But this is true of most 6502 multiplication subroutines anyway.

The 65816 represents a more convenient package to do this type of multiplication with (note: this is my first attempt at coding 65816-version of the above code, so it might have some errors, and could probably stand more optimization):

Code: Select all

   ma = $F0   ; input A (byte!)
   mb = $F2   ; input B (byte!)
   mapb = $F4
   mamb = $F6

mul8x8_16:
   longa on   ; Assume accumulator is 16-bits wide,
   longi on   ; as are the index registers.

   sep #MF   ; But first, zero the high-bytes of the input parameters
   stz ma+1   ; guarantees the subroutine will work without additional
   stz mb+1   ; error handling.
   rep #MF

   sec   ; Compute ((A-B)^2)/4 first.
   lda ma
   sbc mb
   and #$01FF   ; AND in case the result is negative; confines result into valid index bounds
   asl a
   tax
   lda sqtable,x
   sta mamb

   clc   ; Compute ((A+B)^2)/4 next.  We do this here so we avoid having
   lda ma   ; to refetch its value from memory.  It's already in A, so we just...
   adc mb
   asl a   ; Note we don't need to AND for addition.  High bits guaranteed to be clear.
   tax
   lda sqtable,x

   sec
   sbc mamb   ;... subtract the other term, and A now has the product.
   rts

As usual, sqtable is a table of squares predivided by four. However, it's formatted as 512 entries of two bytes each -- still 1KB in size, but more "logical" to address.

Interestingly enough, a 16x16=32 multiplication subroutine for the 65816 would take on the same form as the mul8x8 routine for the 6502, with similar performances. But it would require a 512KB square table.

It's probably better to use mul8x8_16 four times and add partial products instead, or to use an early-out "traditional" multiplication routine.

-- algebraic derivation for those who don't believe --

ab = ((a+b)^2)/4 - ((a-b)^2)/4

ab = ((a+b)^2 - (a-b)^2)/4

ab = ((a^2+2ab+b^2) - (a^2-2ab+b^2))/4

ab = (a^2+2ab+b^2-a^2+2ab-b^2)/4

ab = (a^2-a^2+2ab+2ab+b^2-b^2)/4

ab = (4ab)/4

ab = ab

TMorita · Post by **TMorita** » Thu Jan 09, 2003 9:19 pm

Anonymous wrote:

I think I can give you a method using look-up tables and no multiplication/division if you can afford about 6k for the tables & code.

The basic idea is pre-calculate the new x,y position from the old one for each pixel. Since you are rotating about the center, you only have to calculate one quarter of the screen. The other three will be the same offsets, they'll just have different signs. Since you will be limiting the angles to 10,20,30, ect, only 8 tables would be needed. Each table will hold 2 bytes for each of the 256 pixels in the first quadrant (the new x and new y). This method will also eliminate the need to check to see if a pixel was rotated "off the screen".

Let me know if this method sounds practical.

Daryl

Forward mapping is messy. You wind up overwriting pixels multiple times.

A better solution is to use reverse mapping. This only requires calculating four rotations for the corners, then it's linear interpolation for the rest of the pixels.

Also, this gives you scaling for free.

Toshi

Fast 2D rotation for 6510

2D Rotation using tables

2d rotations

Re: 2d rotations