The algorithm in question is here:
http://6502org.wikidot.com/software-math-fastdiv
It is the handiwork of our resident guru dclxvi (Bruce), but I am too dense to figure out how it works from the cryptic explanation and non-existent comments. For the 8-bit version, I am supposed to call this routine with a 39-bit number stored in R, and the quotient quickly appears in the accumulator, with amazing efficiency.
Very impressive ... but ...
My question to all is "how am I supposed to retrieve a quotient from a single number?" Don't I have to provide two numbers, a dividend and a divisor? Or is one of them assumed to be a fixed value? In either case, how do I encode the input(s)? If any of you can help me figure out how it works, I would like to expand it to 32-bits for my 65m32.
Thanks to all,
Mike
I need some help understanding how to use this algorithm.
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
I need some help understanding how to use this algorithm.
Last edited by barrym95838 on Tue May 20, 2014 6:23 am, edited 1 time in total.
-
White Flame
- Posts: 704
- Joined: 24 Jul 2012
Re: I need some help understanding how to use this algorithm
It looks like a reciprocal, not division.
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: I need some help understanding how to use this algorithm
White Flame wrote:
It looks like a reciprocal, not division.
Mike
Re: I need some help understanding how to use this algorithm
Multiplying by a power of 2 is easy, just use N ASLs. Multiplying by, e.g. 10, isn't much harder:
To use a different multiplier, e.g. 13, you must change the code (again, untested):
Likewise multiply-by-31 can be implemented as a*32-a. Some mulitpliers are more convenient than others. 10 works out great, whether you're multiplying an 8-bit, 16-bit, or 32-bit number, because 10 has only 2 bits that are ones. On the other hand, using this technique to multiply, e.g., a 32-bit number by $A9D6 may not be worth it because there are a whole bunch of one bits in $A9D6.
Calculating the remainder when the divisor is a power of 2 is also easy, e.g. mod 8 is AND #7. The analogous technique when the divisor is not a power of 2 is here:
http://6502org.wikidot.com/software-math-fastmod
Likewise, the routine must be changed when a different divisor is used, and some divisors will be more convenient than others.
Likewise, calculating the quotient when the divisor is a power of 2 is easy, just use N LSRs. The routine linked in the original post is the analog for calculating quotients when the divisor is not a power of 2. Again, the routine must be changed for a different divisor and some divisors will be more convenient than others.
Remember repeating decimals back when you learned them in school? e.g.:
i.e. 1/13 = (7/100 + 6/1000 + 9/10000 + 2/100000 + 3/1000000) + ...
In base 2:
.1 = 1/2
.01 = 1/4
.001 = 1/8
etc.,
i.e. 1/13 = (1/16 + 1/128 + 1/256 + 1/512 + 1/2048 + 1/4096) + ...
So the basic technique is to shift right and add. There are a couple of complications, however. First, there are an infinite number of terms, so you have to determine how many terms you need. You'll need more terms to divide a 32-bit number by 13 than to divide an 8-bit number by 13. (This differs from multiplication by 13, where there are always 3 terms, N*8 + N*4 + N, regardless of whether N is 8, or 16, or 32 bits.)
Second, since you are shifting right, you are losing the least significant bit as it gets shifted out of the register/variable. Thus, there are three possible things to precede the ADC: a CLC, a SEC, or nothing (i.e. use the bit you just shifted out as the carry). Here is the 8-bit routine for divide by 13:
Here is the 8-bit routine for divide-by-10:
Code: Select all
STA TMP
ASL
ASL ; a*4
CLC
ADC TMP ; a*4+a = a*5
ASL ; (a*4+a)*2 = a*10
Code: Select all
STA TMP
ASL
CLC
ADC TMP ; a*2+a = a*3
ASL
ASL ; (a*2+a)*4 = a*12
CLC
ADC TMP ; (a*2+a)*4+a = a*13
Calculating the remainder when the divisor is a power of 2 is also easy, e.g. mod 8 is AND #7. The analogous technique when the divisor is not a power of 2 is here:
http://6502org.wikidot.com/software-math-fastmod
Likewise, the routine must be changed when a different divisor is used, and some divisors will be more convenient than others.
Likewise, calculating the quotient when the divisor is a power of 2 is easy, just use N LSRs. The routine linked in the original post is the analog for calculating quotients when the divisor is not a power of 2. Again, the routine must be changed for a different divisor and some divisors will be more convenient than others.
Remember repeating decimals back when you learned them in school? e.g.:
Code: Select all
______
1/13 = .076923
In base 2:
.1 = 1/2
.01 = 1/4
.001 = 1/8
etc.,
Code: Select all
____________
1/13 = .000100111011
So the basic technique is to shift right and add. There are a couple of complications, however. First, there are an infinite number of terms, so you have to determine how many terms you need. You'll need more terms to divide a 32-bit number by 13 than to divide an 8-bit number by 13. (This differs from multiplication by 13, where there are always 3 terms, N*8 + N*4 + N, regardless of whether N is 8, or 16, or 32 bits.)
Second, since you are shifting right, you are losing the least significant bit as it gets shifted out of the register/variable. Thus, there are three possible things to precede the ADC: a CLC, a SEC, or nothing (i.e. use the bit you just shifted out as the carry). Here is the 8-bit routine for divide by 13:
Code: Select all
STA TMP
LSR
LSR ; a/4
ADC TMP ; a/4 + a
ROR ; a/8 + a/2
ADC TMP ; a/8 + a/2 + a
ROR ; a/16 + a/4 + a/2
ADC TMP ; a/16 + a/4 + a/2 + a
ROR
LSR
LSR ; a/128 + a/32 + a/16 + a/8
SEC
ADC TMP ; a/128 + a/32 + a/16 + a/8 + a
ROR
LSR
LSR
LSR ; a/2048 + a/512 + a/256 + a/128 + a/16
Code: Select all
; ____
; 1/10 = .1 decimal = .00011 binary
; = (1/16 + 1/32) + (1/256 + 1/512) + (1/4096 + 1/8192) + ...
STA TMP
LSR
LSR
LSR ; a/8
ADC TMP ; a/8 + a
ROR ; a/16 + a/2
ADC TMP ; a/16 + a/2 + a
ROR
LSR
LSR ; a/128 + a/16 + a/8
ADC TMP ; a/128 + a/16 + a/8 + a
ROR ; a/256 + a/32 + a/16 + a/2
SEC
ADC TMP ; a/256 + a/32 + a/16 + a/2 + a
ROR
LSR
LSR
LSR ; a/4096 + a/512 + a/256 + a/32 + a/16
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: I need some help understanding how to use this algorithm
Thanks for responding, dclxvi!
I can understand what you're saying here (above), and I can even see how the bit pattern of the reciprocal is reflected in your nicely commented examples. I don't grok how and where you determined the need for the SECs, but I can definitely see the big picture.
What I can't understand is how to make proper use of this (from the linked page):
My tired old brain can't take the next step in understanding what you're doing in this code, at least not without some coaching. Could I impose on you to walk me (us?) through a sample case or two, showing the relationship between the input and output of this little gem? I can feed it a bit pattern and retrieve the result, but I don't know how to relate them to actual numbers in a meaningful or useful way. For example, if I feed it 39 zeros, what does that mean, and what does the resulting bit pattern mean?
Thanks in advance,
Mike
I can understand what you're saying here (above), and I can even see how the bit pattern of the reciprocal is reflected in your nicely commented examples. I don't grok how and where you determined the need for the SECs, but I can definitely see the big picture.
What I can't understand is how to make proper use of this (from the linked page):
Code: Select all
ASL R3
LDA R4
ROL
SEC
PHA
ADC R0
PLA
BCC .1
INC R1
BNE .1
INC R2
BNE .1
INC R3
INC R3
BNE .1
ADC #0
.1
Thanks in advance,
Mike
Re: I need some help understanding how to use this algorithm
I'd argue that you should defer the division to retain
as much accuracy in the partial products as possible
1/10 = .000110011
(untested)
If you start by dividing by 8
You've thrown away three bits that could participate
For repeating decimals (binamals?) you can form a
partial product that repeats.
I don't see the point of setting the carry.
Rounding happens automatically if you don't
clear the carry.
It maybe be possible to get by with fewer partial
products by a judicious use of sec but I think
that would be rare.
as much accuracy in the partial products as possible
1/10 = .000110011
(untested)
Code: Select all
sta temp 1
lsr .1
adc temp 1.1
ror .11
lsr
lsr
lsr .00011
adc temp 1.00011
ror .100011
adc temp 1.100011
ror
lsr
lsr
lsr
lsr .0001100011
You've thrown away three bits that could participate
For repeating decimals (binamals?) you can form a
partial product that repeats.
Code: Select all
sta temp 1
lsr
adc temp 1.1
ror
sta temp .11
lsr
lsr
lsr
lsr .000011
adc tmp .110011
ror
lsr
lsr .000110011
Rounding happens automatically if you don't
clear the carry.
It maybe be possible to get by with fewer partial
products by a judicious use of sec but I think
that would be rare.
Re: I need some help understanding how to use this algorithm
barrym95838 wrote:
For example, if I feed it 39 zeros, what does that mean, and what does the resulting bit pattern mean?
8-bit version:
Code: Select all
; compute floor ($123456789A / $007FFFFFFF)
;
LDA #$9A ; lsb of numerator
STA R0
LDA #$78
STA R1
LDA #$56
STA R2
LDA #$34
STA R3
LDA #$12 ; msb of numerator
STA R4
JSR DIV
JSR OUTHEX ; output quotient
Code: Select all
; compute floor ($123456789ABC / $00007FFFFFFF)
;
LDA #$BC ; lsb of numerator
STA R0
LDA #$9A
STA R1
LDA #$78
STA R2
LDA #$56
STA R3
LDA #$34
STA R4
LDA #$12 ; msb of numerator
STA R5
JSR DIV
LDA R5 ; msb of quotient
JSR OUTHEX
LDA R4 ; lsb of quotient
JSR OUTHEX
bogax wrote:
If you start by dividing by 8
You've thrown away three bits that could participate
You've thrown away three bits that could participate
Code: Select all
LDX #0
STX QUOT
STX REM
STX MAXERROR
LDY #0
@A TXA
JSR DIV10
CMP QUOT
BEQ @B
JSR OUTNE
@B INC REM ; increment the remainder
LDA REM ; every 10th time, reset the remainder and increment the quotient
CMP #10
BCC @C
LDA #0
STA REM
INC QUOT
@C INX
BNE @A
LDA MAXERROR ; output maximum error
JSR OUTHEX
RTS
OUTHEXSP
JSR OUTHEX
LDA #$20
JMP OUTPUT
GETERROR ; compute abs (actual quotient - expected quotient)
SEC
SBC QUOT
BCS @A
EOR #$FF ; compute -A (carry is clear here)
ADC #1
@A RTS
STOREERROR
CMP MAXERROR
BCC @A
STA MAXERROR
@A RTS
OUTNE
PHA
TYA
JSR OUTHEXSP ; output error count
TXA
JSR OUTHEXSP ; output numerator
PLA
PHA
JSR OUTHEXSP ; output actual quotient
LDA QUOT
JSR OUTHEXSP ; output expected quotient
PLA
JSR GETERROR
JSR STOREERROR
JSR OUTHEX ; output error, i.e. abs (actual quotient - expected quotient)
INY ; increment error count
JMP OUTCRLF
MAXERROR .ds 1
QUOT .ds 1
REM .ds 1
bogax wrote:
For repeating decimals (binamals?) you can form a
partial product that repeats.
partial product that repeats.
bogax wrote:
It maybe be possible to get by with fewer partial
products by a judicious use of sec but I think
that would be rare.
products by a judicious use of sec but I think
that would be rare.
Re: I need some help understanding how to use this algorithm
I see Omegamatrix has posted his compilation of division by a constant routines to nesdev