Page 1 of 2
Daud Zoss' unusual multiplication routine
Posted: Sat Dec 22, 2012 7:31 pm
by BigEd
This multiplication routine takes constant time, and uses no memory other than the stack. More detail
here.
Let me begin by saying that this is not the correct way to do multiplication on the 6502
Code: Select all
; unsigned multiply by Daud Zoss
; as seen at http://swissembly.wordpress.com/2012/12/15/8-bit-multiply-in-place-on-6502/
;
; parameters in X and Y
; result (low byte only) in X
um8:
pha
tya
pha
txa
tsx
ldy #$08
loop1:
pha
dex
asl
dey
bne loop1
lda $0109,x
ldy #$08
loop2:
asl
bcs skip1
pha
lda #$00
sta $0101,x
pla
skip1:
inx
dey
bne loop2
ldx #$00
ldy #$08
loop3:
txa
tsx
clc
adc $0101,x
tax
pla
dey
bne loop3
pla
tay
pla
I've tested this on easy6502 and seems to work. I haven't got my head around it!
Cheers
Ed
Re: Daud Zoss' unusual multiplication routine
Posted: Sat Dec 22, 2012 7:58 pm
by Arlet
It's made from 3 loops. The first loop writes 8 values to the stack: x, 2*x, 4*x, 8*x, ... 128*x. The second loop checks 'y' for bits that are cleared. For each cleared bit, it zeroes out the corresponding 2^n*x value in the table. In the 3rd loop, the 8 values in the table are summed.
Re: Daud Zoss' unusual multiplication routine
Posted: Sat Dec 22, 2012 8:21 pm
by BigEd
Thanks for the explanation!
Re: Daud Zoss' unusual multiplication routine
Posted: Sat Dec 22, 2012 10:17 pm
by Klaus2m5
Inspired by above code. Less code bytes, less stack usage, less cycles = less everything.
Code: Select all
; unsigned multiply using stack as operand storage - 6502.org style
;
; parameters in X and Y
; result (low byte only) in X
um8: pha ;save a, y is not altered
tya ;move operands to stack
pha
txa
eor #$ff ;invert carry of op1
pha
lda #0 ;init result
tsx ;get index to operands
loop: lsr $101,x ;test op1 2^0 to 2^7
bcs skip
adc $102,x ;result =+ op2
skip: asl $102,x ;op2 =* 2
bne loop ;until nothing more to add
tax ;move result to x
pla ;discard operands
pla
pla ;restore a
rts
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 3:03 am
by GARTHWILSON
Am I missing something, or could you omit the EOR #$FF and change the BCS to BCC like the following? (I like to apply the structure macros to make it a little more clear
Code: Select all
; unsigned multiply using stack as operand storage - 6502.org style
;
; parameters in X and Y
; result (low byte only) in X
um8: PHA ; Save A. Y is not altered. A is saved
; only in case the calling routine needed it.
PHY ; Move operands to stack.
PHX ; Now SP+2 has one multiplicand and SP+1 has the other.
LDA #0 ; Init result.
TSX ; Get index to operands.
BEGIN
LSR $101,X ; Test op1 2^0 to 2^7.
IF_CARRY_SET
ADC $102,X ; Result =+ op2
END_IF
ASL $102,X ; op2 =* 2
UNTIL_ZERO ; until nothing more to add.
TAX ; Move result to X.
PLA ; Discard operands.
PLA
PLA ; Restore A.
RTS
;----------------
Actually this is a pretty standard way to do multiplication, except maybe the part about putting the inputs on the hardware stack.
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 3:34 am
by GARTHWILSON
This multiplication routine takes constant time, and uses no memory other than the stack. More detail
here.
Let me begin by saying that this is not the correct way to do multiplication on the 6502
It is definitely a very curious way to do it!
Here's my processing of the original, using FOR_Y...NEXT_Y which I'll be adding to the structure-macro pages. I reduced it some by using CMOS instructions also. One series that was especially more efficient was the PHA LDA#0 STA addr,X PLA (ie, four instructions) being replaced by just STZ addr,X (a single instruction).
Code: Select all
; unsigned multiply by Daud Zoss
; as seen at http://swissembly.wordpress.com/2012/12/15/8-bit-multiply-in-place-on-6502/
;
; parameters in X and Y
; result (low byte only) in X
um8: PHA
PHY
TXA
TSX
FOR_Y 8, DOWN_TO, 0
PHA
DEX
ASL
DEY
NEXT_Y
LDA $109,X
FOR_Y 8, DOWN_TO, 0
ASL
IF_CARRY_CLR
STZ $101,X
END_IF
INX
DEY
NEXT_Y
LDX #0
FOR_Y 8, DOWN_TO, 0
TXA
TSX
CLC
ADC $101,X
TAX
PLA
NEXT_Y
PLY
PLA
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 4:22 am
by MichaelM
I believe Klaus' EOR is used to avoid the CLC required before the ADC. When C is not set the ADC is performed, otherwise just a shift is performed. Saves 2 cycles per loop.
Code: Select all
loop: lsr $101,x ;test op1 2^0 to 2^7
bcs skip
adc $102,x ;result =+ op2
skip: asl $102,x ;op2 =* 2
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 4:52 am
by GARTHWILSON
Here's another way to multiply, and a very strange way, the Russian peasants' multiplication (see
http://mathforum.org/dr.math/faq/faq.peasant.html). I have not tested it.
Code: Select all
ORG $0300
LDA $0340
STA 00 ; Put 1st # in ZP reg 00
LDA $0341
STA 01 ; and 2nd # in ZP reg 01
STZ 02 ; Zero-out ZP reg's 02, 03, and 04.
STZ 03
STZ 04
BEGIN
LDA 00 ; Get the 1st #.
WHILE_NOT_ZERO ; As long as there's something left in it,
AND #1 ; see if it's odd.
IF_NOT_ZERO ; If it is odd,
CLC
LDA 01
ADC 03 ; add the 2nd # low byte to ZP reg 03,
STA 03
LDA 02 ; and the 2nd # high byte to reg 04.
ADC 04 ; (The only way this high byte gets anything
STA 04 ; is if it gets rotated into it below.)
THEN
LSR 00 ; Halve the 1st # and truncate the fractional part.
ASL 01 ; Double the 2nd # low byte,
ROL 02 ; and high byte with carry.
REPEAT
RTS
;----------------
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 4:54 am
by GARTHWILSON
I believe Klaus' EOR is used to avoid the CLC required before the ADC. When C is not set the ADC is performed, otherwise just a shift is performed. Saves 2 cycles per loop.
Code: Select all
loop: lsr $101,x ;test op1 2^0 to 2^7
bcs skip
adc $102,x ;result =+ op2
skip: asl $102,x ;op2 =* 2
Ah, of course.

Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 11:05 am
by Klaus2m5
Actually this is a pretty standard way to do multiplication, except maybe the part about putting the inputs on the hardware stack.
True, since standards usually describe the most resource efficient way to achieve something.
The swiss guy that wrote the original um8 code snippet wanted to familiarize himself with the direct use of the stack. So it serves more a learning purpose than a practical one.
The Russian peasants' multiply code snippet you posted is the most practical code, as it caters for a 16 bit result and can easily be expanded to any operand and result size that one would need.
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 12:01 pm
by Tor
Thanks for the link, a good read. The method is not so strange really, it's just a refined version of what many people do when trying to multiply (or divide too) two numbers (particularly when not using paper): Make it easier by doubling one of the values and half the other, to see if those numbers are simpler to handle. The easiest example would be something like 'what's 20*40? Well, it's 10*80 so must be 800.'
-Tor
Re: Daud Zoss' unusual multiplication routine
Posted: Sun Dec 23, 2012 5:58 pm
by bogax
I like
WhiteFlame's better.
Code: Select all
;------------------------
; 8bit * 8bit = 16bit multiply
; Multiplies "num1" by "num2" and stores result in .A (low byte, also in .X) and .Y (high byte)
; uses extra zp var "num1Hi"
; .X and .Y get clobbered. Change the tax/txa and tay/tya to stack or zp storage if this is an issue.
; idea to store 16-bit accumulator in .X and .Y instead of zp from bogax
; In this version, both inputs must be unsigned
; Remove the noted line to turn this into a 16bit(either) * 8bit(unsigned) = 16bit multiply.
lda #$00
tay
sty num1Hi ; remove this line for 16*8=16bit multiply
beq enterLoop
doAdd:
clc
adc num1
tax
tya
adc num1Hi
tay
txa
loop:
asl num1
rol num1Hi
enterLoop: ; accumulating multiply entry point (enter with .A=lo, .Y=hi)
lsr num2
bcs doAdd
bne loop
; 26 bytes
Re: Daud Zoss' unusual multiplication routine
Posted: Thu Dec 27, 2012 2:34 pm
by White Flame
The OP's seems to be 8x8=8 bit, so my
original one is even more applicable:
Code: Select all
; General 8bit * 8bit = 8bit multiply
; Multiplies "num1" by "num2" and returns result in .A
; by White Flame (aka David Holz) 20030207
; Input variables:
; num1 (multiplicand)
; num2 (multiplier), should be small for speed
; Signedness should not matter
; .X and .Y are preserved
; num1 and num2 get clobbered
; Instead of using a bit counter, this routine ends when num2 reaches zero, thus saving iterations.
lda #$00
beq enterLoop
doAdd:
clc
adc num1
loop:
asl num1
enterLoop: ;For an accumulating multiply (.A = .A + num1*num2), set up num1 and num2, then enter here
lsr num2
bcs doAdd
bne loop
end:
; 15 bytes
However, that EOR trick is pretty cool. I'll use that in the future.
Re: Daud Zoss' unusual multiplication routine
Posted: Mon Mar 25, 2013 7:43 pm
by BigEd
I've just read that early Wang calculators performed multiplication and division by computing logs and antilogs - an electronic slide rule! And yet they offered 10 displayed digits of precision. The earlier models didn't round results, which led to potentially confusing near-misses:
"An example would be multiplying 6 by 8. The LOCI-2 would give an answer of 47.99999999"
(
http://www.oldcalculatormuseum.com/wang360.html)
For a description of how a few repeated adds and subtracts can compute a logarithm or antilogarithm, see the patent (granted 1968):
http://ed-thelen.org/comp-hist/Wang-patent.html
Cheers
Ed
Re: Daud Zoss' unusual multiplication routine
Posted: Tue Mar 26, 2013 8:14 pm
by BigDumbDinosaur
I've just read that early Wang calculators performed multiplication and division by computing logs and antilogs - an electronic slide rule! And yet they offered 10 displayed digits of precision. The earlier models didn't round results, which led to potentially confusing near-misses:
"An example would be multiplying 6 by 8. The LOCI-2 would give an answer of 47,999 999.99"
(
http://www.oldcalculatormuseum.com/wang360.html)
For a description of how a few repeated adds and subtracts can compute a logarithm or antilogarithm, see the patent (granted 1968):
http://ed-thelen.org/comp-hist/Wang-patent.html
Cheers
Ed
Wang also used log and anti-log multiplication and division in their MVP BASIC. MVP BASIC also worked to 10 significant digits, a curious limitation, given that the MVP series of minicomputers was targeted to businesses, where that many significant digits was probably too small. Most MAI-compatible Business BASIC implementations (e.g., Thoroughbred, BBx, Uni-Basic, etc.) offer up to 16 significant digits.