Daud Zoss' unusual multiplication routine

Programming the 6502 microprocessor and its relatives in assembly and other languages.
User avatar
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England
Contact:

Daud Zoss' unusual multiplication routine

Post by BigEd »

This multiplication routine takes constant time, and uses no memory other than the stack. More detail here.
Quote:
Let me begin by saying that this is not the correct way to do multiplication on the 6502

Code: Select all

; unsigned multiply by Daud Zoss
; as seen at http://swissembly.wordpress.com/2012/12/15/8-bit-multiply-in-place-on-6502/
;
; parameters in X and Y
; result (low byte only) in X

um8:
 pha
 tya
 pha
 txa
 tsx
 ldy #$08
loop1:
 pha
 dex
 asl
 dey
 bne loop1
 lda $0109,x
 ldy #$08
loop2:
 asl
 bcs skip1
 pha
 lda #$00
 sta $0101,x 
 pla
skip1: 
 inx
 dey
 bne loop2
 ldx #$00
 ldy #$08
loop3:
 txa
 tsx
 clc
 adc $0101,x
 tax
 pla
 dey
 bne loop3
 pla
 tay
 pla
I've tested this on easy6502 and seems to work. I haven't got my head around it!

Cheers
Ed
Last edited by BigEd on Sun Dec 23, 2012 11:30 am, edited 1 time in total.
User avatar
Arlet
Posts: 2353
Joined: 16 Nov 2010
Location: Gouda, The Netherlands
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by Arlet »

It's made from 3 loops. The first loop writes 8 values to the stack: x, 2*x, 4*x, 8*x, ... 128*x. The second loop checks 'y' for bits that are cleared. For each cleared bit, it zeroes out the corresponding 2^n*x value in the table. In the 3rd loop, the 8 values in the table are summed.
User avatar
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by BigEd »

Thanks for the explanation!
Klaus2m5
Posts: 442
Joined: 28 Jul 2012
Location: Wiesbaden, Germany

Re: Daud Zoss' unusual multiplication routine

Post by Klaus2m5 »

Inspired by above code. Less code bytes, less stack usage, less cycles = less everything.

Code: Select all

; unsigned multiply using stack as operand storage - 6502.org style
;
; parameters in X and Y
; result (low byte only) in X

um8:    pha         ;save a, y is not altered

        tya         ;move operands to stack
        pha
        txa
        eor #$ff    ;invert carry of op1
        pha
        lda #0      ;init result
        tsx         ;get index to operands
loop:   lsr $101,x  ;test op1 2^0 to 2^7
        bcs skip
        adc $102,x  ;result =+ op2
skip:   asl $102,x  ;op2 =* 2
        bne loop    ;until nothing more to add
        tax         ;move result to x
        pla         ;discard operands
        pla
        
        pla         ;restore a
        rts
6502 sources on GitHub: https://github.com/Klaus2m5
User avatar
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by GARTHWILSON »

Am I missing something, or could you omit the EOR #$FF and change the BCS to BCC like the following? (I like to apply the structure macros to make it a little more clear :)

Code: Select all

; unsigned multiply using stack as operand storage - 6502.org style
;
; parameters in X and Y
; result (low byte only) in X

um8:  PHA                  ; Save A.  Y is not altered.  A is saved
                           ; only in case the calling routine needed it.
      PHY                  ; Move operands to stack.
      PHX                  ; Now SP+2 has one multiplicand and SP+1 has the other.
      LDA  #0              ; Init result.
      TSX                  ; Get index to operands.
      BEGIN
          LSR  $101,X      ; Test op1 2^0 to 2^7.
          IF_CARRY_SET
              ADC  $102,X  ; Result =+ op2
          END_IF
          ASL  $102,X      ; op2 =* 2
      UNTIL_ZERO           ; until nothing more to add.
      TAX                  ; Move result to X.
      PLA                  ; Discard operands.
      PLA

      PLA                  ; Restore A.
      RTS
 ;----------------
Actually this is a pretty standard way to do multiplication, except maybe the part about putting the inputs on the hardware stack.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
User avatar
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by GARTHWILSON »

BigEd wrote:
This multiplication routine takes constant time, and uses no memory other than the stack. More detail here.
Quote:
Let me begin by saying that this is not the correct way to do multiplication on the 6502
It is definitely a very curious way to do it!

Here's my processing of the original, using FOR_Y...NEXT_Y which I'll be adding to the structure-macro pages. I reduced it some by using CMOS instructions also. One series that was especially more efficient was the PHA LDA#0 STA addr,X PLA (ie, four instructions) being replaced by just STZ addr,X (a single instruction).

Code: Select all

; unsigned multiply by Daud Zoss
; as seen at http://swissembly.wordpress.com/2012/12/15/8-bit-multiply-in-place-on-6502/
;
; parameters in X and Y
; result (low byte only) in X

um8:   PHA
       PHY
       TXA
       TSX
       FOR_Y  8,  DOWN_TO,  0
           PHA
           DEX
           ASL
           DEY
       NEXT_Y

       LDA  $109,X
       FOR_Y  8,  DOWN_TO,  0
           ASL
           IF_CARRY_CLR
               STZ  $101,X
           END_IF
           INX
           DEY
       NEXT_Y

       LDX #0
       FOR_Y  8, DOWN_TO,  0
           TXA
           TSX
           CLC
           ADC  $101,X
           TAX
           PLA
       NEXT_Y
       PLY
       PLA
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
User avatar
MichaelM
Posts: 761
Joined: 23 Apr 2012
Location: Huntsville, AL

Re: Daud Zoss' unusual multiplication routine

Post by MichaelM »

I believe Klaus' EOR is used to avoid the CLC required before the ADC. When C is not set the ADC is performed, otherwise just a shift is performed. Saves 2 cycles per loop.

Code: Select all

loop:   lsr $101,x  ;test op1 2^0 to 2^7
        bcs skip
        adc $102,x  ;result =+ op2
skip:   asl $102,x  ;op2 =* 2
Michael A.
User avatar
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by GARTHWILSON »

Here's another way to multiply, and a very strange way, the Russian peasants' multiplication (see http://mathforum.org/dr.math/faq/faq.peasant.html). I have not tested it.

Code: Select all

     ORG  $0300

     LDA  $0340
     STA  00          ; Put 1st # in ZP reg 00

     LDA  $0341
     STA  01          ; and 2nd # in ZP reg 01

     STZ  02          ; Zero-out ZP reg's 02, 03, and 04.
     STZ  03
     STZ  04

     BEGIN
        LDA  00       ; Get the 1st #.
     WHILE_NOT_ZERO   ; As long as there's something left in it,
        AND  #1       ; see if it's odd.
        IF_NOT_ZERO   ; If it is odd,
           CLC
           LDA  01
           ADC  03    ; add the 2nd # low byte to ZP reg 03,
           STA  03

           LDA  02    ; and the 2nd # high byte to reg 04.
           ADC  04    ; (The only way this high byte gets anything
           STA  04    ; is if it gets rotated into it below.)
        THEN

        LSR  00       ; Halve  the 1st # and truncate the fractional part.
        ASL  01       ; Double the 2nd # low byte,
        ROL  02       ; and high byte with carry.
     REPEAT

     RTS
 ;----------------
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
User avatar
GARTHWILSON
Forum Moderator
Posts: 8775
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by GARTHWILSON »

MichaelM wrote:
I believe Klaus' EOR is used to avoid the CLC required before the ADC. When C is not set the ADC is performed, otherwise just a shift is performed. Saves 2 cycles per loop.

Code: Select all

loop:   lsr $101,x  ;test op1 2^0 to 2^7
        bcs skip
        adc $102,x  ;result =+ op2
skip:   asl $102,x  ;op2 =* 2
Ah, of course. :oops:
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
Klaus2m5
Posts: 442
Joined: 28 Jul 2012
Location: Wiesbaden, Germany

Re: Daud Zoss' unusual multiplication routine

Post by Klaus2m5 »

GARTHWILSON wrote:
Actually this is a pretty standard way to do multiplication, except maybe the part about putting the inputs on the hardware stack.
True, since standards usually describe the most resource efficient way to achieve something.

The swiss guy that wrote the original um8 code snippet wanted to familiarize himself with the direct use of the stack. So it serves more a learning purpose than a practical one.

The Russian peasants' multiply code snippet you posted is the most practical code, as it caters for a 16 bit result and can easily be expanded to any operand and result size that one would need.
6502 sources on GitHub: https://github.com/Klaus2m5
Tor
Posts: 597
Joined: 10 Apr 2011
Location: Norway/Japan

Re: Daud Zoss' unusual multiplication routine

Post by Tor »

GARTHWILSON wrote:
Here's another way to multiply, and a very strange way, the Russian peasants' multiplication (see http://mathforum.org/dr.math/faq/faq.peasant.html).
Thanks for the link, a good read. The method is not so strange really, it's just a refined version of what many people do when trying to multiply (or divide too) two numbers (particularly when not using paper): Make it easier by doubling one of the values and half the other, to see if those numbers are simpler to handle. The easiest example would be something like 'what's 20*40? Well, it's 10*80 so must be 800.'

-Tor
bogax
Posts: 250
Joined: 18 Nov 2003

Re: Daud Zoss' unusual multiplication routine

Post by bogax »

I like WhiteFlame's better.

Code: Select all

;------------------------
; 8bit * 8bit = 16bit multiply
; Multiplies "num1" by "num2" and stores result in .A (low byte, also in .X) and .Y (high byte)
; uses extra zp var "num1Hi"

; .X and .Y get clobbered.  Change the tax/txa and tay/tya to stack or zp storage if this is an issue.
;  idea to store 16-bit accumulator in .X and .Y instead of zp from bogax

; In this version, both inputs must be unsigned
; Remove the noted line to turn this into a 16bit(either) * 8bit(unsigned) = 16bit multiply.

 lda #$00
 tay
 sty num1Hi  ; remove this line for 16*8=16bit multiply
 beq enterLoop

doAdd:
 clc
 adc num1
 tax

 tya
 adc num1Hi
 tay
 txa

loop:
 asl num1
 rol num1Hi
enterLoop:  ; accumulating multiply entry point (enter with .A=lo, .Y=hi)
 lsr num2
 bcs doAdd
 bne loop

; 26 bytes
White Flame
Posts: 704
Joined: 24 Jul 2012

Re: Daud Zoss' unusual multiplication routine

Post by White Flame »

The OP's seems to be 8x8=8 bit, so my original one is even more applicable:

Code: Select all

; General 8bit * 8bit = 8bit multiply
; Multiplies "num1" by "num2" and returns result in .A

; by White Flame (aka David Holz) 20030207

; Input variables:
;   num1 (multiplicand)
;   num2 (multiplier), should be small for speed
;   Signedness should not matter

; .X and .Y are preserved
; num1 and num2 get clobbered

; Instead of using a bit counter, this routine ends when num2 reaches zero, thus saving iterations.

 lda #$00
 beq enterLoop

doAdd:
 clc
 adc num1

loop:
 asl num1
enterLoop: ;For an accumulating multiply (.A = .A + num1*num2), set up num1 and num2, then enter here
 lsr num2
 bcs doAdd
 bne loop

end:

; 15 bytes
However, that EOR trick is pretty cool. I'll use that in the future.
User avatar
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by BigEd »

I've just read that early Wang calculators performed multiplication and division by computing logs and antilogs - an electronic slide rule! And yet they offered 10 displayed digits of precision. The earlier models didn't round results, which led to potentially confusing near-misses:
"An example would be multiplying 6 by 8. The LOCI-2 would give an answer of 47.99999999"
(http://www.oldcalculatormuseum.com/wang360.html)

For a description of how a few repeated adds and subtracts can compute a logarithm or antilogarithm, see the patent (granted 1968):
http://ed-thelen.org/comp-hist/Wang-patent.html

Cheers
Ed
User avatar
BigDumbDinosaur
Posts: 9428
Joined: 28 May 2009
Location: Midwestern USA (JB Pritzker’s dystopia)
Contact:

Re: Daud Zoss' unusual multiplication routine

Post by BigDumbDinosaur »

BigEd wrote:
I've just read that early Wang calculators performed multiplication and division by computing logs and antilogs - an electronic slide rule! And yet they offered 10 displayed digits of precision. The earlier models didn't round results, which led to potentially confusing near-misses:
"An example would be multiplying 6 by 8. The LOCI-2 would give an answer of 47,999 999.99"
(http://www.oldcalculatormuseum.com/wang360.html)

For a description of how a few repeated adds and subtracts can compute a logarithm or antilogarithm, see the patent (granted 1968):
http://ed-thelen.org/comp-hist/Wang-patent.html

Cheers
Ed

Wang also used log and anti-log multiplication and division in their MVP BASIC. MVP BASIC also worked to 10 significant digits, a curious limitation, given that the MVP series of minicomputers was targeted to businesses, where that many significant digits was probably too small. Most MAI-compatible Business BASIC implementations (e.g., Thoroughbred, BBx, Uni-Basic, etc.) offer up to 16 significant digits.
x86?  We ain't got no x86.  We don't NEED no stinking x86!
Post Reply