6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sat Apr 27, 2024 7:02 am

All times are UTC




Post new topic Reply to topic  [ 65 posts ]  Go to page Previous  1, 2, 3, 4, 5  Next
Author Message
PostPosted: Tue Mar 17, 2015 4:17 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
That's a nice improvement from your original UD/MOD attempt. Have you been able to measure the speed difference yet?

I know that you are going for speed as well as compactness, and I can't figure out how to beat Garth's improved FIG UM/MOD for speed, but if I was going for smallest size, I would probably reuse code for UD/MOD and UM/MOD like so (58 code bytes total, headers not included):
Code:
; - - - - - - - - - - - - - - - - - - - - - - - - - - -
; UM/MOD  ( ud u1 -- u2 u3 )
; Dividend ud is 32-bits unsigned.  Divisor u1,
; Remainder u2 and Quotient u3 are 16-bits unsigned
; Invalid inputs and/or outputs are silently ignored
; 1823 cycles (+ NEXT) best case (zero quotient)
; 2239 cycles (+ NEXT) worst case (zero divisor)
;
umslashmod:
        jsr  xudivmod   ; (do the dirty work)
        inx             ; NIP
        jmp  swap       ; SWAP

; - - - - - - - - - - - - - - - - - - - - - - - - - - -
; UD/MOD  ( ud1 u1 -- u2 ud2 )
; Dividend ud1 and Quotient ud2 are 32-bits unsigned
; Divisor u1 and Remainder u2 are 16-bits unsigned
; Invalid inputs and/or outputs are silently ignored
; 1856 cycles (+ NEXT) best case (zero quotient)
; 2272 cycles (+ NEXT) worst case (zero divisor)
;
udslashmod:
        jsr  xudivmod   ; (do the dirty work)
        jmp  dashrot    ; -ROT

; - - - - - - - - - - - - - - - - - - - - - - - - - - -
; Internal divmod routine used by UM/MOD and UD/MOD
; ( ud1 u1 -- ud2 u2 )
; ud1 is dividend, u1 is divisor
; ud2 is quotient, u2 is remainder
xudivmod:
        ldy  #32        ;[2] init loop counter
        lda  #0         ;[2] init partial remainder
        sta  N          ;[3]   in N:A (h:l)
xudm2:
        asl  stackl+1,x ;[6] dividend in NOS:3OS (h:l)
        rol  stackh+1,x ;[6]   is gradually replaced
        rol  stackl,x   ;[6]   with the quotient
        rol  stackh,x   ;[6]
        rol             ;[2] N:A is gradually replaced
        rol  N          ;[5]   with the remainder
        pha             ;[3]
        cmp  TOS        ;[3] TOS holds divisor
        lda  N          ;[3] partial remainder >= TOS?
        sbc  TOS+1      ;[3]
        bcc  xudm3      ;[3]*
        sta  N          ;[3]   yes: update the partial
        pla             ;[4]     remainder and set the
        sbc  TOS        ;[3]     low bit in the partial
        inc  stackl+1,x ;[6]     quotient
        .db  $c9        ;[2]*  cmp# naked opcode
xudm3:
        pla             ;[4]*
        dey             ;[2] loop 32 times
        bne  xudm2      ;[3]*
        sta  TOS        ;[3]
        ldy  N          ;[3]
        sty  TOS+1      ;[3]
        rts             ;[6]

I'm not going to win many speed contests that way, especially with UM/MOD, but I'm 51% confident that this way is going to have the smallest footprint. It's certainly not the smallest theoretically, because you could do repeated subtractions without shifts, but I think that my proposal gives good "bang for the byte", assuming you want to have UM/MOD and UD/MOD available to the user, with reasonable performance.

YMMV, untested, as usual.

Mike B.


Last edited by barrym95838 on Tue Mar 17, 2015 5:22 am, edited 1 time in total.

Top
 Profile  
Reply with quote  
PostPosted: Tue Mar 17, 2015 5:18 am 
Offline
User avatar

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
UD/MOD doesn't usually need to be super fast; but if it did, I would look for an efficient, dedicated way to divide by ten (ie, $0A), then test the input and route the divide-by-10 to that, route the divide-by-two and divide-by-16 to dedicated parts that just shift, and anything else to the normal routine (which will seldom get used).

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Top
 Profile  
Reply with quote  
PostPosted: Tue Mar 17, 2015 6:40 am 
Offline

Joined: Tue Jul 24, 2012 2:27 am
Posts: 672
All you need is to go Forth and multiply.

(I'll see myself out...)

_________________
WFDis Interactive 6502 Disassembler
AcheronVM: A Reconfigurable 16-bit Virtual CPU for the 6502 Microprocessor


Top
 Profile  
Reply with quote  
PostPosted: Tue Mar 17, 2015 9:15 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
GARTHWILSON wrote:
UD/MOD doesn't usually need to be super fast; but if it did, I would look for an efficient, dedicated way to divide by ten (ie, $0A), then test the input and route the divide-by-10 to that, route the divide-by-two and divide-by-16 to dedicated parts that just shift, and anything else to the normal routine (which will seldom get used).

Nice idea - make the common case fast.


Top
 Profile  
Reply with quote  
PostPosted: Wed Mar 18, 2015 12:31 pm 
Offline

Joined: Thu Mar 03, 2011 5:56 pm
Posts: 277
White Flame wrote:
All you need is to go Forth and multiply.

(I'll see myself out...)


Not an entirely new idea: http://www.jupiter-ace.co.uk/Forth_PCWGoForthAnd.html


Top
 Profile  
Reply with quote  
PostPosted: Thu Mar 19, 2015 2:27 am 
Offline

Joined: Sat Aug 21, 2010 7:52 am
Posts: 231
Location: Arlington VA
There are some great ideas here, and to do them justice, I'm going to have to dedicate some time to implementing and benchmarking them all within the same context. I never really did figure out how to do the reciprocal multiplication approach, and that division by constants technique with shifting and adding tables of constants, that's black magic to me as well. I'll publish a table of the results when I'm done.


Top
 Profile  
Reply with quote  
PostPosted: Sat Apr 04, 2015 6:58 pm 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
barrym95838 wrote:
... and I can't figure out how to beat Garth's improved FIG UM/MOD for speed ...

... or can I??? 8) ... details to come, soon, I hope.

Mike B.


Top
 Profile  
Reply with quote  
PostPosted: Tue Apr 14, 2015 6:25 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
Okay, I have an embarrassing confession ... I can't figure out how to use the IIgs monitor well enough to test this 16-bit UM/MOD ... I'm quite proficient with the original Apple monitor, but success in working the enhanced version somehow eludes me. Can anyone comment on this brutally raw attempt, and possibly give me a thumbs up or down? Garth is buried in legitimate work, and he's the only one I know with an '802/'816 Forth.
Code:
;------------------------------------------------------
; UM/MOD for the 65802/816 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
;
; ( ud u1 -- u2 u3 )
;
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
;
; Quotient overflow or /0 results in a quotient and
;   remainder of $FFFF.  Although $FFFF could be a
;   valid quotient, it cannot be a valid remainder,
;   so the caller may use this fact to check for an
;   invalid result.
;
; 0,x is TOS ( divisor -- quotient )
; 2,x is NOS ( hi dividend -- remainder )
; 4,x is 3OS ( lo dividend --  )
;
; Accumulator and memory are in 16-bit mode.
; Index registers are in 8-bit mode.
;
HEADER "UM/MOD", NOT_IMMEDIATE ; ( ud u -- rem quot )
        CODE            ; Machine code primitive.
umslashmod:
        lda  2,x        ; If hi cell of dividend is
        cmp  0,x        ;   >= divisor, make a note
        php             ;   of overflow for later.
        ldy  #17        ; Init loop counter.
        .db  $c0        ; Skip following rol.
umsm2:
        rol             ; Shift hi cell of dividend.
; !!!!!!!!! Bug !!!!!!!!! 17th bit is lost here !!!!!!!!!
        cmp  0,x        ; if (hi dividend >= divisor)
        bcc  umsm3      ; then update remainder
        sbc  0,x        ;   and leave carry set
umsm3:
        rol  4,x        ; Shift carry into quotient.
        dey             ; Loop until done.
        bne  umsm2
        plp             ; Check for overflow.
        bcc  umsm5      ;
        tya             ; If so, put $ffff in
        dec             ;   quotient and remainder.
        sta  4,x        ;
umsm5:                  ;
        sta  2,x        ; Save remainder in NOS.
        inx             ;
        inx             ; DROP
        jmp  swap       ; SWAP ;
; 34 bytes, not counting HEADER.

If I made any glaring errors, I will correct them here, with appropriate credit.

Mike B.

[Edit: I've been able to play with it just a bit, and the quotient doesn't seem to be correct for any divisor other than 1 ... that could be considered a significant bug! Work continues ...]

[Edit #2: I have substituted a second (but still not tested) attempt above, completely replacing the first, which was s**t. It was inspired by https://raw.githubusercontent.com/Acher ... muldiv.asm
, but please don't hold that against White Flame if this attempt turns out to be s**t as well.]

[Edit: This version is buggy for larger inputs. See below for a properly debugged version.]


Last edited by barrym95838 on Tue May 19, 2015 7:18 am, edited 2 times in total.

Top
 Profile  
Reply with quote  
PostPosted: Mon Apr 20, 2015 12:47 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
Okay, I have no solid excuse for not testing this one, only that I've been a bit under the weather for the past couple of days, and I got excited when I finally got it ironed out and it "looked" right:
Code:
;------------------------------------------------------
; UM/MOD for any 6502 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
;
; ( ud u1 -- u2 u3 )
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
;
; Quotient overflow or /0 results in a quotient and
;   remainder of $FFFF.  Although $FFFF could be a
;   valid quotient, it cannot be a valid remainder,
;   so the caller may use this fact to check for an
;   invalid result.
;
; TOS and TOS+1 are at fixed points in ZP.
; stackl,x and stackh,x are NOS.
; stackl+1,x and stackh+1,x are 3OS.
;
umslashmod:
        ldy  #17        ;[2] Init loop counter
        lda  stackl,x   ;[4]
        cmp  TOS        ;[3] If (dividend:H < divisor)
        lda  stackh,x   ;[4]
        sbc  TOS+1      ;[3] then
        lda  stackh,x   ;[4]   keep NOS:H in A for speed
        bcc  umsm4      ;[3]*  and proceed with division
        lda  #$ff       ;[2] else
        sta  stackl+1,x ;[4]   overflow: Put $ffff in
        sta  stackh+1,x ;[4]   quotient and remainder
        sta  stackl,x   ;[4]
umsm2:                  ;
        sta  stackh,x   ;[4] Exit sequence:
        jsr  slide      ;[28]  DROP the divisor
        jmp  SWAP       ;[34]* SWAP quotient and remainder
umsm3:                  ;
        sbc  TOS        ;[3]
        sta  stackl,x   ;[4] Update remainder
        pla             ;[4]  (gradually replacing
        sbc  TOS+1      ;[3]   dividend:H)
        sec             ;[2]
umsm4:                  ;
        rol  stackl+1,x ;[6] Shift carry into quotient
        rol  stackh+1,x ;[6]  (gradually replacing
        dey             ;[2]   dividend:L)
        beq  umsm2      ;[2]* Exit if done
        rol  stackl,x   ;[6] Shift dividend:H
        rol             ;[2]
        pha             ;[3]
umsm5:                  ;
        lda  stackl,x   ;[4]
        bcs  umsm3      ;[2]*
        cmp  TOS        ;[3] If (17th bit set) or
        pla             ;[4]   (dividend:H >= divisor)
        pha             ;[3]
        sbc  TOS+1      ;[3] then
        bcs  umsm5      ;[2]*  update remainder
        pla             ;[4] else
        bcc  umsm4      ;[3]   continue without update
;------------------- 64 bytes, not counting HEADER

Comments, corrections, critiques, or congratulations are welcome, as always.

Mike B.

[Edit: Corrected 17th-bit bug.]
[Edit #2: 17th-bit needs to be checked before the compare (a subtle but nasty little bug) ... corrected the correction.]


Last edited by barrym95838 on Mon Aug 03, 2015 3:39 am, edited 5 times in total.

Top
 Profile  
Reply with quote  
PostPosted: Mon Apr 20, 2015 5:54 am 
Offline
User avatar

Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3346
Location: Ontario, Canada
It looks right to me, Mike -- and a tidy piece of work it is! There is a way you could cut about 90 cycles off the execution time, but it adds about 30 bytes, and I have a feeling it's the memory economy that's your priority.

FWIW:
  • Right now you have a loop that iterates 16 times and contains a 4-byte (32-bit) shift.
    But only 3 bytes are actually used each iteration. One byte isn't read or written but still gets shifted. That can be avoided.

  • The faster alternative is a loop that iterates 8 times and contains a 3-byte shift,
    followed by another loop that iterates 8 times and contains a slightly different 3-byte shift.


Attachment:
UStar animation (c)Jeff Laughton.gif
UStar animation (c)Jeff Laughton.gif [ 147.78 KiB | Viewed 6043 times ]
To see what I mean, compare the middle and lower sections in the animation above. The middle section illustrates sixteen 4-byte shifts, and the bottom section illustrates eight 3-byte shifts followed by eight slightly different 3-byte shifts. The animation illustrates a few multiplication routines, discussed here, but the same shifting shortcut can be used for division. Were your division routine rewritten to exploit this, the first loop could omit shifting stackl+1,X and the second could omit shifting stackh+1,X.

cheers,
Jeff

ps: "TOS and TOS+1 are at fixed points in ZP." TOS resides someplace other than on stack??? I guess this relates to the split stack you're using. TOS is treated differently than the other stack items so it can be used as an indirect pointer -- is that right?

[Edit: add illustration; modify text]

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Last edited by Dr Jefyll on Thu Jan 21, 2016 11:47 pm, edited 3 times in total.

Top
 Profile  
Reply with quote  
PostPosted: Mon Apr 20, 2015 8:06 pm 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
Dr Jefyll wrote:
... "TOS and TOS+1 are at fixed points in ZP." TOS resides someplace other than on stack??? I guess this relates to the split stack you're using. TOS is treated differently than the other stack items so it can be used as an indirect pointer -- is that right?

That's Charlie's deal with his Pettil implementation. It involves some trade-offs. A favorable example: NIP becomes INX. There are pros and cons to any decision like that. For my 65m32 DTC Forth, I keep TOS in the accumulator, but that would be impossible for a 6502, and a dubious optimization for an '802/'816. Perhaps Charlie did some benchmarks that led to his design decision, and would be able to share them briefly with us, here or in a fresh thread?

Mike B.

[Addendum: Here's a tidied-up version for the '802/'816 ... same size, but slightly faster (?!), on average:
Code:
;------------------------------------------------------
; UM/MOD for the 65802/816 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
;
; ( ud u1 -- u2 u3 )
;
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
;
; Quotient overflow or /0 results in a quotient and
;   remainder of $FFFF.  Although $FFFF could be a
;   valid quotient, it cannot be a valid remainder,
;   so the caller may use this fact to check for an
;   invalid result.
;
; 0,x is TOS ( divisor -- quotient )
; 2,x is NOS ( hi dividend -- remainder )
; 4,x is 3OS ( lo dividend --  )
;
; Accumulator and memory are in 16-bit mode.
; Index registers are in 8-bit mode.
;
HEADER "UM/MOD", NOT_IMMEDIATE ; ( ud u -- rem quot )
        CODE            ; Machine code primitive.
umslashmod:
        ldy  #17        ; Init loop counter.
        lda  2,x        ; If (hi dividend >= divisor)
        cmp  0,x        ;   then we have an overflow
        bcc  umsm4      ;   condition, so we exit
        lda  #$ffff     ;   early with $ffff in the
        sta  4,x        ;   quotient and remainder.
umsm2:                  ;
        sta  2,x        ; Save remainder in NOS.
        inx             ;
        inx             ; DROP
        jmp  swap       ; SWAP ;
umsm3:
        sbc  0,x        ; Update remainder
        sec
umsm4:
        rol  4,x        ; Shift carry into quotient.
        dey             ; Loop until done.
        beq  umsm2
        rol             ; Shift hi cell of dividend.
        bcs  umsm3
        cmp  0,x        ; If (hi dividend >= divisor)
        bcs  umsm3      ; then update remainder
        bcc  umsm4      ; else continue
;------------------- 37 bytes, not counting HEADER.
;Inspired by https://raw.githubusercontent.com/AcheronVM/acheronvm/master/src/ops-muldiv.asm

]
[Edit: Corrected 17th-bit bug.]


Last edited by barrym95838 on Fri Jul 31, 2015 6:20 am, edited 3 times in total.

Top
 Profile  
Reply with quote  
PostPosted: Sun May 03, 2015 9:37 pm 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
Garth found some time to check my routine, and found some strange errors with larger inputs. I believe that this is the bug report for the '802/'816 version, but he wasn't 100% clear on that, and I'm not sure if both of my attempts share the same bug(s):
Code:
12345678 / 60000 says        64 rem 51534      This one in decimal.
                 should be  205 rem 45678      All others below are in hex.

12345678 / ABCD  says        42 rem  B9E
                 should be 1B20 rem 3DD8

 2345678 /   "   says        42 rem  B9E
                 should be  348 rem 9DD0

  345678 /   "   says        42 rem  B9E  (wrong again)
  123456 /   "   says         "  "    "      "     "

    1234 /   "   says         0 rem 1234   correct
    ABCD /   "   says         1 rem    0     "
    ABCE /   "   says         1 rem    1     "
   1ABCE /   "   says         2 rem 5434     "
  11ABCE /   "   says         2 rem 5434   wrong
                 should be   1A rem 38FC

   10000 / 1000  says        10 rem    0   correct
   20000 /   "   says        20 rem    0     "
  200000 /   "   says       200 rem    0     "
  200000 / 2000  says       100 rem    0     "
     "   / 4000  says        80 rem    0     "
     "   / 8000  says        40 rem    0     "
     "   / ABCD  says         0 rem    0   wrong
     "   / 8BCD  says         " rem    "     "
     "   / A000  says         " rem    "     "     Adding just the one bit (bit 13) in the divisor made it go wrong.  (It was ok with 8000.)

Does anyone feel like showing me the error of my ways? I seem to have lost the spark of inspiration that prompted me to post the attempts, and I don't know if I'll be able to find the bug soon enough to avoid more embarrassment.

Mike B.

[Edit: Since my quotient seems to wind up too low in the error cases, I am probably missing a 17th bit in my compare, and need to place a BCS umsm3 just after my ROL and an SEC just after my SBC 0,X. I'll give that a try and report back when I can. This potential fix would increase the '02 version to 64 bytes and the '802/'816 version to 37 bytes, which isn't bad, as long as it fixes the bug.]


Top
 Profile  
Reply with quote  
PostPosted: Tue May 19, 2015 7:15 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
I tried the mods mentioned in the edit directly above and verified a half-width (8-bit accumulator) exhaustive test in about an hour under AppleWin at maximum speed on my old PC. I think that I can extrapolate full-width (16-bit accumulator) success without waiting about 1900 years for the exhaustive test at full-width using the same setup. The code above is corrected.

Mike B.


Top
 Profile  
Reply with quote  
PostPosted: Mon Aug 03, 2015 6:12 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
Here's a short NMOS version without error checking, in only 46 bytes, using a standard fig-style stack implementation!
Code:
;------------------------------------------------------
; UM/MOD for any 6502 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
;
; ( ud u1 -- u2 u3 )
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
;
; Quotient overflow or /0 produces invalid results.
;
; 0,x and 1,x are TOS  ( divisor -- quotient )
; 2,x and 3,x are NOS  ( dividend:H -- remainder)
; 4,x and 5,x are 3OS  ( dividend:L --  )
;
umslashmod:
        ldy  #17        ;[2] Init loop counter
        clc             ;[2]
        bcc  umsm4      ;[3]
umsm2:                  ;
        rol  2,x        ;[6] Shift dividend:H
        rol  3,x        ;[6]
        bcs  umsm3      ;[2]* <<<<<<<<<<<<<<<<<<
        lda  2,x        ;[4]
        cmp  0,x        ;[4] If (17th bit clear) or
        lda  3,x        ;[4]   (dividend:H < divisor)
        sbc  1,x        ;[4] then
        bcc  umsm4      ;[2]*  skip remainder update
umsm3:                  ;
        lda  2,x        ;[4]
        sbc  0,x        ;[4]
        sta  2,x        ;[4] Update remainder
        lda  3,x        ;[4]  (gradually replacing
        sbc  1,x        ;[4]   dividend:H)
        sta  3,x        ;[4]
        sec             ;[2]
umsm4:                  ;
        rol  4,x        ;[6] Shift carry into quotient
        rol  5,x        ;[6]  (gradually replacing
        dey             ;[2]   dividend:L)
        bne  umsm2      ;[3]* Loop until done
        inx             ;[2] Exit sequence:
        inx             ;[2]   DROP the divisor
        jmp  SWAP       ;[34]* SWAP quotient and remainder
;------------------- 46 bytes, not counting HEADER

It passed my exhaustive half-width test as well, but still shouldn't be fully trusted until someone with a working 6502 Forth can try some of the test cases Garth detailed above.

Mike B.


Top
 Profile  
Reply with quote  
PostPosted: Fri Aug 07, 2015 5:08 am 
Offline
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1926
Location: Sacramento, CA, USA
... and for you byte-squeezing freaks (you know who you are), I have 44-byte NMOS and 27-byte '802/'816 versions that pass the exhaustive half-width test as well:
Code:
;------------------------------------------------------
; UM/MOD for any 6502 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
                        ;
; ( ud u1 -- u2 u3 )
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
                        ;
; Invalid results for quotient overflow or /0
                        ;
; 0,x and 1,x are TOS  ( divisor -- quotient )
; 2,x and 3,x are NOS  ( dividend:H -- remainder )
; 4,x and 5,x are 3OS  ( dividend:L --  )
                        ;
umslashmod:             ;
        ldy  #16        ; loop counter
umsm2:                  ;
        asl  4,x        ; dividend:L is gradually replaced
        rol  5,x        ;   with the quotient
        rol  2,x        ; dividend:H is gradually replaced
        rol  3,x        ;   with the remainder
        bcs  umsm3      ;
        lda  2,x        ;
        cmp  0,x        ; compare remainder to divisor
        lda  3,x        ;
        sbc  1,x        ;
        bcc  umsm4      ;
umsm3:                  ;
        lda  2,x        ;
        sbc  0,x        ; if (remainder >= divisor) then
        sta  2,x        ;   update partial remainder
        lda  3,x        ;
        sbc  1,x        ;
        sta  3,x        ;
        inc  4,x        ;   and set low bit in quotient
umsm4:                  ;
        dey             ;
        bne  umsm2      ; Loop until done
        inx             ; Exit sequence:
        inx             ;   DROP the divisor
        jmp  SWAP       ;   SWAP quotient and remainder
                        ; (SWAP's machine code, not its CFA)
;------------------- 44 bytes, not counting HEADER

;------------------------------------------------------
; UM/MOD for the 65802/816 by Michael T. Barry.
; Free to copy and/or modify, but without warranty.
                        ;
; ( ud u1 -- u2 u3 )
; ud is the 32-bit dividend.
; u1 is the 16-bit divisor.
; u2 is the 16-bit remainder.
; u3 is the 16-bit quotient.
; All values are unsigned.
                        ;
; Invalid results for quotient overflow or /0
; Accumulator and memory are in 16-bit mode
; Index registers are in 8-bit mode
                        ;
; 0,x is TOS  ( divisor -- quotient )
; 2,x is NOS  ( dividend:H -- remainder )
; 4,x is 3OS  ( dividend:L --  )
                        ;
umslashmod:             ;
        ldy  #16        ; loop counter
        lda  2,x        ;
umsm2:                  ;
        asl  4,x        ; dividend:L is gradually replaced
                        ;   with the quotient
        rol             ; dividend:H is gradually replaced
                        ;   with the remainder
        bcs  umsm3      ;
        cmp  0,x        ; compare remainder to divisor
        bcc  umsm4      ;
umsm3:                  ; if (remainder >= divisor) then
        sbc  0,x        ;   update partial remainder
        inc  4,x        ;   and set low bit in quotient
umsm4:                  ;
        dey             ;
        bne  umsm2      ; Loop until done
        sta  2,x        ;
        inx             ; Exit sequence:
        inx             ;   DROP the divisor
        jmp  SWAP       ;   SWAP quotient and remainder
                        ; (SWAP's machine code, not its CFA)
;------------------- 27 bytes, not counting HEADER


Mike B.

[Edit: 44 bytes, not 45!]


Last edited by barrym95838 on Tue Feb 09, 2016 4:50 am, edited 3 times in total.

Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 65 posts ]  Go to page Previous  1, 2, 3, 4, 5  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 5 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: