Dividing by seven

Programming the 6502 microprocessor and its relatives in assembly and other languages.
User avatar
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England
Contact:

Dividing by seven

Post by BigEd »

Here's a collection of emails/posting all about writing a small and fast routine to divide by 7 - surely some enlightenment to be gained!

http://mirrors.apple2.org.za/ground.ica ... mming/div7
Quote:
I'm trying to write a pixel driver and need to divide the X coord by 7 to get the correct pixel byte. I thought about using a table, but somehow, having 140 or even 280 different values stored didn't seem appealing for addressing a meager 40 bytes.
Contributions by
Richard Jackson
Ranando King
Scott Hemphill
Paul Guertin
User avatar
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Dividing by seven

Post by GARTHWILSON »

This page has short routines for dividing by any number up to 32. I linked to it on my links page, but I have not tried them:
http://forums.nesdev.com/viewtopic.php?f=2&t=11336
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
User avatar
BigEd
Posts: 11463
Joined: 11 Dec 2008
Location: England
Contact:

Re: Dividing by seven

Post by BigEd »

Thanks!
chitselb
Posts: 232
Joined: 21 Aug 2010
Location: Ontonagon MI
Contact:

Re: Dividing by seven

Post by chitselb »

"[Chuck Moore] speculated that most multiply/divide subroutines were written by someone who had never done one before and never would again" http://www.forth.com/resources/evolution/evolve_1.html

I need something similar, a 40/mod word for a Radix-50 and for screen positioning on a 40-column PET. Oh and it should run in less than 47 clock cycles and consume only 15 bytes of RAM. I'm thinking about basing "40/mod" ( u -- u%40 u/40 ) on a pair of words, "5*" ( u -- u*5 ) and "5/" ( u -- u/5 ) [sans quotes]. Calculating 1/5 in binary is the same as multiplying by .0011001100110011... which looks like a pattern, something that could be readily squeezed and looped to reduce size and clocks required.

Lately I haven't had the time/energy to devote to this. The nesdev routines are interesting but I couldn't figure out how he was doing it, and the 8-bit accuracy was a problem for me. There's a 16-bit "10/" somewhere on the thread that looks much better than what I came up with, also on that thread.
User avatar
GARTHWILSON
Forum Moderator
Posts: 8773
Joined: 30 Aug 2002
Location: Southern California
Contact:

Re: Dividing by seven

Post by GARTHWILSON »

chitselb wrote:
"[Chuck Moore] speculated that most multiply/divide subroutines were written by someone who had never done one before and never would again" http://www.forth.com/resources/evolution/evolve_1.html
I believe it! :lol: I've done it on 6502 to fix the bugs in the common UM/MOD and UM*, then re-wrote them for the '816 to take advantage of its added capabilities; then I wrote a UM/MOD-like division routine for PIC16 for a precision that was not in the ap. notes. The division routines took a long time to get my head around. If you have to do it again for another processor or precision, the time to do it is immediately after another success, before you've forgotten what tends to trip you up.
Quote:
I need something similar, a 40/mod word for a Radix-50 and for screen positioning on a 40-column PET. Oh and it should run in less than 47 clock cycles and consume only 15 bytes of RAM. I'm thinking about basing "40/mod" ( u -- u%40 u/40 ) on a pair of words, "5*" ( u -- u*5 ) and "5/" ( u -- u/5 ) [sans quotes]. Calculating 1/5 in binary is the same as multiplying by .0011001100110011... which looks like a pattern, something that could be readily squeezed and looped to reduce size and clocks required.
So you want a UM/MOD-type of action, and you mean 40 in decimal (ie, 28H)? And what do you mean by Radix-50? How much precision do you need? Is it 16 bits input and 8 and 8 output (remainder and quotient)?
Quote:
Lately I haven't had the time/energy to devote to this.
It definitely takes some, and I'm not committing the time right now myself, but having the above things clarified should make it easier for someone, even if only so they can point out some simple things that don't initially meet the eye.
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?
chitselb
Posts: 232
Joined: 21 Aug 2010
Location: Ontonagon MI
Contact:

Re: Dividing by seven

Post by chitselb »

http://en.wikipedia.org/wiki/DEC_Radix-50 This is Radix-50, based on the idea of encoding three characters in two bytes, as long as there are only 40 unique characters in the allowed character set. Because 40x40x40 = 64000 which fits neatly in an unsigned 16-bit integer. It's "50" not "40" because octal.

so, 16 bit dividend, 8-bit divisor (it's 40 decimal), 16-bit quotient, 8-bit remainder to divide by 40 decimal
teamtempest
Posts: 443
Joined: 08 Nov 2009
Location: Minnesota
Contact:

Re: Dividing by seven

Post by teamtempest »

Quote:
as long as there are only 40 unique characters in the allowed
Of course you could always extend that number by using one of the characters an "escape" or "use alternate charset" control character. Presumably the additional characters would be the less-often used ones from the full charset needed, so the escape would apply for the next character only. Otherwise you could have two (or three or four) "set character set in use" control characters that would apply until the next control character was encountered.

Or any number of other encoding schemes. Not that, as far as I know, DEC did anything of the sort beyond defining a couple of 40-character alphabets.
User avatar
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Dividing by seven

Post by barrym95838 »

chitselb wrote:
... I need something similar, a 40/mod word for a Radix-50 and for screen positioning on a 40-column PET. Oh and it should run in less than 47 clock cycles and consume only 15 bytes of RAM...
Hey CH,

Here's a first attempt that I got by modifying the div routine from my VTL02 for a hard-coded divisor of 40 and trimming out all the code that got neutered as a result. I'm getting ready to throw some tests at it, and I'll correct any possible bugs that I or others may find, but this is what it looks like for now:

Code: Select all

; - - - - - - - - - - - - - - - - - - - - - - - - - - -
; 16-bit unsigned division by 40 routine
;   TOS /= 40, A = remainder, Y = 0
;
div40:
        lda  #0         ; remainder
        ldy  #16        ; loop counter
div40b:
        asl  TOS        ; TOS is gradually replaced
        rol  TOS+1      ;   with the quotient
        rol             ; A is gradually replaced
                        ;   with the remainder
        cmp  #40        ; partial remainder >= 40?
        bcc  div40c
        sbc  #40        ;   yes: update partial
                        ;     remainder, set low bit
        inc  TOS        ;     in partial quotient
div40c:
        dey  
        bne  div40b     ; loop 16 times
        rts  
I know that it's a few bytes larger (20 + rts) and much slower (355 + rts best case, ~400 typical) than your requested spec., but it's a starting point ...

On my 65m32 Forth, TOS is in register a already, and the remainder ends up in volatile carry register c, so it needs to be used immediately before it gets lost:

Code: Select all

:f8860000        clc             ; zero high word
:24060028        div  #40        ; TOS /= 40, c = remainder
My TOS is 32-bits wide, so it can hold six Radix-40 chars.

Mike
Omegamatrix
Posts: 7
Joined: 25 Jan 2015

Re: Dividing by seven

Post by Omegamatrix »

chitselb wrote:
I need something similar, a 40/mod word for a Radix-50 and for screen positioning on a 40-column PET. Oh and it should run in less than 47 clock cycles and consume only 15 bytes of RAM.
That's a very fast specification. :shock: Also, I am only familiar with the 2600. Reading what Barry wrote I have to ask are you sticking the entire routine in ram, and thus it can only be 15 bytes long? That too is very tight. If you just mean 15 bytes of zero page ram then that's not too bad.

I wrote a few different div 40 mod 40routines, but this is the best I came up with. I've tested it and it's working. It is geared towards speed, but is still over your spec by at least 96 cycles. It is probably way more bytes than you want to spend too. You can still gain a few more cycles by using preloading dividendHigh, not storing quotientLow when leaving, and using TAX to save the remainder instead of STA remainder. I'm sure there is more optimization too, but not 96 cycles worth unless you switch to using very large tables, and even then it might be tight.

Code: Select all

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;  16 bit unsigned divide by 40, mod 40
;  By Omegamatrix
;  142/143 cycles, 118 bytes
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

LowQuotientAdjust:
    .byte 0,6,12,19,25

;   (0 / 40) % 32 = 0
; (256 / 40) % 32 = 6
; (512 / 40) % 32 = 12
; (768 / 40) % 32 = 19
;(1024 / 40) % 32 = 25
;(1280 / 40) % 32 = 0  <--- starts to repeat


Mod40_DivHigh:
    .byte 0,16,32,8,24

;    0 % 40 = 0
;  256 % 40 = 16
;  512 % 40 = 32
;  768 % 40 = 8
; 1024 % 40 = 24
; 1280 % 40 = 0    <--- starts to repeat


FortyCountTab:
    .byte 0,40,80,120,160,200,240


StartDivide40Mod40:
    lda    dividendHigh          ;3  @3
    sta    temp                  ;3  @6
    lsr                          ;2  @8
    adc    #13                   ;2  @10
    adc    temp                  ;3  @13
    ror                          ;2  @15
    lsr                          ;2  @17
    lsr                          ;2  @19
    adc    temp                  ;3  @22
    ror                          ;2  @24
    adc    temp                  ;3  @27
    ror                          ;2  @29
    and    #$FC                  ;2  @31
    sta    temp                  ;3  @34
    lsr                          ;2  @36
    lsr                          ;2  @38
    tay                          ;2  @40
    adc    temp                  ;3  @43
    eor    #$FF                  ;2  @45
    sec                          ;2  @47
    adc    dividendHigh          ;3  @50
    tax                          ;2  @52
    lda    temp                  ;3  @55
    asl                          ;2  @57
    asl                          ;2  @59
    asl                          ;2  @61
    ora    LowQuotientAdjust,X   ;4  @65
    sta    quotientLow           ;3  @68
    tya                          ;2  @70
    lsr                          ;2  @72
    lsr                          ;2  @74
    lsr                          ;2  @76
    sta    quotientHigh          ;3  @79
    lda    #0                    ;2  @81
    ldy    dividendLow           ;3  @84
    cpy    #40                   ;2  @86
    rol                          ;2  @88
    cpy    #80                   ;2  @90
    adc    #0                    ;2  @92
    cpy    #120                  ;2  @94
    adc    #0                    ;2  @96
    cpy    #160                  ;2  @98
    adc    #0                    ;2  @100
    cpy    #200                  ;2  @102
    adc    #0                    ;2  @104
    cpy    #240                  ;2  @106
    adc    #0                    ;2  @108
    sta    temp                  ;3  @111
    tya                          ;2  @113
    ldy    temp                  ;3  @116
    adc    Mod40_DivHigh,X       ;4  @120
    sec                          ;2  @122
    sbc    FortyCountTab,Y       ;4  @126
    cmp    #40                   ;2  @128
    bcc    .storeMod40           ;2³ @130/131
    sbc    #40                   ;2  @132    leaving carry set to increase quotientLow by 1 below...
.storeMod40:
    sta    remainder             ;3  @135
    tya                          ;2  @137
    adc    quotientLow           ;3  @140
    sta    quotientLow           ;3  @143
The approach I'm using is to take the high dividend and divide it by 5 and get the remainder by 5. From there I use some tables to adjust the routine, and then deal with the low dividend.


If you truly need a routine under 47 cycles... I don't think you'll get there unless you use a lot of page size tables.

@barrym95838 I like your routine. :) Nice compact code.
User avatar
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Dividing by seven

Post by barrym95838 »

Omegamatrix wrote:
@barrym95838 I like your routine. :) Nice compact code.
Thank you, sir. I am better at byte-counting than I am at cycle-counting, so my tendencies always seem to lean that way when I'm in the mood to optimize (usually prematurely). I haven't thrown the kitchen sink at my routine, but I believe that it can be made general-purpose by just putting the divisor in zero-page instead of hard-coding it, and it should produce valid unsigned 16-bit / 8-bit quotients and remainders for any divisor in [1 .. 255], anywhere that its slowness doesn't present a problem.

Mike B.
Omegamatrix
Posts: 7
Joined: 25 Jan 2015

Re: Dividing by seven

Post by Omegamatrix »

I sped up the routine I wrote, slightly. It's now 131 cycles worse case, and 111 cycles best case. It's still at the same 118 bytes too. :)

Code: Select all

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;  16 bit unsigned divide by 40, mod 40 (Ver 2)
;  By Omegamatrix
;  111-131 cycles, 118 bytes
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

LowQuotientAdjust:
    .byte 0,6,12,19,25

;   (0 / 40) % 32 = 0
; (256 / 40) % 32 = 6
; (512 / 40) % 32 = 12
; (768 / 40) % 32 = 19
;(1024 / 40) % 32 = 25
;(1280 / 40) % 32 = 0  <--- starts to repeat


Mod40_DivHigh:
    .byte 0,16,32,8,24

;    0 % 40 = 0
;  256 % 40 = 16
;  512 % 40 = 32
;  768 % 40 = 8
; 1024 % 40 = 24
; 1280 % 40 = 0    <--- starts to repeat


StartDivide40Mod40:
    lda    dividendHigh          ;3  @3
    sta    temp                  ;3  @6
    lsr                          ;2  @8
    adc    #13                   ;2  @10
    adc    temp                  ;3  @13
    ror                          ;2  @15
    lsr                          ;2  @17
    lsr                          ;2  @19
    adc    temp                  ;3  @22
    ror                          ;2  @24
    adc    temp                  ;3  @27
    ror                          ;2  @29
    and    #$FC                  ;2  @31
    sta    temp                  ;3  @34
    lsr                          ;2  @36
    lsr                          ;2  @38
    tay                          ;2  @40
    adc    temp                  ;3  @43
    eor    #$FF                  ;2  @45
    sec                          ;2  @47
    adc    dividendHigh          ;3  @50
    tax                          ;2  @52
    lda    temp                  ;3  @55
    asl                          ;2  @57
    asl                          ;2  @59
    asl                          ;2  @61
    ora    LowQuotientAdjust,X   ;4  @65
    sta    quotientLow           ;3  @68
    tya                          ;2  @70
    lsr                          ;2  @72
    lsr                          ;2  @74
    lsr                          ;2  @76
    sta    quotientHigh          ;3  @79
    lda    dividendLow           ;3  @82
    clc                          ;2  @84
    adc    Mod40_DivHigh,X       ;4  @88
    bcs    .moreThan255          ;2³ @90/91
    cmp    #120                  ;2  @92
    bcs    .more120              ;2³ @94/95
    ldx    #0-1                  ;2  @96
    sec                          ;2  @98
    bne    .less120              ;3  @101   always branch

.moreThan255:
    ldx    #6                    ;2  @93
    adc    #255-240              ;2  @95
    cmp    #40                   ;2  @97
    bcc    .storeMod40           ;2³ @99/100
    sbc    #40                   ;2  @101
    bcs    .storeMod40           ;3  @104   always branch

.more120:
    sbc    #120                  ;2  @97
    ldx    #3-1                  ;2  @99
.less120:
    sbc    #40                   ;2  @101...103
    bcc    .add40                ;2³ @105/106
    inx                          ;2  @107
    sbc    #40                   ;2  @109
    bcc    .add40                ;2³ @111/112
    inx                          ;2  @113
    sbc    #40                   ;2  @115
    bcc    .add40                ;2³ @117/118
    inx                          ;2  @..117
    bcs    .storeMod40           ;3  @120  always branch

.add40:
    adc    #40                   ;2  @120
.storeMod40:
    sta    remainder             ;3  @123
    txa                          ;2  @125
    adc    quotientLow           ;3  @128
    sta    quotientLow           ;3  @131 worse case, 111 cycles best case
The user could still choose to use TAY instead of STA remainder, and skip the STA quotientLow at the end to save 4 more cycles. Even at that it's still too slow, but I wanted to see if I could improve it even a little. The code is certainly a lot harder to follow now, but it's tested and working.
Omegamatrix
Posts: 7
Joined: 25 Jan 2015

Re: Dividing by seven

Post by Omegamatrix »

Here is a table based version that is under 67 cycles, unless you are using a subroutine. Hope it helps! It's a bugger for bytes I know... :wink:

Code: Select all

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;  16 bit unsigned divide by 40, mod 40 (Ver 3)
;  By Omegamatrix
;  45-65 cycles, 581 bytes
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

StartDivide40Mod40:
    ldy    dividendHigh          ;3  @3
    lda    HighQuot_Mod40Tab,Y   ;4  @7
    and    #$07                  ;2  @9
    sta    quotientHigh          ;3  @12
    eor    HighQuot_Mod40Tab,Y   ;4  @16
    clc                          ;2  @18
    adc    dividendLow           ;3  @21

    bcs    .moreThan255          ;2³ @23/24
    cmp    #120                  ;2  @25
    bcs    .more120              ;2³ @27/28
    ldx    #0-1                  ;2  @29
    sec                          ;2  @31
    bne    .less120              ;3  @34   always branch

.moreThan255:
    ldx    #6                    ;2  @26
    adc    #255-240              ;2  @28
    cmp    #40                   ;2  @30
    bcc    .storeMod40           ;2³ @32/33
    sbc    #40                   ;2  @34
    bcs    .storeMod40           ;3  @37   always branch

.more120:
    sbc    #120                  ;2  @30
    ldx    #3-1                  ;2  @32
.less120:
    sbc    #40                   ;2  @34...36 worse case
    bcc    .add40                ;2³ @38/39
    inx                          ;2  @40
    sbc    #40                   ;2  @42
    bcc    .add40                ;2³ @44/45
    inx                          ;2  @46
    sbc    #40                   ;2  @48
    bcc    .add40                ;2³ @50/51

    inx                          ;2  @..50
    bcs    .storeMod40           ;3  @53  always branch

.add40:
    adc    #40                   ;2  @53
.storeMod40:
    sta    remainder             ;3  @56
    txa                          ;2  @58
    adc    LowQuotientTab,Y      ;4  @62
    sta    quotientLow           ;3  @65 worse case, 45 best case



HighQuot_Mod40Tab:
    .byte $00,$10,$20,$08,$18,$00,$10,$20,$08,$18,$00,$10,$20,$08,$18,$00
    .byte $10,$20,$08,$18,$00,$10,$20,$08,$18,$00,$10,$20,$08,$18,$00,$10
    .byte $20,$08,$18,$00,$10,$20,$08,$18,$01,$11,$21,$09,$19,$01,$11,$21
    .byte $09,$19,$01,$11,$21,$09,$19,$01,$11,$21,$09,$19,$01,$11,$21,$09
    .byte $19,$01,$11,$21,$09,$19,$01,$11,$21,$09,$19,$01,$11,$21,$09,$19
    .byte $02,$12,$22,$0A,$1A,$02,$12,$22,$0A,$1A,$02,$12,$22,$0A,$1A,$02
    .byte $12,$22,$0A,$1A,$02,$12,$22,$0A,$1A,$02,$12,$22,$0A,$1A,$02,$12
    .byte $22,$0A,$1A,$02,$12,$22,$0A,$1A,$03,$13,$23,$0B,$1B,$03,$13,$23
    .byte $0B,$1B,$03,$13,$23,$0B,$1B,$03,$13,$23,$0B,$1B,$03,$13,$23,$0B
    .byte $1B,$03,$13,$23,$0B,$1B,$03,$13,$23,$0B,$1B,$03,$13,$23,$0B,$1B
    .byte $04,$14,$24,$0C,$1C,$04,$14,$24,$0C,$1C,$04,$14,$24,$0C,$1C,$04
    .byte $14,$24,$0C,$1C,$04,$14,$24,$0C,$1C,$04,$14,$24,$0C,$1C,$04,$14
    .byte $24,$0C,$1C,$04,$14,$24,$0C,$1C,$05,$15,$25,$0D,$1D,$05,$15,$25
    .byte $0D,$1D,$05,$15,$25,$0D,$1D,$05,$15,$25,$0D,$1D,$05,$15,$25,$0D
    .byte $1D,$05,$15,$25,$0D,$1D,$05,$15,$25,$0D,$1D,$05,$15,$25,$0D,$1D
    .byte $06,$16,$26,$0E,$1E,$06,$16,$26,$0E,$1E,$06,$16,$26,$0E,$1E,$06

LowQuotientTab
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
    .byte $66,$6C,$73,$79,$80,$86,$8C,$93,$99,$A0,$A6,$AC,$B3,$B9,$C0,$C6
    .byte $CC,$D3,$D9,$E0,$E6,$EC,$F3,$F9
    .byte $00,$06,$0C,$13,$19,$20,$26,$2C,$33,$39,$40,$46,$4C,$53,$59,$60
chitselb
Posts: 232
Joined: 21 Aug 2010
Location: Ontonagon MI
Contact:

Re: Dividing by seven

Post by chitselb »

barrym95838 wrote:
Hey CH,

Here's a first attempt that I got by modifying the div routine from my VTL02 for a hard-coded divisor of 40 and trimming out all the code that got neutered as a result. I'm getting ready to throw some tests at it, and I'll correct any possible bugs that I or others may find, but this is what it looks like for now:
Mike,

This is fantastic! It made my sad 40/MOD code look, well, sad. I'm very happy about that, because now the code is much, much better. Thank you!

code size. Yours is 1/6 the size.
Yours: 30 bytes
Mine: 179 bytes

running every possible value 0..65535 as an input argument. Yours ran in half the time.
Yours: 2717 jiffies
Mine: 5212 jiffies

I compared the outputs. Mine had a bug where 40/MOD returned 40 1637 (bogus!) and yours returned 0 1638 (excellent!) for the input value -16 (65520 or $FFF0)

Charlie
User avatar
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Dividing by seven

Post by barrym95838 »

I'm honestly glad that I was able to help, CH. Now go and knock one out of the park for us. :D 8)

Mike B.
User avatar
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: Dividing by seven

Post by barrym95838 »

CH,

Just for S&Gs, maybe you could try plugging this one into your test bench: it's two bytes shorter and slightly faster, on average (upper 300s, still not nearly as fast as Omegamatrix's code)

Code: Select all

; - - - - - - - - - - - - - - - - - - - - - - - - - - -
; 16-bit unsigned division by 40 routine
;   TOS /= 40, A = remainder, Y = 0
;
div40:
        lda  #0         ; remainder
        ldy  #16        ; loop counter
div40b:
        cmp  #20        ; partial remainder >= 40 (/2)?
        bcc  div40c
        sbc  #20        ;   yes: update partial
                        ;     remainder, set carry
div40c:
        rol  TOS        ; TOS is gradually replaced
        rol  TOS+1      ;   with the quotient
        rol             ; A is gradually replaced
                        ;   with the remainder
        dey  
        bne  div40b     ; loop 16 times
        rts  
This version only works if the divisor is an even number, but 40 is an even number, so you should be good to go, unless I totally goofed.

Mike B.

[Edit: Actually, if my theory is correct, this little routine could provide proper results for any EVEN divisor in [2..510], as long as you save the carry on return for divisors above 256.]
Post Reply