6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sat Apr 27, 2024 3:42 pm

All times are UTC




Post new topic Reply to topic  [ 44 posts ]  Go to page Previous  1, 2, 3
Author Message
PostPosted: Tue Feb 27, 2024 8:37 pm 
Offline

Joined: Sun Nov 08, 2009 1:56 am
Posts: 387
Location: Minnesota
Quote:
teamtempest wrote:
I only worked out one random example by hand. Kind of a distributed division:

<snip>

; adjust by hi byte

ldx dividend+1
clc
adc factor,x

factor:
.byte 0*85, 1*85, 2*85

$1FF aka 511 is a special case you will need to handle...



I thought you must be right, because you tested far more extensively than I did. Without that I thought it was possible something might be off.

But if I work it out by hand I get the correct result:

511 = %01 1111 1111

511/3 = 170

x(lo) = 255 = %1111 1111
x(lo)/3 = 85 = %0101 0101

x(hi) = 1 = %01 (so add 85)

85 + 85 = 170

Is there a mistake? A carry bit threw the calculation off somewhere?


Top
 Profile  
Reply with quote  
PostPosted: Tue Feb 27, 2024 9:42 pm 
Offline

Joined: Wed Aug 21, 2019 6:10 pm
Posts: 217
BillO wrote:
Is the following significantly smaller than a 756 byte look-up table?


Just look at the final assembly address (+1 since the origins are all $xx00) of each snippet to see vs $2F4.

2100: $21
2200: $2A
2300: $3A
2400: $2D
2500: $37
2600: $49
2700: $17
2800: $34

Seems like all of them would be.

BillG wrote:
. ...
Code:
 2000                     00532          org    FreeRAM
 2000                     00534 DIVIDE_BY_THREE:
...
 2025 60              [6] 00572          rts
 
2100                     00574          org    FreeRAM+$100
 2100                     00585 div3_ay:
...
 2120 60              [6] 00616          rts
 
 2200                     00618          org    FreeRAM+$200
 2200                     00651 Mine:
 220D 4C 2213         [3] 00663          jmp    EnterLoop
 2229 60              [6] 00692          rts

 2300                     00694          org    FreeRAM+$300
 2300                     00696 BB8:
 2335 60              [6] 00735          rts

 2400                     00737          org    FreeRAM+$400
 2400                     00739 OptimizedBB8:
 242C 60              [6] 00773          rts

 2500                     00775          org    FreeRAM+$500
 2500                     00780 ModifiedBB8:
 2536 60              [6] 00825          rts

 2600                     00827          org    FreeRAM+$600
 2600                     00829 MineTabled:
 262D 60              [6] 00870          rts
 2634 60              [6] 00879          rts
                          00880
 2635 00 00 00 01 01 01   00881 MyTable  ...
 2647 06 06

 2700                     00883          org    FreeRAM+$700
 2700                     00889 div3:
 2716 60              [6] 00906          rts

 2800                     00908          org    FreeRAM+$800
 2800                     00917 ModifiedBB8_plus_teamtempest:
 2834 60              [6] 00968          rts


Last edited by BruceRMcF on Wed Feb 28, 2024 2:46 am, edited 2 times in total.

Top
 Profile  
Reply with quote  
PostPosted: Tue Feb 27, 2024 10:06 pm 
Offline

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 690
Location: North Tejas
teamtempest wrote:
But if I work it out by hand I get the correct result:

511 = %01 1111 1111

511/3 = 170

x(lo) = 255 = %1111 1111
x(lo)/3 = 85 = %0101 0101

x(hi) = 1 = %01 (so add 85)

85 + 85 = 170

Is there a mistake? A carry bit threw the calculation off somewhere?

You are right about 511. Faulty analysis on my part.

Now try 513 and 514 with your method...


Top
 Profile  
Reply with quote  
PostPosted: Tue Feb 27, 2024 10:09 pm 
Offline

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 690
Location: North Tejas
BruceRMcF wrote:
2900: $122

That last one is the test driver and not a candidate division subroutine.


Top
 Profile  
Reply with quote  
PostPosted: Tue Feb 27, 2024 11:06 pm 
Offline
User avatar

Joined: Sun Nov 01, 2020 10:36 am
Posts: 35
Location: Tatooine
Further optimized... I'm using your suggested one byte DivBy3. And no loops anymore, only code going forward.
Now it should be at 49-67 +rts cycles when adopting zeropage variables.

Code:
        ldy nl
        lda nh          ; loads high byte
        beq lv          ; if zero then goes to low byte value
                        ; otherwise will sub 255 from low byte and add 85 to res
                        ; one or two times

        iny             ; subtracts 255 from low byte
        beq hv1         ; if zero, high byte is not done
        cmp #1          ; checks high byte
        beq hv2         ; if zero, high byte is done
hv1     lda #170        ; .A = 2 x 85 (partial quotient)
        iny             ; subtracts 255 from low byte
   byte $2c             ; *** the wonderful bit/$2c trick!! my first time... ***
hv2     lda #85         ; .A = 85 (partial quotient)
        sty nl

lv                      ;  here value is <256 (only low byte)
        sta res
        tya
        lsr
        adc #21
        lsr
        adc nl
        ror
        lsr
        adc nl
        ror
        lsr
        adc nl
        ror
        lsr

        clc             ; adds the partial quotient from the upper byte
        adc res
        sta res
        rts

res byte 0
nl  byte <726
nh  byte >726



Top
 Profile  
Reply with quote  
PostPosted: Tue Feb 27, 2024 11:12 pm 
Offline
User avatar

Joined: Fri Dec 12, 2008 10:40 pm
Posts: 1000
Location: Canada
BillG wrote:
BruceRMcF wrote:
2900: $122

That last one is the test driver and not a candidate division subroutine.

Gotcha.

_________________
Bill


Top
 Profile  
Reply with quote  
PostPosted: Wed Feb 28, 2024 12:15 am 
Online
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
BruceRMcF wrote:
2700: $20

You added nine or ten bytes to mine, depending on how you look at it.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Top
 Profile  
Reply with quote  
PostPosted: Wed Feb 28, 2024 2:49 am 
Offline

Joined: Wed Aug 21, 2019 6:10 pm
Posts: 217
barrym95838 wrote:
BruceRMcF wrote:
2700: $20
You added nine or ten bytes to mine, depending on how you look at it.


Definitely added some. $20 instead of $17 would be adding 9.


Top
 Profile  
Reply with quote  
PostPosted: Wed Feb 28, 2024 7:17 pm 
Offline

Joined: Sat Feb 24, 2024 9:52 pm
Posts: 4
Your input here is more than I could ask for. A 'Thank you' seems inadequate. Nevertheless, thank you.


Top
 Profile  
Reply with quote  
PostPosted: Thu Feb 29, 2024 12:21 am 
Offline

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 990
Location: near Heidelberg, Germany
I haven't closely followed I'm afraid but has this division algorithm been discussed so far? https://www.bbcelite.com/deep_dives/shi ... ision.html

Robin from YouTube's 8bit show and tell has pointed to it on Twitter/X
https://twitter.com/8BitShowAndTell/sta ... 9530750115
In its division-by-10 variant, but I believe it can be adopted to division-by-3.

André

_________________
Author of the GeckOS multitasking operating system, the usb65 stack, designer of the Micro-PET and many more 6502 content: http://6502.org/users/andre/


Top
 Profile  
Reply with quote  
PostPosted: Thu Feb 29, 2024 3:13 am 
Online
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1927
Location: Sacramento, CA, USA
fachat wrote:
I haven't closely followed I'm afraid but has this division algorithm been discussed so far? https://www.bbcelite.com/deep_dives/shi ... ision.html

That's almost exactly what mine is doing. The smallest and slowest of the bunch.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Top
 Profile  
Reply with quote  
PostPosted: Thu Feb 29, 2024 9:33 am 
Offline

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 990
Location: near Heidelberg, Germany
barrym95838 wrote:
fachat wrote:
I haven't closely followed I'm afraid but has this division algorithm been discussed so far? https://www.bbcelite.com/deep_dives/shi ... ision.html

That's almost exactly what mine is doing. The smallest and slowest of the bunch.


Haha, I was almost expecting this, that you guys already know all that :-)

Great stuff!

_________________
Author of the GeckOS multitasking operating system, the usb65 stack, designer of the Micro-PET and many more 6502 content: http://6502.org/users/andre/


Top
 Profile  
Reply with quote  
PostPosted: Thu Feb 29, 2024 9:34 am 
Offline

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 990
Location: near Heidelberg, Germany
What I like about the bbcelite division algorithm is that it is very small - and that it can be adopted to other argument sizes (and divisors) easily. I find that very neat.

_________________
Author of the GeckOS multitasking operating system, the usb65 stack, designer of the Micro-PET and many more 6502 content: http://6502.org/users/andre/


Top
 Profile  
Reply with quote  
PostPosted: Mon Apr 22, 2024 6:17 am 
Offline

Joined: Mon Apr 22, 2024 6:02 am
Posts: 9
Location: Germany
Hello!

I hope, I am not too late with my answer.
You can multiply the lower 8 bits of your 10 bit number by 171 and take the lower 8 bits of the result.
You don't even have to calculate the upper byte.
You can ignore the upper 2 bits of your 10 bit number. That works, because your numbers are a multiple of 3.

Use the fastest multiplying by 171 algorithm that you can come up with. (a 256 byte lookup table would work)

To divide by 5 you multiply your lower 8 bits by 205 and throw away the upper byte. (your input must be a multiple of 5)


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 44 posts ]  Go to page Previous  1, 2, 3

All times are UTC


Who is online

Users browsing this forum: No registered users and 24 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: