6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sun Sep 22, 2024 3:31 pm

All times are UTC




Post new topic Reply to topic  [ 22 posts ]  Go to page Previous  1, 2
Author Message
PostPosted: Sun Apr 26, 2020 4:32 pm 
Offline
User avatar

Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
RichTW:

RichTW wrote:
Thanks for links Ed! Will try digesting that with a coffee! A skim read suggests that Goldschmidt division works with reals rather than integers, but I haven't really taken it in properly.

I have implemented the Goldschmidt Square Root Algorithm for fixed-point numbers, and it works just fine. In addition, a side benefit of the Goldschmidt Square Root Algorithm is that it simultaneously calculates the reciprocal of the square root: 1/sqrt(x). That is really the value that I was wanting because I needed that term in order to solve the Law of Cosines. Squaring the reciprocal of the square root enables division by multiplication of the reciprocal of the divisor.

I have a paper describing the algorithms and how to estimate the initial value of the reciprocal of the square root. It is based on an estimate derived for how the initial value for the floating point algorithm can be derived from the exponent of the floating point value. My initial estimate is based on the most significant bit of the fixed-point value for which I want to calculate the reciprocal of the square root. In my application, I only have to calculate the Law of Cosines twice in each control loop cycle. Therefore, I only looked for an initial value that would limit the number of iterations equally for the ranges in which I binned the input data. I also wanted the accuracy to be about the same for each of these bins, so I biased my estimate to accomplish this.

I'll have to sanitized the paper somewhat to remove some stuff that is not suitable for public disclosure. If the preceding is insufficient to get you started, send me a PM and I'll undertake to sanitize the paper over the next few days.

Added link to article on fixed-point Goldschmidt Square Root Algorithm.

_________________
Michael A.


Last edited by MichaelM on Tue Aug 11, 2020 5:10 am, edited 1 time in total.

Top
 Profile  
Reply with quote  
PostPosted: Sun Apr 26, 2020 4:39 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
(BTW squaring can be a bit cheaper than general multiplication, if you're doing it by hand)


Top
 Profile  
Reply with quote  
PostPosted: Sun Apr 26, 2020 5:20 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
It seems that my first attempt does break if the denominator is more than 32767, in which case the first step of producing each bit can shift a set bit out of the numerator. So we can catch that case and handle it specially:
Code:
divide:
  LDA #$FE
  STA result  ; serves as indirect loop counter; calculating the final bit pushes a zero into the carry
@loop:
  ASL num_lo
  ROL num_hi
  BCS @big
  SEC
  LDA num_lo
  SBC denom_lo
  TAX
  LDA num_hi
  SBC denom_hi
  BCC :+      ; skip saving the remainder if it's negative
  STA num_hi
  STX num_lo
: ROL result
  BCS @loop
  RTS
@big:
  LDA num_lo
  SBC denom_lo
  STA num_lo
  LDA num_hi
  SBC denom_hi
  STA num_hi
  SEC
  ROL result
  BCS @loop
  RTS


Top
 Profile  
Reply with quote  
PostPosted: Mon Aug 10, 2020 3:21 pm 
Offline
User avatar

Joined: Tue Aug 04, 2020 10:35 pm
Posts: 8
Location: Verona, Italy
RichTW wrote:
...
My starting point was to look at a simpler case: dividing two 8 bit values, or, in this case, 16-bit divided by 8-bit, yielding an 8-bit value. To my moderate surprise, such a routine is already presented in the 6502 wiki:
Code:
   LDA TH
   LDX #8
   ASL TLQ
L1 ROL
   BCS L2
   CMP B
   BCC L3
L2 SBC B
;
; The SEC is needed when the BCS L2 branch above was taken
;
   SEC
L3 ROL TLQ
   DEX
   BNE L1



Hi friends! My very first post here (beside the introduction that I wrote in General Discussion).

Back in the late 80s and early 90s I did some C64 coding for pleasure and it was cool to understand what was really happening under the hood of a computer. I managed to learn little ML, and learnt to write (by looking at other people's code) some VIC effects, some scrolling, removing some (easy) protections... just for fun, no money!

In the last few months a desire come out: to switch on my old C= gear and have some fun with it. In the meantime that I can find some room for that gear, I started to have a look at X86 tools and spent some time in trying to find a good suite; I've decided that CBM Prog Studio seems the right one for me.

I am trying to get acquainted to basic ML development and I'm in trouble with Zaks 8-bit restoring division... I just can't get it working. In the book (http://retro.hansotten.nl/uploads/books/Programming_the_6502.pdf) there is only a flowchart, but not the code (fig. 3-19 @ PDF page 92 / book page 86). I tried to write the code on my own, but... it just doesn't work :-(

So I Googled and found a different way to make a restoring division, that's here https://www.ques10.com/p/11096/draw-the-flow-chart-for-restore-division-algorit-1/ and also https://www.youtube.com/watch?v=PzV6gYpVLuc and it works well:

Code:
          LDA #$00            ; A = will contain Remainder
          LDX #$08            ; How many loops? 8 (8 bits)
@Loop
          ASL Q               ; Q = Dividend --> will be Quotient at end of routine
          ROL                 ; Rotate Left A
          SEC                 ; C = 1
          SBC M               ; A = A - M (M = Divisor)
          BCS @NxtPls        ; If C is still set, no borrow occurred and move on in INC'g Q; if C is cleared, borrow occurred, so we need to restore A
          CLC                 ;
          ADC M               ; Restore A
          JMP @Skip                     
@NxtPls   
          INC Q               ; Q = Q + 1
@Skip
          DEX                 ;
          BNE @Loop           ;


If I try to translate Zaks' flowchart to code, I believe it should look like this:

Code:
   LDA #$00
   sta $Q
   LDX #$08    ; 8 bits = 8 loops
Loop
   ASL DVD  ; Dividend
   ROL Q    ; Quotient
   SEC
   SBC DVS  ; Divisor
   BCS NxtPls
   CLC
   ADC M; restore A
   JMP Skip
NxtPls
   Q=Q+1
Skip
   DEX
   BNE Loop


I understand that non-restoring will be easier and faster, but I'm trying to build my own knowledge and follow the books chapter after chapter; I got stuck on the division and I am kindly asking help to understand what's wrong with my implementation of Zaks' 8-bit restoring division.

Thanks,
Andrea


Top
 Profile  
Reply with quote  
PostPosted: Mon Aug 10, 2020 4:41 pm 
Offline

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 325
The book doesn't explain it very clearly.

The algorithm uses three values. Two of them are combined into one double-width value, which starts with 0 in the top half and the dividend in the bottom half. The third value is the divisor, and never changes. When the algorithm is finished, the quotient is in the bottom half of the double-width value (the one that started out as the dividend), and the remainder is in the top half (which started as 0).

It's confusing, as the book talks about subtracting the divisor from the dividend, but you actually subtract from the value that the dividend is being shifted into.

An example should help. Here I am dividing 15 (00001111) by 3 (00000011).
Code:
00000000:00001111    (initial value 0:dividend)
00000000:00011110
00000000:00111100
00000000:01111000
00000000:11110000
00000001:11100000
00000011:11000000
subtract divisor, increment
00000000:11000001
00000001:10000010
00000011:00000100
subtract divisor, increment
00000000:00000101    (result: remainder:quotient)


Your code is almost right. You have DVD and Q initialised correctly (although Q will become the remainder, not the quotient). But you're then subtracting the divisor from A, not Q. And you're incrementing Q, not DVD.

You could fix it by adding instructions to load A from Q and store it back:
Code:
Loop
    ASL DVD
    ROL Q
    LDA Q
    SEC
    SBC DVS
    BCS NxtPls
    CLC
    ADC DVS
    JMP Skip
NxtPls
    INC DVD
Skip
    STA Q

The quotient will then end up in DVD and the remainder in Q (warning: I haven't tested this). But it's easier to keep Q in A, so you don't have to shuffle it back and forth to memory all the time.

You probably meant Q to be the quotient, but with this algorithm the quotient ends up in the same place as the dividend input. You could rename Q to R for remainder. But with it being stored in A, it never needs to get a name at all.


Top
 Profile  
Reply with quote  
PostPosted: Mon Aug 10, 2020 9:20 pm 
Offline
User avatar

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
The CLC before the ADC is not needed as the carry must already be clear or the BCS would have been executed.

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


Top
 Profile  
Reply with quote  
PostPosted: Tue Aug 18, 2020 5:29 pm 
Offline
User avatar

Joined: Tue Aug 04, 2020 10:35 pm
Posts: 8
Location: Verona, Italy
Thank you friends, now I understand. IM(very)HO the explanation in the book is very hard to understand, at a point that it seems wrong if one is not very knowledgeable with the topic.

BitWise wrote:
The CLC before the ADC is not needed as the carry must already be clear or the BCS would have been executed.
That's true! Thank you for pointing out!

John West wrote:
Your code is almost right. You have DVD and Q initialised correctly (although Q will become the remainder, not the quotient). But you're then subtracting the divisor from A, not Q. And you're incrementing Q, not DVD.

You could fix it by adding instructions to load A from Q and store it back:
Code:
Loop
    ASL DVD
    ROL Q
    LDA Q
    SEC
    SBC DVS
    BCS NxtPls
    CLC
    ADC DVS
    JMP Skip
NxtPls
    INC DVD
Skip
    STA Q



Thak you very much for the explanation! I believe the code could be optimized by using:
Code:
    ...
    ASL DVD
    LDA Q
    ROL
    SEC
    SBC DVS
...

rather than:

Code:
...
    ASL DVD
    ROL Q
    LDA Q
    SEC
    SBC DVS
...


Right? I've implemented it and it seems to work well :) - this is the code with more meaningful labels:

Code:
          LDA #$00            ; will be my remainder
          STA RMD             
          LDX #$08            ; how many loops
@Loop
          ASL DVD/Q           ; Shift Left DVD/Q (I named the Dividend "DVD/Q")
          LDA RMD             ;
          ROL                 ; Rotate Left A (contains RMD)
          SEC
          SBC DVS             ; MATH RMD = RMD - DVS
          BCS @NxtPls         ; MATH If C is still 1, no borrow occurred... move on to increase quotient ***
          ADC DVS             ; otherwise if C = 0, I've "used" it so I need to restore A
          JMP @Skip           
@NxtPls   
          INC DVD/Q           ; ***
@Skip
          STA RMD             ; save RMD for next loop
          DEX                 
          BNE @Loop       


Now I see the code from an "understanding it point of view" and I think that some optimization can be made:

Code:
          LDA #$00            ; will be my remainder
;          STA RMD             ; don't need this right now
          LDX #$08            ; how many loops
@Loop
          ASL DVD/Q           ; Shift Left DVD/Q (I named the Dividend "DVD/Q")
;          LDA RMD             ; RMD is already in A
          ROL                 ; Rotate Left A (contains RMD)
          SEC
          SBC DVS             ; MATH RMD = RMD - DVS
          BCS @NxtPls         ; MATH If C is still 1, no borrow occurred... move on to increase quotient ***
          ADC DVS             ; otherwise if C = 0, I've "used" it so I need to restore A
          JMP @Skip           
@NxtPls   
          INC DVD/Q           ; ***
@Skip
;          STA RMD             ; no need to save RMD for next loop...
          DEX                 
          BNE @Loop       
          STA RMD             ; at the end, I save A to RMD


Quote:
... so you don't have to shuffle it back and forth to memory all the time...


... and finally I came on my own to use A and save lot of cycles, just as per your suggestion.

Thank you both once again, now it's much clearer. I am very grateful to you. Time to move on to other exercises in Zaks' book!!!

Andrea


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 22 posts ]  Go to page Previous  1, 2

All times are UTC


Who is online

Users browsing this forum: No registered users and 14 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: