 Post subject: Arithmetic shifts
PostPosted: Thu Apr 20, 2006 9:28 pm 

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
I'm working on some software (R&D for the Kestrel, basically), and realized a serious shortcoming of the 65xx instruction set -- no arithmetic shifts.

Yeah, you have the "arithmetic" shift left instruction, but let's face it, there is no difference between an arithmetic and logical shift left. The only time "arithmetic" vs. "logical" comes into play is when shifting to the right.

The difference is as follows: arithmetic shifts preserve the sign bit (so that -4/2 = -2, etc), while logical shifts don't.
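To make the distinction concrete, here is a small Python model of the two right shifts on an 8-bit value. The helper names (`lsr8`, `asr8`, `to_signed`) are made up for this sketch and are not part of any 6502 toolchain.

```python
def lsr8(a):
    """Logical shift right: bit 7 is filled with 0."""
    return (a & 0xFF) >> 1

def asr8(a):
    """Arithmetic shift right: the old sign bit is kept in bit 7."""
    a &= 0xFF
    return (a >> 1) | (a & 0x80)

def to_signed(a):
    """Interpret an 8-bit pattern as a two's-complement number."""
    return a - 0x100 if a & 0x80 else a

minus4 = 0xFC                     # -4 as an 8-bit pattern
print(to_signed(asr8(minus4)))    # -2
print(to_signed(lsr8(minus4)))    # 126
```

Note that the arithmetic shift rounds toward minus infinity, so -1 shifted right stays -1.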

Right now, the best code my brain can come up with to implement an arithmetic shift is as follows:

Code:
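LSR A     ; This does the shift (old bit 7 is now in bit 6)
STA temp
AND #$40  ; And now we restore the sign bit.
ASL A     ; Move the saved sign back up to bit 7
ORA temp  ; Merge it into the shifted value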


Does anyone have anything better? Any tricks for handling multi-bit arithmetic shifts?

Thanks.


PostPosted: Thu Apr 20, 2006 10:56 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California
6502 Forth does it this way for a two-byte cell on the ZP data stack:
Code:
CODE 2/
    LDA  1,X
    ASL  A
    ROR  1,X
    ROR  0,X
    JMP  NEXT
END-CODE
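As a rough sketch of what this primitive does, here is a Python model. It assumes, as the order of the two RORs suggests, that 1,X holds the high byte of the cell and 0,X the low byte; the function name `two_slash` is invented for illustration. The ASL A is there only to copy bit 7 of the high byte into carry; A itself is then discarded.

```python
def two_slash(lo, hi):
    """Model of LDA 1,X / ASL A / ROR 1,X / ROR 0,X on a 16-bit cell."""
    carry = (hi >> 7) & 1               # ASL A: bit 7 of the high byte -> carry
    new_hi = (carry << 7) | (hi >> 1)   # ROR 1,X: carry -> bit 7
    carry = hi & 1                      # ...and old bit 0 -> carry
    new_lo = (carry << 7) | (lo >> 1)   # ROR 0,X: carry -> bit 7 of the low byte
    return new_lo, new_hi

lo, hi = two_slash(0xFC, 0xFF)          # $FFFC is -4
print(f"${hi:02X}{lo:02X}")             # $FFFE, which is -2
```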


PostPosted: Thu Apr 20, 2006 11:20 pm 

Joined: Wed Oct 22, 2003 4:07 am
Posts: 51
Location: Norway
Arithmetic shift right:
Code:
cmp #$80
ror a


PostPosted: Fri Apr 21, 2006 4:23 am 

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Thowllly wrote:
Arithmetic shift right:
Code:
cmp #$80
ror a


HEY, that's pretty slick! I don't know why I didn't think of that. Let me see if I have the theory of operation right.

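CMP #$80 will compute A-$80 without storing the result. If A>=$80 (that is, bit 7 is set), carry flag will be set. Otherwise, it'll be clear. Basically, normal rules for SEC;SBC. ROR A then rotates that carry into bit 7, so the sign is preserved. I like it!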
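A quick way to convince yourself is to model the two instructions and try every value of A. This is only a sketch in Python; `cmp_imm`, `ror` and `asr_cmp_ror` are hypothetical helper names, and only the carry flag is modelled.

```python
def cmp_imm(a, operand):
    """CMP #operand: carry is set when A >= operand (unsigned). A is unchanged."""
    return 1 if a >= operand else 0

def ror(a, carry):
    """ROR A: carry enters bit 7, old bit 0 leaves into carry."""
    return ((carry << 7) | (a >> 1)) & 0xFF, a & 1

def asr_cmp_ror(a):
    """CMP #$80 followed by ROR A."""
    carry = cmp_imm(a, 0x80)
    result, _ = ror(a, carry)
    return result

for a in (0x04, 0xFC, 0x80, 0x7F):
    print(f"${a:02X} -> ${asr_cmp_ror(a):02X}")
```

For all 256 inputs the result equals the input shifted right by one with bit 7 copied from the original sign.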

Garth, you may want to update your Forth implementation. It looks like this approach will be much faster -- at least for single bit shifts, and especially for the 65816. :)


PostPosted: Fri Apr 21, 2006 5:29 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California
> Garth, you may want to update your Forth implementation. It
> looks like this approach will be much faster -- at least for single
> bit shifts, and especially for the 65816.

I like it too, and I'm sure I'll find uses for it when the number is in A to start with. Thank you, Thowllly.

Its benefit does not extend to the case where the number is not already in A, however. Re-writing the 2/ primitive this way, it becomes:
Code:
CODE 2/
    LDA  1,X
    CMP  #$80
    ROR  1,X
    ROR  0,X
    JMP  NEXT
END-CODE

which takes the same number of clocks but one more byte. For the '816, you get:
Code:
CODE 2/
    LDA  0,X
    CMP  #$8000
    ROR  0,X
    JMP  NEXT
END-CODE

which takes one more clock and two more bytes than
Code:
CODE 2/
    LDA  0,X
    ASL  A
    ROR  0,X
    JMP  NEXT
END-CODE


PostPosted: Fri Apr 21, 2006 8:27 am 

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
GARTHWILSON wrote:
Code:
CODE 2/
    LDA  0,X
    CMP  #$8000
    ROR  0,X
    JMP  NEXT
END-CODE


If you cache the top of stack in A itself, you eliminate most of the above overhead.


PostPosted: Fri Apr 21, 2006 8:33 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California
> If you cache the top of stack in A itself, you eliminate most of
> the above overhead.

I might try that someday, but I guess it will require STC (subroutine-threaded code). In ITC (indirect-threaded code), NEXT needs A.


PostPosted: Sun Apr 23, 2006 11:02 am 

Joined: Wed Oct 22, 2003 4:07 am
Posts: 51
Location: Norway
GARTHWILSON wrote:
Its benefit does not extend to the case where the number is not already in A, however.
That is true. However, if A is already in use for something else, you can use LDY/CPY:
Code:
    LDY  1,X
    CPY  #$80
    ROR  1,X
    ROR  0,X
instead of saving/restoring A around the ASL version:
Code:
    TAY
    LDA  1,X
    ASL  A
    ROR  1,X
    ROR  0,X
    TYA
and then it's a win in both size and speed. But I guess that scenario won't come up very often.


 Post subject: Re: Arithmetic shifts
PostPosted: Sun Apr 23, 2006 1:16 pm 

Joined: Fri Aug 30, 2002 2:05 pm
Posts: 347
Location: UK
Quote:
Does anyone have anything better?

Possibly..
Code:
   CLC
   STA   temp
   ADC   temp

In this case the V flag is valid as it should be for a proper arithmetic shift.
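The sequence adds A to itself, so it doubles the value, and because it goes through ADC the V flag reports signed overflow (a plain ASL A leaves V alone). Here is a hedged Python sketch of that behaviour, assuming binary mode (D flag clear); the names `adc` and `asl_via_adc` are invented for this example.

```python
def adc(a, m, carry_in=0):
    """Binary-mode ADC: returns (result, carry, overflow)."""
    total = a + m + carry_in
    result = total & 0xFF
    carry = 1 if total > 0xFF else 0
    # V is set when both operands share a sign and the result's sign differs.
    overflow = 1 if (~(a ^ m) & (a ^ result) & 0x80) else 0
    return result, carry, overflow

def asl_via_adc(a):
    """CLC / STA temp / ADC temp: add A to itself with carry clear."""
    return adc(a, a, 0)

for a in (0x20, 0x40, 0xC0, 0x80):
    result, carry, overflow = asl_via_adc(a)
    print(f"${a:02X} -> ${result:02X}  C={carry} V={overflow}")
```

V comes out set exactly when twice the signed value no longer fits in -128..127.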

Lee.


 Post subject: Re: Arithmetic shifts
PostPosted: Sun Apr 23, 2006 2:08 pm 

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
leeeeee wrote:
Quote:
Does anyone have anything better?

Possibly..
Code:
   CLC
   STA   temp
   ADC   temp

In this case the V flag is valid as it should be for a proper arithmetic shift.

Lee.


Except you're shifting in the wrong direction. :) Remember, arithmetic shifts don't necessarily touch the V flag as such -- their defining element is the preservation of the *sign* of the number. I AND #$40 only to get the status of bit 6, which *used* to be bit 7, so that I can copy its value and put it back into bit 7.


PostPosted: Sun Apr 23, 2006 2:20 pm 

Joined: Fri Aug 30, 2002 2:05 pm
Posts: 347
Location: UK
Solutions to the right shift problem had been provided but no one had bothered with the left shift case. Just thought you'd like both.

Lee.


 Post subject: Re: Arithmetic shifts
PostPosted: Sun Apr 23, 2006 3:00 pm 

Joined: Fri Aug 30, 2002 2:05 pm
Posts: 347
Location: UK
Quote:
Any tricks for handling multi-bit arithmetic shifts?

Code:
   LSR            ; shift high ..
   LSR            ; .. nibble ..
   LSR            ; .. to low ..
   LSR            ; .. nibble
   CLC            ; clear carry for add
   ADC   #$F8      ; set top bits clear if -ve, set if +ve
   EOR   #$F8      ; toggle bits to the correct state   

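Just change the #$F8 depending on how many bits you shifted: the constant needs its top n+1 bits set for an n-bit shift (for example #$C0 for one bit, #$E0 for two).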
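The add-then-toggle step is a sign extension from wherever the old bit 7 ended up. A Python sketch of the general case follows, with the constant derived from the shift count; `asr_n` and `mask_for` are hypothetical names, and the model assumes binary mode and discards the carry out of the ADC.

```python
def mask_for(n):
    """Constant for ADC/EOR after n LSRs (1 <= n <= 7): top n+1 bits set."""
    return 0xFF & ~((1 << (7 - n)) - 1)

def asr_n(a, n):
    """n x LSR, then CLC / ADC #mask / EOR #mask."""
    mask = mask_for(n)
    shifted = (a & 0xFF) >> n           # the LSRs
    summed = (shifted + mask) & 0xFF    # CLC / ADC #mask
    return summed ^ mask                # EOR #mask

for n in range(1, 8):
    print(f"{n} shift(s): #${mask_for(n):02X}")
```

If the old sign bit was set, the ADC carries out of the top and leaves the high bits clear, and the EOR sets them; if it was clear, the ADC sets the high bits and the EOR clears them again.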

Lee.


PostPosted: Mon Apr 24, 2006 4:37 am 

Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
leeeeee wrote:
Solutions to the right shift problem had been provided but no one had bothered with the left shift case. Just thought you'd like both.

Lee.


Ahh, that wasn't made clear, so I was thinking of the arithmetic shift right case. Sorry.


PostPosted: Tue Apr 25, 2006 8:52 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8546
Location: Southern California
Thowllly wrote:
GARTHWILSON wrote:
Its benefit does not extend to the case where the number is not already in A, however.
That is true. However, if A is already in use for something else, you can use LDY/CPY:
Code:
    LDY  1,X
    CPY  #$80
    ROR  1,X
    ROR  0,X
instead of saving/restoring A
Code:
    TAY
    LDA  1,X
    ASL  A
    ROR  1,X
    ROR  0,X
    TYA
and then it's a win in both size and speed. But I guess that scenario won't come up very often.


It's a little bit off-topic, but I guess I'll finally say it anyway. Any Forth primitive can use A any way you want and with no obligation to save it (like by using the TAY...TYA above), but the primitive's final value in A will get overwritten by NEXT before the next primitive in indirect-threaded Forth (ITC). For this reason, the top data stack cell cannot be kept in A as kc5tja was saying, since it would be lost between primitives. NEXT could be made to save it, but that would slow things down even more since NEXT gets used over 12,000 times per second per MHz on a 6502. Going to subroutine-threaded code (STC) Forth would eliminate NEXT, and then A would be untouched between primitives since the maximum overhead between them is RTS, JSR (to end one primitive and call the next one). Then, with the 65816, the 16-bit top data stack cell could just be kept in A with no worries of it getting stepped on between primitives.

