Warning: thinking out loud follows...
I'm trying to write a compiler backend for the 6502 (again). The 6502 is rather hard to target because it's a mixed memory-register machine. 8-bit arithmetic is fairly straightforward, but for 16-bit and above it's essentially a three-operand machine where operations look like this:
Code:
lda lhs+n
adc rhs+n
sta dest+n
Worst case, when lhs, rhs and dest are all in normal memory (absolute addressing), this ends up at nine bytes per byte of data: so a 16-bit add is 18 bytes. (Ignoring the clc for now.)
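For reference, the worst case spelled out in full (absolute addressing throughout, and with the clc put back) comes to 19 bytes — each absolute-mode instruction is three bytes, plus one for the clc:

Code:
clc
lda lhs+0
adc rhs+0
sta dest+0
lda lhs+1
adc rhs+1
sta dest+1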
However, what's really bad from my perspective is that lhs, rhs and dest can each have several different addressing modes. lhs and rhs can be one of {constant, memory ref, indirect memory ref}; dest can be one of {memory ref, indirect memory ref}; that's 3 × 3 × 2 = 18 different combinations. This means that my rule-based compiler backend needs 18 different rules. The code quality's also not good for chained operations, because I end up with one operation writing its result to a temporary memory location and the next reading it straight back again.
What if I could keep intermediate results in registers instead?
Obviously this would only work for 16-bit arithmetic. I need to reserve Y for indexing, so my 16-bit values go in XA (X holding the high byte, A the low). The best I've come up with is:
Code:
lda lhs+0
ldx lhs+1
adc rhs+0
pha
txa
adc rhs+1
tax
pla
sta dest+0
stx dest+1
While that's actually worse in total (22 vs 18 bytes), it should very quickly win if I want to do chained operations like i+j+k; I just need to repeat the central section, which is only ten bytes worst case. Plus, I now have three discrete operations (load, add, store), each of which comes in only two or three varieties, which fits the compiler much better.
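As a sketch, i+j+k (made-up labels, and with a clc restored before each add — lda/ldx, pha/pla and tax/txa all leave carry alone, so the clc placement works) would come out something like:

Code:
clc
lda i+0      ; load i into XA
ldx i+1
adc j+0      ; central section: XA := XA + j
pha
txa
adc j+1
tax
pla
clc
adc k+0      ; central section again: XA := XA + k
pha
txa
adc k+1
tax
pla
sta dest+0   ; store XA to dest
stx dest+1

That's 34 bytes, against 38 for doing it memory-to-memory through a temporary (two 19-byte adds), and the gap widens with every extra term.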
Does this seem reasonable? Am I missing anything obvious? Clearly I'll have to go the whole hog for 32-bit arithmetic, but I think this could help 16-bit arithmetic a lot. But I don't like using the stack there...