Using S as extra index register
Posted: Tue Dec 30, 2003 11:10 am
by blargg
I recently wrote a software-based serial loader and had some fairly tight time constraints in the loader loop, with only a few clock cycles to spare before I would have had to resort to interleaving operations. I had the idea of using the stack pointer as a memory buffer pointer with auto-decrement, which frees up an index register and takes 3-4 fewer clock cycles per byte. Obviously the context for use is fairly limited.
Code:
tsx ; save current stack
stx stack_save
; s becomes the index pointer
; ... ; get data into a
pha ; equivalent to fictitious instructions:
; sta $100,s
; des
; ... ; loop
tsx
; ... ; do something with data at $101,x
ldx stack_save ; restore stack
txs
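A rough sketch of the whole loop, for what it's worth. The names input_port and count, and the zero-page location of stack_save, are made up for illustration, and it assumes S starts out at least count bytes above $00 so it doesn't wrap:

```asm
stack_save = $00        ; hypothetical zero-page temporary
input_port = $C000      ; hypothetical memory-mapped data source
count      = 16         ; hypothetical number of bytes to store

        tsx             ; save current stack pointer
        stx stack_save
        ldy #count
loop:   lda input_port  ; get data into a (inline; a jsr here would push onto the buffer)
        pha             ; store at $0100+s, then decrement s: 3 cycles
                        ; (sta abs,x + dex would be 5 + 2 = 7 cycles)
        dey
        bne loop
        tsx             ; x = s; the last byte stored is at $0101,x
        lda $0101+count-1,x ; the first byte stored
        ldx stack_save  ; restore stack
        txs
```

Once S is restored, the data sits in the free area below the stack pointer, so anything that pushes (a jsr, an interrupt) will overwrite it. Use or copy the data before the final txs.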
6502 tips with section about using stack as buffer
Posted: Wed Dec 31, 2003 11:03 am
by blargg
I just wrote a short collection of 6502 techniques that come to mind at the moment, and expanded on using S as a buffer index.
6502 tips
If I don't add much to this in a week or so, it could be added to the site. I was going to output it to HTML but I didn't see much benefit.
Re: 6502 tips with section about using stack as buffer
Posted: Fri Jan 02, 2004 4:17 am
by Mike Naberezny
If I don't add much to this in a week or so, it could be added to the site.
Nice, thanks.
6502.org is currently being revamped and will have a fresh look and new features in a few weeks or so. I'll save your tips and add them to the others during this major update. I have approximately fifty additional tips that are also going to be added to the site at that time.
Best Regards,
Mike
Posted: Fri Jan 02, 2004 6:17 am
by GARTHWILSON
I printed out blargg's tips to file them. I especially like the S-register ideas, but some of the others should carry a note saying that they make up for deficiencies in the NMOS 6502, which only a few here on the forum have any reason to be using in place of the CMOS. The extra work-arounds are not needed for the CMOS 65C02 since it has more instructions. BIT# and BRA are a couple that come to mind from these tips.
A single byte can be skipped by overlapping instructions: ...
Code:
sub1 sei
db $C9 ; cmp #immediate
sub2 cli
...
Steve Wozniak did this kind of thing in the Apple II system too IIRC, using a BIT or CMP absolute, in order to have various choices of what to load into the accumulator at the beginning of a routine, based on the exact entry point:
Code:
label1: LDA #$10
.DB $CD
label2: LDA #$17
.DB $CD
label3: LDA #$2F
.DB
...etc.
This way you can enter at the right LDA#, and each following LDA# is swallowed as the two-byte operand of a CMP absolute (CMP xxxx), which takes only one extra byte per entry point and does nothing to the accumulator. Just make sure it's OK to read the extra addresses, which in this case would be $17A9 and $2FA9. Since reading the status register of an I/O IC can change the status, reading it from outside an I/O routine could cause problems in some cases. This kind of thing caused some debugging nightmares with the NMOS 6502's extra read of invalid addresses in certain operations. (This too was corrected in the CMOS version.) It would be pretty unlikely, however, that you would have the status register of an I/O IC at an address ending in A9.
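To make the decoding concrete, here is a sketch with the assembled bytes written out. The common tail (STA result / RTS) and the location of result are hypothetical, added only so the fragment is complete:

```asm
result  = $10           ; hypothetical zero-page destination

label1: LDA #$10        ; A9 10
        .DB $CD         ; CD     opcode of CMP absolute; next two bytes become its operand
label2: LDA #$17        ; A9 17
        .DB $CD         ; CD
label3: LDA #$2F        ; A9 2F
        STA result      ; hypothetical common tail
        RTS

; Entered at label1, the CPU sees:
;       LDA #$10
;       CMP $17A9       ; CD A9 17  (swallows LDA #$17, reads $17A9)
;       CMP $2FA9       ; CD A9 2F  (swallows LDA #$2F, reads $2FA9)
;       STA result      ; A is still $10
;
; Entered at label2, the CPU sees:
;       LDA #$17
;       CMP $2FA9
;       STA result      ; A is still $17
```

The accumulator survives, but each skipped CMP costs 4 cycles and changes N, Z and C, so the common code can't rely on the flags left by the LDA.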
Re: 6502 tips with section about using stack as buffer
Posted: Fri Jan 02, 2004 12:12 pm
by Sprow
Hi,
The '6502_tips.txt' looks good, and is well worth making public via 6502.org.
However, the "Combining shift register and counter" tip seems to have an excess CLC in there, which seems unnecessary since the LSR A two instructions later corrupts C anyway.
I'm sure with the "NMOS branch always" tip that you know that what's written takes the same number of bytes as JMP and yet 2 more cycles!
A further alternative would be
LDX #topbitsetnumber
BMI dest
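Spelled out as a sketch, with $80 standing in as a hypothetical value for topbitsetnumber and NOPs as placeholders for the skipped code:

```asm
        LDX #$80        ; any value with bit 7 set; LDX sets N from bit 7
        BMI dest        ; N = 1, so the branch is always taken
        NOP             ; placeholder: code being skipped
        NOP
dest:   RTS             ; placeholder: branch target

; Size and timing (branch taken, no page crossing):
;       JMP dest             3 bytes, 3 cycles, not relocatable
;       CLC / BCC dest       3 bytes, 2 + 3 = 5 cycles
;       LDX #$80 / BMI dest  4 bytes, 2 + 3 = 5 cycles
```

On its own this is a byte longer than CLC / BCC. Presumably it pays off only when X has to be loaded with such a value anyway, in which case the branch adds just 2 bytes and 3 cycles.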
Sprow.
Re: 6502 tips with section about using stack as buffer
Posted: Fri Jan 02, 2004 6:19 pm
by blargg
However, the "Combining shift register and counter" tip seems to have an excess CLC in there, which seems unnecessary since the LSR A two instructions later corrupts C anyway.
Thanks for catching this. I had copied the code from a software-based serial routine and that was probably left over from setting up the start/stop bit.
I'm sure with the "NMOS branch always" tip that you know that what's written takes the same number of bytes as JMP and yet 2 more cycles!
It'd only be used where relocatability outweighed this cost (and other means of relocation weren't possible).