Re: Generic for loop
Posted: Thu Feb 13, 2020 2:18 am
by cjs
x is never changed so why are you doing sta(scrLow,x)?...Would this not work more efficiently?
Code: Select all
sta (scrLow) ; put it on the screen
That addressing mode doesn't exist on the 6502, at least not in the NMOS version that the 2600 uses. (The 65C02 did add a zero-page indirect mode without an index.)
I find this reference useful for quickly looking up the addressing modes available for each instruction. (And what flags are set by each instruction, too.)
Re: Generic for loop
Posted: Thu Feb 13, 2020 2:19 am
by Chromatix
x is never changed so why are you doing sta(scrLow,x)?
I imagine it's because the NMOS 6502 core - which is what you'd get in a 2600 - doesn't have a non-indexed indirect addressing mode. You either get pre-indexed by X, or post-indexed by Y.
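As a minimal sketch of the two indirect modes mentioned above (assuming scrLow and scrHigh are adjacent zero-page bytes with the low byte first), a plain indirect store can be emulated by zeroing whichever index register is used:

```asm
; Sketch only: two ways to store through a zero-page pointer on an NMOS 6502.
; scrLow/scrHigh are assumed to be adjacent zero-page bytes (low byte first).
        ldx #0
        sta (scrLow,x)  ; pre-indexed: X is added to the pointer's location
        ldy #0
        sta (scrLow),y  ; post-indexed: Y is added to the pointer's value
```

Both forms of STA take 6 cycles. The post-indexed form is usually the more useful one in a fill loop, because Y can walk through a page while the pointer stays put.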
The approach by sark02 looks sound in principle, taking advantage of the index registers and the post-indexed addressing mode. I don't have time today to fettle the setup code around it, but the core is useful. Its one limitation is that it doesn't easily support a final inner loop of less than 256 bytes; you have to put the remainder portion in the first iteration. This may not matter in the end, though.
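One possible way to put the remainder in the first iteration, as an untested sketch. The register assignments here are assumptions for illustration, not something from the posts above: base address in $20/$21, full-page count in X (0..128), remainder R in Y (1..255), fill value in A. The idea is to back the pointer up by 256-R and start Y at 256-R, so the first pass covers only R bytes.

```asm
; Untested sketch. Assumed inputs: base address in $20/$21, full pages in X,
; remainder R (1..255) in Y, fill value in A. R = 0 needs separate handling.
setup:
        pha             ; save the fill value
        tya             ; A = R
        clc
        adc $20         ; pointer low = (base low + R) mod 256
        sta $20
        bcs ready       ; carry set: base+R-256 keeps the same high byte
        dec $21         ; no carry: base+R-256 is one page lower
ready:
        tya
        eor #$FF
        tay
        iny             ; Y = 256 - R
        pla             ; restore the fill value
L1:
        sta ($20),y     ; first pass writes base .. base+R-1
        iny
        bne L1
        inc $21         ; pointer now addresses the next full page
        dex
        bpl L1
```

Note that the pointer in $20/$21 is not preserved on exit.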
Re: Generic for loop
Posted: Thu Feb 13, 2020 2:56 am
by BillO
Okay, did some study on the responses, and thank you both for taking the time. I did leave the parentheses in and may have conflated things a touch ...
So why not use the zero page addressing mode: sta scrLow? scrLow is only a byte value, so it will still refer to page 0. There should be little change, except for saving a few cycles by using the zero page mode.
like:
Code: Select all
fill: lda #3 ; set the color
sta scrLow ; put it on the screen
inc scrLow ; increment low byte
bne fill ; loop if not zero
inc scrHigh ; increment high byte
lda scrHigh ; load high byte
cmp $03 ; compare high byte
bne fill ; != so increment low byte
lda scrLow ; load low byte
cmp $02 ; compare low byte
bne fill ; != so increment low byte
end: brk
Still trying to wrap my head around this, so please be patient...
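For comparison, a short sketch of how these operand forms differ. It assumes scrLow/scrHigh are adjacent zero-page bytes that currently hold the pointer $0200; that value is only an example.

```asm
; Sketch only. Assumes scrLow/scrHigh hold the pointer $0200 (low byte first).
        lda #3
        ldy #0
        sta (scrLow),y  ; indirect,Y: writes 3 to $0200, pointer unchanged
        sta scrLow      ; zero page: writes 3 into scrLow itself,
                        ; so the pointer now reads $0203
        cmp #$03        ; immediate: compares A with the constant 3
        cmp $03         ; zero page: compares A with the byte at address $0003
```

So the zero page form is shorter and faster, but it stores into the pointer variable rather than through it. Likewise, a compare against a constant needs the # prefix.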
Re: Generic for loop
Posted: Thu Feb 13, 2020 4:04 am
by sark02
Code: Select all
L1:
sta ($20),y
iny
bne L1
inc $21
dex
bpl L1
A quick addendum to my code:
Whereas sta (ind),y is 6 cycles, sta abs,y is only 5, so if the loop can be in RAM rather than ROM, it can be modified to:
Code: Select all
L1:
sta $1234,y ; $1234 patched at runtime
iny
bne L1
inc L1+2 ; increment upper byte of STA address
dex
bpl L1
This uses (L1+2,L1+1) as the write address, as opposed to ($21,$20). Saves 1 cycle/iteration.
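A sketch of the runtime patch, assuming the loop at L1 sits in RAM, the destination address is in $20/$21, and X holds the page count as before. The fill value is made up for illustration.

```asm
; Sketch only: copy the destination address into the operand of the STA at L1.
        lda $20
        sta L1+1        ; operand low byte
        lda $21
        sta L1+2        ; operand high byte
        lda #3          ; fill value (assumed for illustration)
        ldy #0
L1:
        sta $1234,y     ; $1234 is a placeholder, overwritten above
        iny
        bne L1
        inc L1+2        ; next page: bump the operand's high byte
        dex
        bpl L1
```

Per byte, the inner loop is STA (5) + INY (2) + BNE taken (3) = 10 cycles instead of 11, provided the branch does not cross a page boundary.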
Re: Generic for loop
Posted: Thu Feb 13, 2020 4:17 am
by barrym95838
I don't know if it's just the kooky way my brain works, but I think you can save some code space but get 'er done just as quickly if your application can tolerate filling from high to low (untested):
Code: Select all
; Fill X*256+Y bytes of memory with A, based
; at ($21,$20) ... don't try to exceed 33023 bytes!
; exit: Y=0, X=$FF, A and base pointer preserved
fill:
pha
txa
clc
adc $21
sta $21
pla
cpy #0
beq fill3
fill2:
dey
sta ($20),y
bne fill2
fill3:
dec $21
dex
bpl fill2
inc $21
rts
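A usage sketch for the routine above, with made-up values: filling 600 bytes starting at $0400 with the value 3. Since 600 = 2*256 + 88, X gets 2 and Y gets 88.

```asm
; Usage sketch (hypothetical values): fill 600 bytes at $0400 with 3.
        lda #$00
        sta $20         ; base pointer low byte
        lda #$04
        sta $21         ; base pointer high byte
        ldx #2          ; 2 full pages = 512 bytes
        ldy #88         ; plus 88 bytes: 512 + 88 = 600
        lda #3          ; fill value
        jsr fill        ; fills $0400..$0657, highest addresses first
```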
Re: Generic for loop
Posted: Thu Feb 13, 2020 5:12 am
by cjs
I don't know if it's just the kooky way my brain works, but I think you can save some code space but get 'er done just as quickly if your application can tolerate filling from high to low....
Your brain works like a 6502! I guess that qualifies as "kooky...."
And yes, I've noticed this too. My bigint representation was initially little-endian, following the convention of the CPU. But I changed it to big-endian, going opposite the convention, specifically because on the 6502 you can save a CMP and perhaps more by counting down to zero rather than counting up to your target. A related technique is to use 1-based indexes for arrays rather than 0-based: the loop control is just DEY followed by BEQ done.
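A minimal sketch of the 1-based, count-down idea, here with the test at the bottom of the loop (BNE back to the top instead of BEQ out of it). The names buf and N and their values are hypothetical.

```asm
; Sketch: clear N bytes using a 1-based index that counts down.
; buf and N are hypothetical; N must be in 1..255.
N   = 16
buf = $0300
        lda #0
        ldy #N
loop:
        sta buf-1,y     ; index Y = 1 addresses buf itself
        dey
        bne loop        ; DEY sets Z at zero, so no CPY is needed
```

Basing the access at buf-1 is what lets the index run from N down to 1, so the loop ends exactly when Y hits zero.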
Re: Generic for loop
Posted: Thu Feb 13, 2020 6:22 am
by sark02
I don't know if it's just the kooky way my brain works, but I think you can save some code space but get 'er done just as quickly if your application can tolerate filling from high to low [...]
Nice one.