Wouldn't it be possible to push the few zero page locations onto the stack instead?
A more intriguing solution is to save the data in the SWEET-16 register area and, as an added bonus, use SWEET-16 to do the 16-bit heavy lifting. After all, the author took on these limitations by choosing the Apple II as the platform.
Alternatively one could look closer at the inc16 subroutine https://github.com/deater/ll_asm/blob/m ... 502.s#L656
The INX seems rather spurious. The calls to inc16 are also suspect, as the pattern repeated at the call sites is wider than what inc16 itself does.
Next, using inc16 rather than incrementing Y seems odd. Also, the numerous LDY #0 instructions without intervening changes to Y suggest the 6502 is not a familiar processor to the author.
There is certainly plenty of low-hanging fruit here.
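As a rough illustration of the first suggestion, here is a minimal sketch of saving two zero page bytes on the stack and restoring them afterwards. ZP_PTR is a hypothetical label and its address is an assumption; neither is taken from the ll_asm source.

```asm
ZP_PTR  = $FA           ; hypothetical zero page pointer (address is an assumption)

        ; save: 3 bytes of code per saved location
        lda ZP_PTR      ; low byte
        pha
        lda ZP_PTR+1    ; high byte
        pha

        ; ... code that is free to clobber ZP_PTR and ZP_PTR+1 ...

        ; restore: pull in reverse order, high byte first
        pla
        sta ZP_PTR+1
        pla
        sta ZP_PTR
```

Save plus restore comes to 6 bytes of code per location, so this only pays off when just a few locations need preserving. Both halves have to stay inline in the same routine, since a JSR/RTS in between would disturb the stack order.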
CPUs code density comparison
Re: CPUs code density comparison
Good spot: even if needed, that inc16 could usefully be a macro. But it should lose the INX.
Re: CPUs code density comparison
True, it could be a macro but then again the code density would drop like a brick. In this case the name of the game is code density and everything else can be sacrificed. So to expand on my earlier comment, the typical use case is like this:
Code:
lda (POINTER),Y ; load byte
ldx #POINTER ; 16-bit increment
jsr inc16
This in a high level language is just A = *P++
Baking it all into a subroutine would be something like
Code:
ldx #POINTER
jsr loadinc16
wherein loadinc16 would be
Code:
loadinc16:
lda 0,x
sta P
lda 1,x
sta P+1
lda (P),y
inc 0,X ; increment address
bne no_carry
inc 1,X ; handle overflow
no_carry:
rts
In the critical part the pointer is LOGOL so that could be hardwired too.
Re: CPUs code density comparison
Oops, yes, density, not performance...
- barrym95838
- Posts: 2056
- Joined: 30 Jun 2013
- Location: Sacramento, CA, USA
Re: CPUs code density comparison
Alienthe wrote:
... This in a high level language is just A = *P++
Baking it all into a subroutine would be something like
Code:
ldx #POINTER
jsr loadinc16
wherein loadinc16 would be
Code:
loadinc16:
lda 0,x
sta P
lda 1,x
sta P+1
lda (P),y
inc 0,X ; increment address
bne no_carry
inc 1,X ; handle overflow
no_carry:
rts
In the critical part the pointer is LOGOL so that could be hardwired too.
Code:
load_inc_out:
ldx #OUTPUTL
load_inc16:
lda (0,x) ; look mom, (dp,x) !!
inc16:
inc 0,x
bne no_carry
inc 1,x
no_carry:
rts
Mike B.
[p.s. I don't have any issues with the DOS subroutines (they were already well-written), but I'm making significant gains everywhere else, and should be able to give an estimate soon, time allowing. I'm slightly embarrassed to admit that I haven't yet determined why the DOS subroutines are even included in the source.]
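To put rough numbers on the density argument, the sketch below annotates the call sites and the shared routine with byte counts, assuming every operand is a zero page address. POINTER and OUTPUTL are the labels used in the posts above; the addresses given to them here are placeholders, not values from the ll_asm source.

```asm
POINTER = $FA               ; placeholder zero page address (assumption)
OUTPUTL = $FC               ; placeholder zero page address (assumption)

; Original pattern: 7 bytes per use, assumes Y = 0 on entry
        lda (POINTER),y     ; 2 bytes
        ldx #POINTER        ; 2 bytes
        jsr inc16           ; 3 bytes

; Generic pointer via (dp,x): 5 bytes per use, Y not needed
        ldx #POINTER        ; 2 bytes
        jsr load_inc16      ; 3 bytes

; Hardwired pointer: 3 bytes per use
        jsr load_inc_out    ; 3 bytes
        rts                 ; end of the example call sites

; Shared routine with three entry points: 11 bytes in total
load_inc_out:
        ldx #OUTPUTL        ; 2 bytes
load_inc16:
        lda (0,x)           ; 2 bytes, A = byte at the pointer held in X, X+1
inc16:
        inc 0,x             ; 2 bytes, bump low byte
        bne no_carry        ; 2 bytes
        inc 1,x             ; 2 bytes, bump high byte on wrap
no_carry:
        rts                 ; 1 byte
```

By this count the extra load_inc_out entry point costs 2 bytes and saves 2 bytes at each hardwired call, so it breaks even at one call and gains from the second onward.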
Re: CPUs code density comparison
barrym95838 wrote:
load_inc16:
lda (0,x) ; look mom, (dp,x) !!
Quote:
This also frees up the Y register for other useful things. I thought about using (dp),y but it just doesn't seem to fit into the decompression algorithm very well, except in the degenerate case of Y always being zero, and (as BigEd might rightly say) that isn't idiomatic for most well-written 6502 code.
Mike B.