CPUs code density comparison

Alienthe
Posts: 60
Joined: 16 Apr 2012

Re: CPUs code density comparison

Post by Alienthe »

Wouldn't it be possible to push the few zero-page locations onto the stack instead?

A more intriguing solution is to save data into the SWEET-16 register area and, as an added bonus, use SWEET-16 to do the 16-bit heavy lifting. After all, the author took on these limitations by choosing the Apple II as the platform.

Alternatively one could look closer at the inc16 subroutine https://github.com/deater/ll_asm/blob/m ... 502.s#L656
The INX seems rather spurious. Also, the calls to inc16 look suspect, as the calling pattern is wider than what is done in inc16 itself.
Next, the use of inc16 rather than incrementing Y seems odd. Also, the numerous LDY #0 instructions without intervening changes to Y suggest that the 6502 is not a familiar processor to the author.

There is certainly plenty of low-hanging fruit here.
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: CPUs code density comparison

Post by BigEd »

Good spot: even if needed, that inc16 could usefully be a macro. But it should lose the INX.
Alienthe
Posts: 60
Joined: 16 Apr 2012

Re: CPUs code density comparison

Post by Alienthe »

True, it could be a macro, but then again the code density would drop like a brick. In this case the name of the game is code density, and everything else can be sacrificed. So, to expand on my earlier comment, the typical use case is like this:

Code:

lda	(POINTER),Y               ; load byte
ldx	#POINTER                  ; 16-bit increment
jsr	inc16
In a high-level language this is just A = *P++.
Baking it all into a subroutine would look something like

Code:

ldx	#POINTER
jsr	loadinc16
wherein loadinc16 would be

Code:

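loadinc16:
        lda     0,x             ; copy pointer low byte to scratch pointer P
        sta     P
        lda     1,x             ; copy pointer high byte
        sta     P+1
        lda     (P),y           ; load byte through P (Y is expected to be 0)
        inc     0,x             ; increment pointer low byte
        bne     no_carry
        inc     1,x             ; low byte wrapped, so carry into high byte
no_carry:
        rts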
In the critical part the pointer is LOGOL, so that could be hardwired too.
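For readers less used to the idiom, the behaviour that loadinc16 is aiming for can be modelled in C. This is only a sketch under stated assumptions: the array `mem` and the function name `load_inc16` are hypothetical stand-ins for the 6502 address space and the subroutine, the pointer is stored low byte first at the zero-page address passed in X, and Y is taken to be 0.

```c
#include <stdint.h>

/* Hypothetical model of the 64 KiB address space; page zero is mem[0x00..0xFF]. */
static uint8_t mem[0x10000];

/* Model of A = *P++ where P is a 16-bit little-endian pointer held at
   zero-page address zp (the value the 6502 code passes in X). */
static uint8_t load_inc16(uint8_t zp)
{
    uint8_t hi_addr = (uint8_t)(zp + 1);   /* zero-page indexing wraps at 0xFF */
    uint16_t p = (uint16_t)(mem[zp] | (mem[hi_addr] << 8));
    uint8_t a = mem[p];                    /* the load */

    mem[zp]++;                             /* inc 0,x */
    if (mem[zp] == 0)                      /* low byte wrapped to zero */
        mem[hi_addr]++;                    /* inc 1,x */
    return a;
}
```

With a pointer value of $20FF stored at $10/$11, one call returns the byte at $20FF and leaves the pointer at $2100, which is the case the bne / inc pair exists to handle.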
BigEd
Posts: 11464
Joined: 11 Dec 2008
Location: England

Re: CPUs code density comparison

Post by BigEd »

Oops, yes, density, not performance...
barrym95838
Posts: 2056
Joined: 30 Jun 2013
Location: Sacramento, CA, USA

Re: CPUs code density comparison

Post by barrym95838 »

Alienthe wrote:
... In a high-level language this is just A = *P++.
Baking it all into a subroutine would look something like

Code:

ldx	#POINTER
jsr	loadinc16
wherein loadinc16 would be

Code:

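loadinc16:
        lda     0,x             ; copy pointer low byte to scratch pointer P
        sta     P
        lda     1,x             ; copy pointer high byte
        sta     P+1
        lda     (P),y           ; load byte through P (Y is expected to be 0)
        inc     0,x             ; increment pointer low byte
        bne     no_carry
        inc     1,x             ; low byte wrapped, so carry into high byte
no_carry:
        rts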
In the critical part the pointer is LOGOL, so that could be hardwired too.
My first optimization pass through the code takes note of your observation that there are a lot of A=*(ptr[X]++) activities going on. My load_inc16 thus looks like this:

Code:

load_inc_out:
        ldx  #OUTPUTL
load_inc16:
        lda  (0,x)              ; look mom, (dp,x) !!
inc16:
        inc  0,x
        bne  no_carry
        inc  1,x
no_carry:
        rts
This also frees up the Y register for other useful things. I thought about using (dp),y but it just doesn't seem to fit into the decompression algorithm very well, except in the degenerate case of Y always being zero, and (as BigEd might rightly say) that isn't idiomatic for most well-written 6502 code.
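Since the thread is about density, a rough byte tally of the variants discussed so far may help. The sketch below is only an estimate under stated assumptions: all operands are in zero page, instruction sizes are the standard 6502 encodings, inc16 has lost its INX, every call site is a load followed by an increment, and the number of call sites `n` is a hypothetical parameter rather than a count taken from the actual source. The function names are made up for this illustration.

```c
/* Instruction sizes in bytes for zero-page operands (standard 6502 encodings). */
enum {
    LDA_IND_Y = 2,   /* lda (zp),y */
    LDA_IND_X = 2,   /* lda (zp,x) */
    LDA_ZP_X  = 2,   /* lda zp,x   */
    STA_ZP    = 2,   /* sta zp     */
    LDX_IMM   = 2,   /* ldx #imm   */
    INC_ZP_X  = 2,   /* inc zp,x   */
    BNE       = 2,
    JSR       = 3,
    RTS       = 1
};

/* inc 0,x / bne / inc 1,x / rts, assuming the INX has been removed */
enum { INC16_BODY = INC_ZP_X + BNE + INC_ZP_X + RTS };

/* Original pattern: lda (POINTER),y / ldx #POINTER / jsr inc16 at each site. */
static int original_bytes(int n)
{
    return INC16_BODY + n * (LDA_IND_Y + LDX_IMM + JSR);
}

/* loadinc16 that copies the pointer to P first; ldx / jsr at each site. */
static int copy_version_bytes(int n)
{
    int shared = 2 * (LDA_ZP_X + STA_ZP) + LDA_IND_Y + INC16_BODY;
    return shared + n * (LDX_IMM + JSR);
}

/* load_inc16 using lda (0,x) and falling through into inc16. */
static int indexed_indirect_bytes(int n)
{
    int shared = LDA_IND_X + INC16_BODY;
    return shared + n * (LDX_IMM + JSR);
}
```

Under these assumptions the totals are 7 + 7n, 17 + 5n and 9 + 5n bytes respectively, so the (dp,x) version ties the original at one call site and is smaller from two call sites on. Call sites that can enter through load_inc_out shrink further to a bare 3-byte jsr, at the cost of the 2-byte ldx #OUTPUTL in front of the routine.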

Mike B.

[p.s. I don't have any issues with the DOS subroutines (they were already well-written), but I'm making significant gains everywhere else, and should be able to give an estimate soon, time allowing. I'm slightly embarrassed to admit that I haven't yet determined why the DOS subroutines are even included in the source.]
Alienthe
Posts: 60
Joined: 16 Apr 2012

Re: CPUs code density comparison

Post by Alienthe »

barrym95838 wrote:
load_inc16:
lda (0,x) ; look mom, (dp,x) !!
Yes, I should have seen that one. Well spotted.
Quote:
This also frees up the Y register for other useful things. I thought about using (dp),y but it just doesn't seem to fit into the decompression algorithm very well, except in the degenerate case of Y always being zero, and (as BigEd might rightly say) that isn't idiomatic for most well-written 6502 code.

Mike B.
Yes, my first thought was that A = *P++ does not fit well with the 6502. My second thought was that I couldn't remember ever having needed it.