CPUs code density comparison
Posted: Mon Aug 25, 2014 12:05 am
by Aaendi
I'm curious how the code density of different CPUs compares. I think the TLCS-900H (Neo Geo Pocket) has some pretty compact code. It is similar to the 68000 in performance and architecture, except that instructions are byte aligned instead of word aligned.
Re: CPUs code density comparison
Posted: Mon Aug 25, 2014 4:06 am
by GARTHWILSON
Someone posted a chart on that not long ago. Hopefully they or someone else will find it and give us the link to the topic. I can't find it at the moment.
Re: CPUs code density comparison
Posted: Mon Aug 25, 2014 5:33 am
by barrym95838
Weaver and McKee have done some comparisons, using their own benchmark. Your newer processor may or may not be on their list. I found it interesting that 6502 and ARM are not-too-distant neighbors in all of their charts on page 4.
http://web.eece.maine.edu/~vweaver/pape ... ensity.pdf
Mike
Re: CPUs code density comparison
Posted: Mon Aug 25, 2014 8:23 am
by BigEd
That's probably the paper Garth has in mind. I posted a picture at
viewtopic.php?f=1&t=1888&p=15526#p15526
but unfortunately the image hosting company has dropped the ball. (Back then we didn't have image attachments available as an option.)
See also
viewtopic.php?p=15655#p15655
Edit: I resnapshotted the images from the pdf into that previous post.
Cheers
Ed
Re: CPUs code density comparison
Posted: Fri Aug 29, 2014 8:49 am
by Bregalad
Oh, I remember I found this paper a while ago, and I was so happy to see the 6502 included in the comparison.
Also, it seems the 6502 is systematically the worst of its category when it comes to code density. Not so good news. Our only consolation is that the 6502 is way faster for a fixed clock rate.
Re: CPUs code density comparison
Posted: Wed Sep 03, 2014 1:01 pm
by Aaendi
I'm not too surprised that the 6502 doesn't have the best code density. It takes two instructions to add to the accumulator, and four instructions to add to the index registers.
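To make the instruction counts concrete, here is a minimal sketch, assuming an immediate operand (the constant 5 is arbitrary) and no knowledge of the carry flag on entry:

```asm
; Add a constant to the accumulator: 2 instructions, 3 bytes
        clc             ; ADC always adds the carry in, so clear it first
        adc #$05        ; A = A + 5

; Add a constant to X: 4 instructions, 5 bytes
        txa             ; the 6502 can only add in A, so copy X there (clobbers A)
        clc
        adc #$05
        tax             ; copy the sum back into X
```

If the carry is already known to be clear, the CLC can be dropped in either case. For adding 1 or 2 to an index register, repeated INX or INY is shorter.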
Re: CPUs code density comparison
Posted: Wed Sep 03, 2014 6:42 pm
by Alienthe
Is the assembly code used in the comparison available? Whittling down a few bytes here and there could improve the score significantly. My understanding from the paper was that they hand coded the programs. The skill of the programmer then becomes important.
There were many surprises in the paper, particularly that big endian is more compact than little endian, even if the difference is small. Correlation with the year the architecture was introduced was amusing.
CRIS did well and looking through the documentation for it I wonder exactly what makes it more compact and also why it seems to be abandoned in favour of ARM.
An update of this paper would be interesting in view of BA2 and the processors from the recently de-cloaked Andes Technology, where I believe they have an 8-bit processor which then would compete with 6502. Also a comparison using SWEET16 would be intriguing.
Re: CPUs code density comparison
Posted: Thu Sep 04, 2014 12:47 am
by jgharston
barrym95838 wrote:
Weaver and McKee have done some comparisons, using their own benchmark. Your newer processor may or may not be on their list. I found it interesting that 6502 and ARM are not-too-distant neighbors in all of their charts on page 4.
http://web.eece.maine.edu/~vweaver/pape ... ensity.pdf
That sort-of reflects my experience in coding the same application for different CPUs. The same BBC BASIC interpreter is about 16K in 6502, about 12K in Z80, and about 16K in 80x86. There are similar code size differences in my hand-crafted CRC code. It also suggests that my PDP-11 BBC BASIC interpreter should end up a comparable size to the Z80 one, which suggests I can look for some optimisation in it somewhere.
Edit: Ooo! The PDP-11 source cites me!

Re: CPUs code density comparison
Posted: Sun Jun 21, 2015 9:08 pm
by Alienthe
The paper is from 2009. I tried to find updates or source code with no success until I saw this:
http://lwn.net/Articles/647636/#Comments with a link to an update. There is a Git repository.
Of course it is possible that the author is an authority on assembly programming, but the extract below suggests there is some room for improvement in coding style for 6502:
Code: Select all
158 ; save zero page
159 ; otherwise we can't return to BASIC
160
161 ldx #$e8 ; we save $E8-$FF
162 ldy #0
163 lda #>zp_save
164 sta OUTPUTH
165 lda #<zp_save
166 sta OUTPUTL
167 save_zp_loop:
168 lda 0,X
169 sta (OUTPUTL),Y
170 inx
171 iny
172 cpy #$17 ; save 16 bytes
173 bne save_zp_loop
And there is more like this. 6502 stands at 1130 bytes, Z80 is at 891 bytes. It does smell like a challenge, doesn't it?
Re: CPUs code density comparison
Posted: Sun Jun 21, 2015 10:03 pm
by GARTHWILSON
That's more than twice as long as it needs to be for the job.
Re: CPUs code density comparison
Posted: Mon Jun 22, 2015 3:58 am
by BigEd
Good find Alienthe, thanks!
As it's on github, it's very easy to make an edit and then make a pull request.
Here's the code:
https://github.com/deater/ll_asm/blob/m ... 502.s#L158
The pencil icon at the top right of the text box makes a temporary fork.
(Of course, you'd need to be logged into github.)
Re: CPUs code density comparison
Posted: Mon Jun 22, 2015 6:12 am
by barrym95838
Mr. Weaver knows more assembly languages than I can name, but it appears that he didn't expend much effort on the 6502 version, for either size or speed. In fact, many of his coding techniques (like the zp_save stuff above) look like a crude translation from a different processor's source. I'm sure that optimization priorities differed from processor to processor, and I have a feeling that the 6502 wasn't the only victim here, but dang, that 6502 source is crammed full of sub-optimal code!
I believe that I could get identical results with identical inputs on an Apple 2-anything in under 800 (780!?!?!) bytes, but I have too much on my plate right now to prove that claim in a timely fashion. I'll just add it to my lengthy to-do list, and get to it when I can. Hopefully, someone awesome here can beat me to the punch, because this challenge should not go unanswered.
Mike B.
[Edit: Important footnote from here:
* the 6502 results were adjusted to match the code present in other architectures (i.e., not counting the graphical routines)
There is a lot of low-hanging fruit in the graphics routines, which wouldn't count, but I believe that there is considerable room for improvement elsewhere, which does count.]
Re: CPUs code density comparison
Posted: Mon Jun 22, 2015 11:48 pm
by Martin_H
I would think that a single index register would do the trick if you adjusted the value of zp_save by $e8. That way the index and the offset could be the same value. If you used a count down to zero you could get rid of the cpy #$17 at the bottom of the loop.
Re: CPUs code density comparison
Posted: Tue Jun 23, 2015 4:00 am
by barrym95838
Exactly, Martin ... you're on the right track. And that's just a small example in a much larger source.
Code: Select all
; save zero page
; otherwise we can't return to BASIC
ldx #$17 ; we save $E8-$FF
save_zp_loop:
lda $e8,x
sta zp_save,x
dex
bpl save_zp_loop
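For a rough sense of the saving, here is a byte tally of both versions. This is only a sketch: the addresses given to OUTPUTL, OUTPUTH and zp_save are hypothetical, zp_save is assumed to lie outside zero page (so the indexed store uses the 3-byte absolute,X form), and the loop labels are renamed so the two fragments can sit in one file.

```asm
; Hypothetical addresses, chosen only so the fragment assembles
OUTPUTL = $06
OUTPUTH = $07
zp_save = $0300             ; 24-byte buffer outside zero page (assumption)

; Version quoted from the repository: 22 bytes
        ldx #$e8            ; 2 bytes
        ldy #0              ; 2
        lda #>zp_save       ; 2
        sta OUTPUTH         ; 2
        lda #<zp_save       ; 2
        sta OUTPUTL         ; 2
old_loop:
        lda 0,x             ; 2  (zero page,X)
        sta (OUTPUTL),y     ; 2
        inx                 ; 1
        iny                 ; 1
        cpy #$17            ; 2  (exits after 23 bytes, $E8-$FE)
        bne old_loop        ; 2

; Single-index version: 10 bytes
        ldx #$17            ; 2
new_loop:
        lda $e8,x           ; 2  (zero page,X)
        sta zp_save,x       ; 3  (absolute,X)
        dex                 ; 1
        bpl new_loop        ; 2  (runs for X = $17 down to $00, 24 bytes)
```

Under those assumptions the excerpt drops from 22 bytes to 10. As a side effect, counting down with BPL copies all 24 bytes of $E8-$FF, whereas the quoted loop stops one byte short of $FF.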
Mike B.
Re: CPUs code density comparison
Posted: Tue Jun 23, 2015 7:20 am
by BigEd
I've raised an issue:
https://github.com/deater/ll_asm/issues/5
(A pull request would have been slightly more friendly!)