
All times are UTC




Posted: Mon May 23, 2016 10:38 am

Joined: Mon Apr 04, 2016 10:04 am
Posts: 8
During my thread on the narrow topic of ADD/ADC/CLC, instruction set statistics were mentioned.

Whilst it is fairly easy to gather static statistics (unless someone's been really clever
with interspersed opcodes, the BIT trick, and/or SWEET16), dynamic statistics would have
been a good deal harder, "back in the day".

But in the era of real-time simulators and 4 GHz CPUs, gathering
dynamic statistics, even on a heavily interrupt-driven game, should
be fairly simple.
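
For readers who haven't instrumented a simulator before, here is a minimal sketch of what "gathering dynamic statistics" amounts to: bump one counter per opcode fetch. The `ToyCPU` class and its four-opcode subset are hypothetical stand-ins written for this illustration, not code from lib6502, JSBeeb, or any other real emulator.

```python
from collections import Counter

class ToyCPU:
    """Hypothetical 6502 subset: LDX #imm (a2), DEX (ca), BNE rel (d0), BRK (00) halts."""

    def __init__(self, memory, pc=0):
        self.mem = memory
        self.pc = pc
        self.x = 0
        self.zero = False
        self.histogram = Counter()   # opcode byte -> times executed

    def step(self):
        opcode = self.mem[self.pc]
        self.histogram[opcode] += 1  # the only instrumentation needed
        self.pc += 1
        if opcode == 0xA2:           # LDX #imm
            self.x = self.mem[self.pc]
            self.pc += 1
            self.zero = self.x == 0
        elif opcode == 0xCA:         # DEX
            self.x = (self.x - 1) & 0xFF
            self.zero = self.x == 0
        elif opcode == 0xD0:         # BNE rel
            offset = self.mem[self.pc]
            self.pc += 1
            if not self.zero:
                # sign-extend the 8-bit displacement
                self.pc += offset - 256 if offset & 0x80 else offset
        elif opcode == 0x00:         # BRK: treated as "halt" in this sketch
            return False
        else:
            raise ValueError(f"opcode {opcode:02x} not in this toy subset")
        return True

# LDX #3 ; loop: DEX ; BNE loop ; BRK
program = bytes([0xA2, 0x03, 0xCA, 0xD0, 0xFD, 0x00])
cpu = ToyCPU(program)
while cpu.step():
    pass

for opcode, count in cpu.histogram.most_common():
    print(f"{opcode:02x} {count}")
```

A static count of this program would see each opcode once; the dynamic count sees DEX and BNE three times each, which is the distinction the thread is about.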

So - has anyone done it, and if so, where can I find the data?

BugBear


Posted: Mon May 23, 2016 7:36 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
The one little piece of data I did find when this last came up was in Blargg's Emulation Notes.

Apart from instrumenting an emulator - and there are so many, in so many languages - the difficulty is deciding what program to run. A graphics demo? A game? A Basic interpreter? (Running what kind of program? Strings, trig, integer math...) If the program is itself compiled, that will be different from hand-coded assembly.

[Edit: here's a link to that previous thread, which was about why we don't see ADD in 6502, only ADC]


Last edited by BigEd on Wed May 25, 2016 8:42 am, edited 1 time in total.

Posted: Mon May 23, 2016 9:55 pm

Joined: Sat Mar 27, 2010 7:50 pm
Posts: 149
Location: Chexbres, VD, Switzerland
Quote:
If the program is itself compiled, that will be different from hand-coded assembly.

And if the program is hand-coded in assembly, two different coders will come up with two very different implementations.

(This is also true for high-level languages, but the impact is less obvious.)


Posted: Tue May 24, 2016 7:56 am

Joined: Mon Apr 04, 2016 10:04 am
Posts: 8
BigEd wrote:
The one little piece of data I did find when this last came up was in Blargg's Emulation Notes.

Apart from instrumenting an emulator - and there are so many, in so many languages - the difficulty is deciding what program to run. A graphics demo? A game? A Basic interpreter? (Running what kind of program? Strings, trig, integer math...) If the program is itself compiled, that will be different from hand-coded assembly.


So Blargg's numbers are the only ones?

As with all benchmarks, use the one closest to what you're interested in (that's why there are so many benchmarks).

BugBear


Posted: Tue May 24, 2016 3:23 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I tweaked a copy of lib6502 and collected some numbers - but not from games! Find below some dynamic instruction frequencies, down to 0.5% level.

Edit: I feel a bit dubious now about some of the numbers below. Possibly I've lost out because of 32-bit integers wrapping. Or done something else wrong. See further below for a more reliable investigation in a JavaScript emulator.
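
To illustrate the kind of damage a wrapping counter can do, here is a small sketch that models an unsigned 32-bit C counter in Python. The counts are made up, and whether this is what actually went wrong in the tweaked lib6502 is only the suspicion voiced above, not something the thread confirms.

```python
def wrap_u32(true_count):
    """What an unsigned 32-bit C counter would hold after true_count increments."""
    return true_count & 0xFFFFFFFF

def shares(counts):
    """Convert raw counts to percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}

# Made-up true counts, chosen so that one of them exceeds 2**32.
true_counts = {"lda (zp),Y": 5_000_000_000, "iny": 1_000_000_000}
stored = {name: wrap_u32(n) for name, n in true_counts.items()}

print(shares(true_counts))  # lda (zp),Y clearly dominates
print(shares(stored))       # after wrapping, iny appears to dominate instead
```

Once a single counter wraps, both the ranking and every percentage are distorted, because the total is wrong too. A 64-bit counter avoids the problem for any realistic run length.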

First off, the CLOCKSP benchmark, a BASIC program covering strings, trigs, floats, loops, calls:
Code:
b1 184574 17.39% lda (zp),Y
c8 174013 16.39% iny
10 120410 11.34% bpl rel
d0 103806  9.78% bne rel
c9  69484  6.55% cmp imm
90  68522  6.45% bcc rel
a0  36727  3.46% ldy imm
d1  36703  3.46% cmp (zp),Y
85  36392  3.43% sta zp
65  33902  3.19% adc zp
4c  32483  3.06% jmp abs
98  31903  3.01% tya
38  30779  2.90% sec
91  20117  1.90% sta (zp),Y
f0  10655  1.00% beq rel
b0   9550  0.90% bcs rel
9d   8221  0.77% sta abs,X
ca   8193  0.77% dex
20   6053  0.57% jsr abs
60   5888  0.55% rts
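
A table in this layout, with the 0.5% cutoff, could be produced from a raw histogram along these lines. This is only a sketch: `format_table`, the demo counts, and the name lookup are hypothetical, and the percentages here are taken relative to all executed instructions, which may or may not match how the figures above were computed.

```python
def format_table(histogram, names, cutoff_pct=0.5):
    """Return table lines, most frequent first, dropping rows below cutoff_pct."""
    total = sum(histogram.values())
    lines = []
    for opcode, count in sorted(histogram.items(), key=lambda kv: -kv[1]):
        pct = 100.0 * count / total
        if pct < cutoff_pct:
            continue
        name = names.get(opcode, "???")
        lines.append(f"{opcode:02x} {count:6d} {pct:5.2f}% {name}")
    return lines

# Made-up counts, chosen only to exercise the cutoff.
demo_counts = {0xB1: 600, 0xC8: 399, 0xEA: 1}
demo_names = {0xB1: "lda (zp),Y", 0xC8: "iny", 0xEA: "nop"}
table = format_table(demo_counts, demo_names)
print("\n".join(table))
```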


Basic one-liner, computing and printing 4*ATN(1):
Code:
b1 184029 17.41% lda (zp),Y
c8 173426 16.41% iny
10 120044 11.36% bpl rel
d0 103334  9.78% bne rel
c9  69180  6.55% cmp imm
90  68302  6.46% bcc rel
a0  36616  3.46% ldy imm
d1  36560  3.46% cmp (zp),Y
85  36225  3.43% sta zp
65  33800  3.20% adc zp
4c  32384  3.06% jmp abs
98  31800  3.01% tya
38  30678  2.90% sec
91  20103  1.90% sta (zp),Y
f0  10507  0.99% beq rel
b0   9496  0.90% bcs rel
ca   8064  0.76% dex
9d   8064  0.76% sta abs,X
20   5963  0.56% jsr abs
60   5836  0.55% rts


The BBC Micro's OS and Basic initialisation from cold boot - dominated by memory test:
Code:
d0  32733 19.52% bne rel
f0  31918 19.03% beq rel
c8  31848 18.99% iny
91  31624 18.86% sta (zp),Y
c5  31622 18.86% cmp zp
9d   1772  1.06% sta abs,X


And here's the OS and Basic init with the memory test filtered out:
Code:
9d   1772 19.47% sta abs,X
d0   1112 12.22% bne rel
e8    691  7.59% inx
ca    570  6.26% dex
10    453  4.98% bpl rel
f0    297  3.26% beq rel
6c    293  3.22% jmp (ind)
b9    243  2.67% lda abs,Y
88    243  2.67% dey
99    192  2.11% sta abs,Y
20    172  1.89% jsr abs
8d    171  1.88% sta abs
90    159  1.75% bcc rel
e0    154  1.69% cpx imm
a9    149  1.64% lda imm
fe    144  1.58% inc abs,X
de    144  1.58% dec abs,X
a0    133  1.46% ldy imm
60    133  1.46% rts
8c    119  1.31% sty abs
bd    108  1.19% lda abs,X
c8    103  1.13% iny
8e     92  1.01% stx abs
85     85  0.93% sta zp
48     73  0.80% pha
68     72  0.79% pla
a2     68  0.75% ldx imm
b1     60  0.66% lda (zp),Y
ad     57  0.63% lda abs
4a     47  0.52% lsra
08     47  0.52% php
aa     45  0.49% tax
98     45  0.49% tya
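
The post doesn't say how the memory test was filtered out. One plausible way, sketched here purely as an assumption, is to ignore every instruction whose program counter falls inside the address range of the routine you want to exclude. The `(pc, opcode)` trace format and the addresses below are hypothetical.

```python
from collections import Counter

def histogram_excluding(trace, skip_start, skip_end):
    """Count opcodes from (pc, opcode) pairs, ignoring pcs in [skip_start, skip_end]."""
    return Counter(
        opcode for pc, opcode in trace
        if not (skip_start <= pc <= skip_end)
    )

# Hypothetical trace: a tight loop at $8000-$8002 plus one store elsewhere.
trace = [(0x8000, 0x91), (0x8002, 0xC8), (0x9000, 0x9D), (0x8000, 0x91)]
filtered = histogram_excluding(trace, 0x8000, 0x80FF)
print(filtered)
```

Filtering by address rather than by opcode keeps the counts for the same opcodes when they are executed elsewhere, which matters here because sta (zp),Y and iny also occur outside the memory test.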


Last edited by BigEd on Tue May 24, 2016 5:19 pm, edited 1 time in total.

Posted: Tue May 24, 2016 3:51 pm

Joined: Mon Apr 04, 2016 10:04 am
Posts: 8
Thank you very much for doing that - most interesting, and surprisingly rare.

(I'd love to see the numbers for a 15 minute run of Elite, but that would
be much harder to do, as I understand matters)

BugBear


Posted: Tue May 24, 2016 4:06 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
There is another tactic, used to advantage by one or two of the people at Stardot, whereby the 6502 is replaced by an FPGA version, and the FPGA additionally has a debug CPU running C code which can be controlled and interrogated over a serial link. It's probably enough to take some instruction counts, with a bit more FPGA whizzery. It might be worth posting the question over on Stardot.org.uk

Edit: I posted over there.


Posted: Tue May 24, 2016 4:49 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
...and Matt Godbolt very helpfully tells us how to tweak JSBeeb in our browsers to get stats!


Posted: Tue May 24, 2016 5:07 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Here are stats from a couple of seconds of 3D action in *ELTDEMO (as a guess, dominated by multiply routines)
Code:
a5  1415140 10.91    lda zp
85  1197180  9.23    sta zp
90  1045328  8.06    bcc rel
d0   830356  6.40    bne rel
b0   783948  6.05    bcs rel
65   541956  4.18    adc zp
46   487256  3.76    lsr zp
ca   466852  3.60    dex
91   339916  2.62    sta (zp),Y
51   315776  2.44    eor (zp),Y
4a   313048  2.41    lsra
10   296756  2.29    bpl rel
26   275984  2.13    rol zp
c5   263440  2.03    cmp zp
88   263052  2.03    dey
29   256340  1.98    and imm
6a   255656  1.97    rora
0a   234412  1.81    asla
66   199360  1.54    ror zp
e5   196668  1.52    sbc zp
60   160620  1.24    rts
a9   159828  1.23    lda imm
20   159600  1.23    jsr abs
aa   152064  1.17    tax
38   134756  1.04    sec
8a   118152  0.91    txa
69   114116  0.88    adc imm
86   112448  0.87    stx zp
f0   110488  0.85    beq rel
49   106640  0.82    eor imm
b9    99680  0.77    lda abs,Y
45    87852  0.68    eor zp
a6    87752  0.68    ldx zp
c9    85180  0.66    cmp imm
c8    83944  0.65    iny
18    82632  0.64    clc
a2    73460  0.57    ldx imm
2a    72544  0.56    rola
06    68840  0.53    asl zp


And here are stats from a textual screen in ELTDEMO - note that there's an idle loop, unsurprisingly:
Code:
a5  3978876 45.69    lda zp
f0  3960306 45.47    beq rel
d0    90813  1.04    bne rel
c8    61413  0.71    iny
85    59208  0.68    sta zp
91    51069  0.59    sta (zp),Y
10    35040  0.40    bpl rel
88    32889  0.38    dey
65    31758  0.36    adc zp
b1    30426  0.35    lda (zp),Y
ca    27897  0.32    dex
8d    21864  0.25    sta abs
b9    16824  0.19    lda abs,Y
18    16152  0.19    clc
90    15198  0.17    bcc rel
60    13152  0.15    rts
20    12489  0.14    jsr abs
a8    12444  0.14    tay
51    12120  0.14    eor (zp),Y
98    11685  0.13    tya
b0    10347  0.12    bcs rel
aa     9984  0.11    tax
ad     9303  0.11    lda abs
8a     9174  0.11    txa
a9     9153  0.11    lda imm
48     8328  0.10    pha


Posted: Tue May 24, 2016 5:51 pm

Joined: Sun Jun 29, 2014 5:42 am
Posts: 352
BigEd wrote:
First off, the CLOCKSP benchmark, a BASIC program covering strings, trigs, floats, loops, calls:
Code:
b1 184574 17.39% lda (zp),Y
c8 174013 16.39% iny
10 120410 11.34% bpl rel
d0 103806  9.78% bne rel
c9  69484  6.55% cmp imm
90  68522  6.45% bcc rel
a0  36727  3.46% ldy imm
d1  36703  3.46% cmp (zp),Y
85  36392  3.43% sta zp
65  33902  3.19% adc zp
4c  32483  3.06% jmp abs
98  31903  3.01% tya
38  30779  2.90% sec
91  20117  1.90% sta (zp),Y
f0  10655  1.00% beq rel
b0   9550  0.90% bcs rel
9d   8221  0.77% sta abs,X
ca   8193  0.77% dex
20   6053  0.57% jsr abs
60   5888  0.55% rts


Here's a histogram I grabbed from CLOCKSP off the Pi running a 6502 Co Processor Emulator:
http://stardot.org.uk/forums/viewtopic. ... 40#p127390
Code:
18490279  // Opcode 85 - STA $00
17691086  // Opcode B1 - LDA ($00),Y
14216195  // Opcode F0 - BEQ
13593406  // Opcode C9 - CMP #$00
12270543  // Opcode D0 - BNE
10398501  // Opcode 90 - BCC
10267707  // Opcode A0 - LDY #$00
10167113  // Opcode A5 - LDA $00
8458574   // Opcode B0 - BCS
6142211   // Opcode C8 - INY
5694274   // Opcode 60 - RTS
5694274   // Opcode 20 - JSR $0000
5294884   // Opcode 65 - ADC $00
5002153   // Opcode C5 - CMP $00
4942786   // Opcode 88 - DEY
4410232   // Opcode A4 - LDY $00
3877815   // Opcode E0 - CPX #$00
3812030   // Opcode 84 - STY $00
3687346   // Opcode 26 - ROL $00
3520511   // Opcode 98 - TYA
3516203   // Opcode 66 - ROR $00
3093382   // Opcode 91 - STA ($00),Y
2480086   // Opcode E6 - INC $00
2421376   // Opcode E8 - INX
2293262   // Opcode BD - LDA $0000,X
2124962   // Opcode 30 - BMI
2099902   // Opcode A8 - TAY
1981695   // Opcode 0A - ASL A
1978686   // Opcode A9 - LDA #$00
1842832   // Opcode 18 - CLC
1779217   // Opcode 64 - STZ $00
1754848   // Opcode AA - TAX
1736528   // Opcode 38 - SEC
1734982   // Opcode 80 - BRA
1705602   // Opcode B2 - LDA ($00)
1684508   // Opcode A6 - LDX $00
1595251   // Opcode E5 - SBC $00
1464834   // Opcode 99 - STA $0000,Y
1448692   // Opcode 86 - STX $00
1319863   // Opcode 68 - PLA
1319231   // Opcode 48 - PHA
1309601   // Opcode 2A - ROL A
1253641   // Opcode 06 - ASL $00
1247611   // Opcode 4C - JMP $0000
1236032   // Opcode FD - SBC $0000,X
1236032   // Opcode 7D - ADC $0000,X
1191881   // Opcode 10 - BPL
1046557   // Opcode CA - DEX
1011571   // Opcode 24 - BIT $00
1009079   // Opcode B9 - LDA $0000,Y
974977    // Opcode BC - LDY $0000,X
950454    // Opcode C0 - CPY #$00
935593    // Opcode C4 - CPY $00
903097    // Opcode 7C - JMP ($0000,X)
894852    // Opcode 46 - LSR $00
878941    // Opcode 3A - DEC A
858164    // Opcode C6 - DEC $00
817130    // Opcode 49 - EOR #$00
778117    // Opcode 29 - AND #$00
757979    // Opcode 9D - STA $0000,X
702290    // Opcode 45 - EOR $00
687380    // Opcode 75 - ADC $00,X
683795    // Opcode 92 - STA ($00)
677427    // Opcode 8A - TXA
667492    // Opcode 04 - TSB $00
628451    // Opcode 05 - ORA $00
617996    // Opcode 5D - EOR $0000,X
552947    // Opcode E9 - SBC #$00
543816    // Opcode D1 - CMP ($00),Y
536041    // Opcode 09 - ORA #$00
480432    // Opcode 2C - BIT $0000
459764    // Opcode 69 - ADC #$00
293523    // Opcode A2 - LDX #$00
235016    // Opcode 16 - ASL $00,X
210251    // Opcode 6A - ROR A
204064    // Opcode BA - TSX
204024    // Opcode 9A - TXS
204004    // Opcode 51 - EOR ($00),Y
157497    // Opcode B5 - LDA $00,X
147231    // Opcode 50 - BVC
133816    // Opcode 1A - INC A
128160    // Opcode DA - PHX
127560    // Opcode FA - PLX
122854    // Opcode 95 - STA $00,X
122187    // Opcode 14 - TRB $00
120385    // Opcode AD - LDA $0000
117664    // Opcode BE - LDX $0000,Y
117664    // Opcode 96 - STX $00,Y
74659     // Opcode 55 - EOR $00,X
69436     // Opcode EA - NOP
47637     // Opcode 4A - LSR A
44433     // Opcode 5A - PHY
44401     // Opcode 7A - PLY
37836     // Opcode 74 - STZ $00,X
11001     // Opcode 8C - STY $0000
8487      // Opcode 28 - PLP
8487      // Opcode 08 - PHP
4106      // Opcode 8D - STA $0000
4070      // Opcode 6C - JMP ($0000)
1203      // Opcode 89 - BIT #$00
570       // Opcode D5 - CMP $00,X
20        // Opcode FE - INC $0000,X
16        // Opcode E4 - CPX $00
1         // Opcode 72 - ADC ($00)

I wonder why they are so different?

Might be down to different versions of BBC Basic... I think I was testing Basic IV, as I can see evidence of 65C02 opcodes.
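
The "evidence of 65C02 opcodes" check can be done mechanically. Below is a sketch that parses lines in the histogram format above and reports any opcode that exists only on the 65C02. The function name is hypothetical, and the opcode set covers the standard 65C02 additions only (not the Rockwell/WDC bit instructions or WAI/STP).

```python
import re

# Opcodes added by the 65C02 (undefined on the NMOS 6502).
CMOS_ONLY = {
    0x04, 0x0C, 0x12, 0x14, 0x1A, 0x1C, 0x32, 0x34, 0x3A, 0x3C,
    0x52, 0x5A, 0x64, 0x72, 0x74, 0x7A, 0x7C, 0x80, 0x89, 0x92,
    0x9C, 0x9E, 0xB2, 0xD2, 0xDA, 0xF2, 0xFA,
}

LINE = re.compile(r"^\s*(\d+)\s+// Opcode ([0-9A-Fa-f]{2})\b")

def cmos_opcodes_seen(histogram_text):
    """Map each 65C02-only opcode found in the listing to its execution count."""
    found = {}
    for line in histogram_text.splitlines():
        match = LINE.match(line)
        if match:
            count, opcode = int(match.group(1)), int(match.group(2), 16)
            if opcode in CMOS_ONLY and count > 0:
                found[opcode] = count
    return found

# Three lines copied from the listing above.
sample = """\
18490279  // Opcode 85 - STA $00
1779217   // Opcode 64 - STZ $00
1734982   // Opcode 80 - BRA
"""
found = cmos_opcodes_seen(sample)
print({f"{op:02X}": n for op, n in found.items()})
```

An empty result would suggest the code under test sticks to the NMOS instruction set; any hits, as here with STZ and BRA, point to a 65C02 build.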

Dave


Posted: Tue May 24, 2016 6:04 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Dave, I looked again at my numbers, and those for ATN are much too similar to CLOCKSP to reflect reality! So I've added a disclaimer to my post. Your numbers, however, are unimpeachable! Thanks for providing them.
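
A quick sanity check of the "much too similar" kind can be automated by comparing two frequency tables opcode by opcode. The sketch below uses the largest gap in percentage points as a crude similarity measure. The function name is hypothetical, and the figures are the top rows of the two first-pass tables plus, for contrast, the same style of rows from the rerun tables elsewhere in this thread.

```python
def max_share_gap(a, b):
    """Largest absolute difference, in percentage points, over opcodes in either table."""
    return max(abs(a.get(op, 0.0) - b.get(op, 0.0)) for op in set(a) | set(b))

# Top rows of the two first-pass tables (CLOCKSP and the ATN one-liner).
first_clocksp = {"lda (zp),Y": 17.39, "iny": 16.39, "bpl rel": 11.34}
first_atn = {"lda (zp),Y": 17.41, "iny": 16.41, "bpl rel": 11.36}

# Matching rows from the rerun tables.
rerun_clocksp = {"sta zp": 9.50, "lda (zp),Y": 7.63, "rol zp": 4.57}
rerun_atn = {"sta zp": 11.01, "lda (zp),Y": 3.36, "rol zp": 15.95}

first_gap = max_share_gap(first_clocksp, first_atn)
rerun_gap = max_share_gap(rerun_clocksp, rerun_atn)
print(f"first pass: {first_gap:.2f} points, rerun: {rerun_gap:.2f} points")
```

Two genuinely different workloads agreeing to within a few hundredths of a point on every row is a strong hint that both tables are really measuring the same thing, such as shared start-up code.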


Posted: Tue May 24, 2016 8:10 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I've rerun: CLOCKSP first - and this is Basic 2, so slightly different (using a lot more instructions to run the same benchmark too, but much of that is probably the new trig algorithms, not the C02 instructions):
Code:
85 2689203  9.50% sta zp
b1 2160515  7.63% lda (zp),Y
a5 1707739  6.03% lda zp
f0 1496519  5.28% beq rel
d0 1456096  5.14% bne rel
c9 1426672  5.04% cmp imm
26 1293246  4.57% rol zp
a0 1134539  4.01% ldy imm
90 1120458  3.96% bcc rel
b0 1049230  3.70% bcs rel
c8  912211  3.22% iny
20  658364  2.32% jsr abs
60  657747  2.32% rts
88  633329  2.24% dey
65  625248  2.21% adc zp
c5  536264  1.89% cmp zp
05  467375  1.65% ora zp
84  440988  1.56% sty zp
a4  427160  1.51% ldy zp
91  424652  1.50% sta (zp),Y
bd  408993  1.44% lda abs,X
e0  377300  1.33% cpx imm
66  360597  1.27% ror zp
98  279020  0.99% tya
e5  278639  0.98% sbc zp
e6  265205  0.94% inc zp
a9  255888  0.90% lda imm
38  249088  0.88% sec
ca  244209  0.86% dex
06  228841  0.81% asl zp
4c  218808  0.77% jmp abs
e8  212492  0.75% inx
30  212429  0.75% bmi rel
18  190317  0.67% clc
86  188742  0.67% stx zp
aa  185666  0.66% tax
10  183426  0.65% bpl rel
a6  180107  0.64% ldx zp
99  143946  0.51% sta abs,Y


and PRINT 4*ATN(1)
Code:
26   2919 15.95% rol zp
a5   2148 11.74% lda zp
85   2014 11.01% sta zp
d0   1211  6.62% bne rel
e5    781  4.27% sbc zp
90    677  3.70% bcc rel
b1    615  3.36% lda (zp),Y
66    607  3.32% ror zp
b0    521  2.85% bcs rel
06    500  2.73% asl zp
c8    482  2.63% iny
ca    444  2.43% dex
65    441  2.41% adc zp
20    409  2.23% jsr abs
60    392  2.14% rts
10    345  1.89% bpl rel
c9    335  1.83% cmp imm
38    330  1.80% sec
f0    318  1.74% beq rel
c5    318  1.74% cmp zp
05    289  1.58% ora zp
88    180  0.98% dey
4c    155  0.85% jmp abs
a0    154  0.84% ldy imm
46    139  0.76% lsr zp
a9    128  0.70% lda imm
d1     99  0.54% cmp (zp),Y
98     98  0.54% tya
30     97  0.53% bmi rel


Posted: Tue May 24, 2016 8:54 pm

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
It's interesting to see LDA (ZP),Y so high in the lists (near the top), since a few people have thought that the instruction was a waste of logic and instruction-table space. I also see STA, CMP, and EOR (ZP),Y there.
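
To put a number on how much (ZP),Y is used, the shares can be summed per addressing mode. The sketch below does that for a few rows copied from the rerun CLOCKSP table and the ELTDEMO 3D table in this thread. The helper name is hypothetical, and since the tables are truncated at roughly 0.5%, the sums are lower bounds.

```python
def share_for_mode(rows, mode_suffix):
    """Sum the percentage of every row whose mnemonic ends with the given mode."""
    return sum(pct for mnemonic, pct in rows if mnemonic.endswith(mode_suffix))

# (mnemonic, percent) rows copied from the tables in this thread.
rerun_clocksp = [
    ("sta zp", 9.50), ("lda (zp),Y", 7.63), ("lda zp", 6.03), ("sta (zp),Y", 1.50),
]
eltdemo_3d = [
    ("lda zp", 10.91), ("sta (zp),Y", 2.62), ("eor (zp),Y", 2.44),
]

clocksp_share = share_for_mode(rerun_clocksp, "(zp),Y")
eltdemo_share = share_for_mode(eltdemo_3d, "(zp),Y")
print(f"CLOCKSP: {clocksp_share:.2f}%  ELTDEMO 3D: {eltdemo_share:.2f}%")
```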

Edit: The topic I was mainly thinking about was primarily about (ZP,X) which Forth uses quite a bit; but Bruce does say there, "In fact, with 16-bit index registers [on the 65816], I'd rather get rid of (ZP), Y."

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


Posted: Tue May 24, 2016 9:11 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
IMO, (zp),y is the crown jewel of the 65xx ... it never would have enjoyed widespread popularity without it.

Mike B.

[Edit: Of course, if we had 16-bit index registers, (zp),y would lose considerable importance ...]

[Edit 2: Garth edited roughly the same edit simultaneously ... great minds think alike (or something like that).]


Posted: Wed May 25, 2016 1:20 am

Joined: Thu May 28, 2009 9:46 pm
Posts: 8507
Location: Midwestern USA
GARTHWILSON wrote:
Edit: The topic I was mainly thinking about was primarily about (ZP,X) which Forth uses quite a bit; but Bruce does say there, "In fact, with 16-bit index registers [on the 65816], I'd rather get rid of (ZP), Y."

(<dp>),Y tends to be a bit less useful in the 65C816 in native mode, as <dp>,X can be substituted if X is 16 bits. Very useful is [<dp>],Y, which can touch all 16 megabytes without having to diddle DB. Also helping is that indexing over a bank boundary works as expected, instead of wrapping.
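
As a rough illustration of the difference, here is a sketch of how the two effective addresses are formed in native mode, as I understand the 65C816 addressing rules. The function names and the memory contents are hypothetical; memory is modelled as a plain dict of bank-0 addresses.

```python
def ea_dp_indirect_y(mem, d, dbr, dp, y):
    """(dp),Y: 16-bit pointer read from bank 0, bank supplied by DBR, then Y added."""
    base = (d + dp) & 0xFFFF
    pointer = mem[base] | (mem[(base + 1) & 0xFFFF] << 8)
    return (((dbr << 16) | pointer) + y) & 0xFFFFFF

def ea_dp_indirect_long_y(mem, d, dp, y):
    """[dp],Y: 24-bit pointer read from bank 0, then Y added; DBR is not involved."""
    base = (d + dp) & 0xFFFF
    pointer = (
        mem[base]
        | (mem[(base + 1) & 0xFFFF] << 8)
        | (mem[(base + 2) & 0xFFFF] << 16)
    )
    return (pointer + y) & 0xFFFFFF

# Hypothetical setup: direct page at $0000, pointer bytes F0 FF 12 stored at $10.
mem = {0x0010: 0xF0, 0x0011: 0xFF, 0x0012: 0x12}
short_ea = ea_dp_indirect_y(mem, d=0x0000, dbr=0x01, dp=0x10, y=0x0020)
long_ea = ea_dp_indirect_long_y(mem, d=0x0000, dp=0x10, y=0x0020)
print(f"(dp),Y -> ${short_ea:06X}   [dp],Y -> ${long_ea:06X}")
```

In both cases the addition of Y carries into the bank byte rather than wrapping at the bank boundary; the long form simply takes its bank from the pointer instead of from DB.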

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!

