6502.org Forum

All times are UTC

Posted: Tue Aug 13, 2024 1:15 pm
Joined: Tue Sep 26, 2023 11:09 am
Posts: 97
What's the smallest disassembler that runs natively on the 65c02? No external dependencies other than a putchar routine, and output similar to below.

wozmon took one page, maybe you could do it in two?

Code:
840e   74 01      stz $01,x
8410   20 9f 93   jsr $939f
8413   20 5d 9b   jsr $9b5d
8416   e8         inx
8417   e8         inx
8418   b5 fe      lda $fe,x
841a   15 ff      ora $ff,x
841c   f0 1a      beq $8438
841e   24 1c      bit $1c
8420   10 06      bpl $8428
8422   a9 80      lda #$80
8424   04 1c      tsb $1c
8426   80 14      bra $843c


Posted: Tue Aug 13, 2024 9:56 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
pdragon wrote:
wozmon took one page, maybe you could do it in two?

Woz did a native disassembler for the original 6502 in 476 bytes, which leaves only 36 bytes available for the additional 'c02 instructions. I'm pretty sure that most mortals would struggle and ultimately fail to fit a native 'c02 disassembler in 512 bytes. I think I could do one in less than 600, but that's just idle speculation, and I'm not really prepared for the attempt at present.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Posted: Tue Aug 13, 2024 10:09 pm
Joined: Tue Sep 26, 2023 11:09 am
Posts: 97
thanks - i see a couple of other related threads at viewtopic.php?f=2&t=5884 and viewtopic.php?f=2&t=4187


Posted: Wed Aug 14, 2024 1:23 pm
Joined: Tue Sep 03, 2002 12:58 pm
Posts: 325
Ignoring the digit at the end of instructions like RMB0, the 65C02 has 70 different instruction names. Storing them directly would require 210 bytes. The obvious optimisation is to pack the three letters for an instruction into two bytes, as only five bits are needed for each. That takes it down to 140. Is it possible to do better?
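
As a rough illustration of the 3-letters-in-2-bytes packing described above, here is a minimal Python sketch. The function names and the choice of A=1 ... Z=26 with big-endian byte order are assumptions made for the example, not anything taken from a poster's code.

```python
def pack_mnemonic(name: str) -> bytes:
    """Pack three letters into the low 15 bits of a big-endian 16-bit word."""
    assert len(name) == 3 and name.isascii() and name.isalpha()
    value = 0
    for ch in name.upper():
        value = (value << 5) | (ord(ch) - ord("A") + 1)  # A=1 ... Z=26, 5 bits each
    return value.to_bytes(2, "big")


def unpack_mnemonic(packed: bytes) -> str:
    """Reverse pack_mnemonic: pull out three 5-bit fields, most significant first."""
    value = int.from_bytes(packed, "big")
    return "".join(chr(((value >> shift) & 0x1F) + ord("A") - 1)
                   for shift in (10, 5, 0))
```

With 70 names at two bytes each this gives the 140 bytes quoted above; one bit per name is left unused.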

Looking at just the first letter of the names, there are 15 possible characters: A, B, C, D, E, I, J, L, N, O, P, R, S, T, and W. The other two letters come in 64 unique pairs.

It is possible to group the first letters into four sets that have 16 second/third letter pairs each:
    C, I, J, L, N, W can be followed by AI, DA, DX, DY, LC, LD, LI, LV, MP, NC, NX, NY, OP, PX, PY, SR
    D, E, R, S can be followed by BC, EC, ED, EI, EX, EY, MB, OL, OR, TA, TI, TP, TS, TX, TY, TZ
    A, B, O can be followed by BR, BS, CC, CS, DC, EQ, IT, MI, ND, NE, PL, RA, RK, SL, VC, VS
    P, T can be followed by AX, AY, HA, HP, HX, HY, LA, LP, LX, LY, RB, SB, SX, XA, XS, YA

So it would seem that an instruction name can be described in only 8 bits - first letter, then second/third pair. But it takes a number of tables to expand this encoding, and the whole ends up larger than the original.
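
To make the 8-bit idea concrete, the sketch below uses the four groups listed above: the high nibble picks one of the 15 first letters and the low nibble picks one of the 16 pairs that belong to that letter's group. The nibble layout and all names are assumptions for illustration. It also shows where the cost goes, since decoding needs the letter list plus four pair tables.

```python
FIRST = "ABCDEIJLNOPRSTW"  # 15 first letters, indexed by the high nibble

GROUPS = {
    "CIJLNW": "AI DA DX DY LC LD LI LV MP NC NX NY OP PX PY SR".split(),
    "DERS":   "BC EC ED EI EX EY MB OL OR TA TI TP TS TX TY TZ".split(),
    "ABO":    "BR BS CC CS DC EQ IT MI ND NE PL RA RK SL VC VS".split(),
    "PT":     "AX AY HA HP HX HY LA LP LX LY RB SB SX XA XS YA".split(),
}
# Map each first letter to the pair list of its group.
PAIRS = {letter: pairs for letters, pairs in GROUPS.items() for letter in letters}


def encode(name: str) -> int:
    """Return the one-byte code for a three-letter mnemonic."""
    first, pair = name[0], name[1:]
    return (FIRST.index(first) << 4) | PAIRS[first].index(pair)


def decode(code: int) -> str:
    """Expand a one-byte code back to three letters (high nibble must be 0..14)."""
    first = FIRST[code >> 4]
    return first + PAIRS[first][code & 0x0F]
```

Note that most byte values decode to letter combinations that are not real instructions, so the encoding is compact but not dense.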

Still, it's a neat coincidence, and it would be nice if it could be exploited somehow.

(for the record, there are 18 unique second letters with 59 first/third pairs, and 17 third letters with 39 first/second pairs)


Posted: Sat Aug 17, 2024 10:39 pm
Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
I thought it might be fun to take a crack at this.

64 bytes to describe the _text_ of the addressing modes (4-bits per character, 6 characters (3 bytes) per addressing mode)
128 bytes to map an opcode to an addressing mode (4-bits per opcode, 16 addressing modes)
16 bytes to map an addressing mode to the number of operand bytes
140 bytes to encode the instruction names (packed 5-bits per char, so 2-bytes per instruction; 70 instructions)
149 bytes to map an opcode to an instruction name (various mask/compare tables)

For a total of 497 bytes. That's all data, no code.
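
The 4-bits-per-opcode addressing-mode table in the list above could look something like the following Python sketch. The mode numbers here are a made-up placeholder pattern, not a real 65C02 mode assignment, and putting even opcodes in the low nibble is an assumption.

```python
def pack_modes(modes):
    """Pack 256 four-bit mode numbers into 128 bytes, two opcodes per byte."""
    assert len(modes) == 256 and all(0 <= m < 16 for m in modes)
    # Even opcode goes in the low nibble, odd opcode in the high nibble.
    return bytes(modes[i] | (modes[i + 1] << 4) for i in range(0, 256, 2))


def mode_of(table, opcode):
    """Look up the 4-bit mode number for an opcode in the packed table."""
    packed = table[opcode >> 1]
    return packed >> 4 if opcode & 1 else packed & 0x0F


# Placeholder data only: a real table would hold the actual mode per opcode.
PLACEHOLDER_MODES = [(op * 7) % 16 for op in range(256)]
TABLE = pack_modes(PLACEHOLDER_MODES)
```

On the 6502 itself the equivalent lookup is a shift right to form the index, a table load, and a conditional four-bit shift depending on the carry.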

I looked at various ways to encode the strings, and to map opcodes to names and addressing modes... but I kept returning to the numbers above as the best trade-off. There were other data encoding ideas, but the decoding code blew out and negated the savings.

I also have another 419 bytes of unfinished code fragments to go along with this... Suffice to say my own attempt would be more like a 1kB effort than the target 512 bytes.

I'd be seriously impressed if it could be done in under 600 bytes, but clever people can do clever things...

EDIT: The 149 bytes of map to decode an opcode to an instruction name used more code bytes than a simple 256-entry lookup table, so with that replaced my (now working) implementation is 918 bytes, total.
It uses only an external 'outchar' function, and its output looks like this:
Code:
f178: 28       | plp
f179: 2c 23 2d | bit  $2d23
f17c: 62       | .byte $62
f17d: 77 29    | rmb7 $29
f17f: 78       | sei
f180: 61 79    | adc  ($79,x)
f182: 2d 72 62 | and  $6272
f185: 77 00    | rmb7 $00
f187: 00       | brk

It's nothing others haven't done before, and completely fails to meet the 512-byte challenge. Still, it was fun to write after so long having not touched 6502 assembly.


Posted: Wed Aug 21, 2024 6:58 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
sark02 wrote:
It's nothing others haven't done before, and completely fails to meet the 512-byte challenge. Still, it was fun to write after so long having not touched 6502 assembly.


Nicely done! Please share, if you don't mind.

Posted: Wed Aug 21, 2024 4:46 pm
Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
barrym95838 wrote:
Nicely done! Please share, if you don't mind.

Thanks. Here you go: https://github.com/therealsark02/dasm6502/blob/main/dasm.s
Final size is 924 bytes.


Posted: Wed Aug 21, 2024 4:54 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Thanks!


Posted: Thu Aug 22, 2024 10:26 pm
Joined: Tue Sep 26, 2023 11:09 am
Posts: 97
Great to see some interest!

Here's my work in progress.

tl;dr: 540 bytes code+tables if we can treat the extension bit operators RMBn, SMBn, BBRn, and BBSn as NOP with equivalent size/address mode. 604 bytes if you want those disassembled too.

RMBn and friends are a complete pain since they follow a different set of patterns than all the other opcodes. I haven't tried to optimize that part too much yet but getting to 512 bytes does seem tough :-)

The breakdown for the 540 byte version is 285 bytes of code, 192 bytes of mnemonic data (labels, lookups and indices) and 63 bytes of address mode lookup and formatting data. There'd be a nice symmetry if only I could lose 30 bytes of code... Suggestions welcome!

Here's some sample output from self-disassembly:

Code:
0200   A4 00       LDY $00
0202   A5 01       LDA $01
0204   20 E5 02    JSR $02E5
0207   20 13 03    JSR $0313
020A   B2 00       LDA ($00)
020C   A2 20       LDX #$20
020E   DD F0 03    CMP $03F0,X
0211   F0 08       BEQ $021B
0213   E8          INX
0214   E0 2C       CPX #$2C
0216   D0 F6       BNE $020E
0218   29 1F       AND #$1F
021A   AA          TAX
021B   8A          TXA
021C   4A          LSR
021D   AA          TAX
021E   BD FA 03    LDA $03FA,X
0221   90 04       BCC $0227
0223   4A          LSR
0224   4A          LSR
0225   4A          LSR
0226   4A          LSR
0227   29 0F       AND #$0F
0229   AA          TAX



I don't think there's anything too novel here: I packed 3:2 for mnemonic labels, and was pleased to learn that I'd reinvented something similar to Baum & Woz's operand formatting template, with a bitmask to include characters from a string. I'm not sure why they split it into two strings of pairs, but I immediately borrowed their idea of inserting the operand at a fixed location and not representing it explicitly in the template.
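
For readers unfamiliar with the template idea, here is one way it could work, sketched in Python. The template string, the bit order of the mask, and the mask values are all invented for this example and are not the actual tables from this disassembler or from Baum & Woz. Each addressing mode is an 8-bit mask choosing characters from one shared string, and the operand digits always go in at the same position.

```python
TEMPLATE = "#($,X),Y"  # hypothetical shared character pool, 8 characters
OPERAND_AT = 3         # operand digits are emitted just before TEMPLATE[3]


def format_operand(mask: int, operand_hex: str) -> str:
    """Emit the template characters selected by mask (MSB = first character),
    inserting the operand text at the fixed position."""
    out = []
    for i, ch in enumerate(TEMPLATE):
        if i == OPERAND_AT:
            out.append(operand_hex)
        if mask & (0x80 >> i):
            out.append(ch)
    return "".join(out)


# Example masks under the assumptions above.
IMMEDIATE  = 0xA0  # "#$nn"
DIRECT     = 0x20  # "$nn" or "$nnnn"
INDEXED_X  = 0x38  # "$nn,X"
INDEXED_Y  = 0x23  # "$nn,Y"
INDIRECT   = 0x64  # "($nn)"
INDIRECT_X = 0x7C  # "($nn,X)"
INDIRECT_Y = 0x67  # "($nn),Y"
```

For instance, `format_operand(INDIRECT_X, "79")` yields `($79,X)`. One byte per mode plus one eight-character string covers all of these formats.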

For mapping opcodes to mnemonics I made extensive use of this opcode layout explainer. It led to a really efficient scheme of matching bitmasks to group opcode slices based on their five least significant bits, along with a few duplicated mnemonics and some special cases in a lookup table that don't fit the pattern. This seems different than the largely code-based slicing done by B&W.
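
A very small Python sketch of the slicing idea follows. It is only an illustration built from three well-known columns of the opcode map, not the actual mask tables used here: opcodes that share the same five low bits often form a row of eight mnemonics indexed by the top three bits.

```python
# Keyed by opcode & 0x1F; each row is indexed by opcode >> 5.
SLICES = {
    0x05: ["ORA", "AND", "EOR", "ADC", "STA", "LDA", "CMP", "SBC"],  # zero page ALU ops
    0x10: ["BPL", "BMI", "BVC", "BVS", "BCC", "BCS", "BNE", "BEQ"],  # conditional branches
    0x18: ["CLC", "SEC", "CLI", "SEI", "TYA", "CLV", "CLD", "SED"],  # flag ops plus TYA
}


def mnemonic(opcode: int):
    """Return the mnemonic if the opcode falls in one of the sample slices, else None."""
    row = SLICES.get(opcode & 0x1F)
    return row[opcode >> 5] if row else None
```

Opcodes such as BRA ($80) do not fit any slice like this, which is where the special-case lookup entries mentioned above come in.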

I've tried a bunch of different schemes for address mode decoding, and while the current one works it still feels a bit of a mess. Only 63 bytes of data but various special cases in the code so maybe still some juice to squeeze.

Overall my experience has been the same as sark02 where any smarter/smaller data encoding almost inevitably led to a bigger assembly increase for decoding on the other side, like trying to squash a balloon. Learning a lot about the benefit of single-path calculations vs conditionals, and lookups vs code but no doubt many improvements still to be made. Definitely feels like a practical lesson in Kolmogorov complexity!


Posted: Thu Aug 22, 2024 10:49 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8390
Location: Midwestern USA
pdragon wrote:
RMBn and friends are a complete pain since they follow a different set of patterns than all the other opcodes.

Not necessarily, if the syntax for those instructions is correctly implemented.  For example...

Code:
RMB 2,$80

...is the syntax for clearing bit 2 at zero-page location $80.  Similarly...

Code:
BBS 4,$90,$1234

...is the syntax for branching if bit 4 at zero-page location $90 is set.

Aberrations, such as...

Code:
BBS4 $90,$1234

...are incorrect, as 4 is an operand that represents the particular entity being tested—the instruction mnemonic is BBS.

I should note that the Rockwell data sheet contradicts itself on this; the mnemonic list says one thing and the opcode table says another.  In adherence to the original (and therefore authoritative) MOS Technology convention, which was adopted from the MC6800 assembly language, all mnemonics are three alpha characters.  Hence something like BBS4 would be wrong.

Attachment:
File comment: Rockwell 65C02 Data Sheet (1984)
65c02_1984_rockwell.pdf [13.12 MiB]

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Fri Aug 23, 2024 12:32 am
Joined: Tue Sep 26, 2023 11:09 am
Posts: 97
definitely makes more sense to think of n as an implied operand, that's essentially how i disassemble it.

They have a bit pattern like xaaby111 where xy selects between RMB, BBR, SMB and BBS and aab is the bit index operand.
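
In Python terms, decoding that xaaby111 pattern might look like the sketch below. The function names are hypothetical, and the output uses the "bit number as first operand" spelling discussed earlier in the thread.

```python
BIT_OPS = ("RMB", "BBR", "SMB", "BBS")  # indexed by the two bits xy


def decode_bit_op(opcode: int):
    """Return (mnemonic, bit number) for opcodes of the form xaaby111, else None."""
    if opcode & 0x07 != 0x07:
        return None
    x = opcode >> 7          # 0 = reset/clear family, 1 = set family
    y = (opcode >> 3) & 1    # 0 = memory bit op, 1 = branch on bit
    bit = (opcode >> 4) & 7  # the three bits "aab": bit index 0..7
    return BIT_OPS[(x << 1) | y], bit


def format_memory_bit_op(opcode: int, zp: int) -> str:
    """Format RMB/SMB as 'RMB 7,$29' (branch forms would also need a target)."""
    name, bit = decode_bit_op(opcode)
    return f"{name} {bit},${zp:02X}"
```

For example, opcode $77 with operand $29 comes out as `RMB 7,$29`, matching the `rmb7 $29` line in the sample output earlier in the thread.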

By painful I more meant they are exceptions to all the other rules: no other opcodes use two (let alone three) operands, the addressing mode is unique, and the grouping doesn't align with how the other opcodes are grouped (you'd otherwise expect xyabb tho I assume the choice was to align with the units hex digit?). Just means lots of special case code I guess.


Posted: Fri Aug 23, 2024 12:52 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8390
Location: Midwestern USA
pdragon wrote:
By painful I more meant they are exceptions to all the other rules: no other opcodes use two (let alone three) operands, the addressing mode is unique, and the grouping doesn't align with how the other opcodes are grouped (you'd otherwise expect xyabb tho I assume the choice was to align with the units hex digit?). Just means lots of special case code I guess.

Adding insult to injury, the Rockwell extensions are zero page only, which limits their utility.  They were developed to facilitate the use of the 65C02 in modem chipsets, which usually made chip registers visible in zero page in order to (theoretically) improve performance.

Back when I did 65C02 development, I never used RMB and SMB.  I found TRB and TSB generally more useful, as those two can manipulate more than one bit at a time, aren’t confined to zero page and can also tell you the state of the affected memory cell before it is modified.  That Bill Mensch decided to not implement the Rockwell extensions in the 65C816 (their opcodes are, instead, 24-bit absolute addressing instructions), but did keep TRB and TSB, suggests what he thought of BBR, BBS et al.

Incidentally, the 65C816's MVN and MVP instructions take two comma-separated operands.  When I wrote Supermon 816, I had to make an exception to normal operand parsing to accommodate those two.

Posted: Fri Aug 23, 2024 5:03 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
pdragon wrote:
I've tried a bunch of different schemes for address mode decoding, and while the current one works it still feels a bit of a mess. Only 63 bytes of data but various special cases in the code so maybe still some juice to squeeze.

I think Woz would be proud. Glancing through your work, you appear to have equaled or surpassed my code-golfing skill level.

Posted: Fri Aug 23, 2024 9:18 am
Joined: Tue Nov 10, 2015 5:46 am
Posts: 228
Location: Kent, UK
pdragon wrote:
Here's my work in progress.

tl;dr: 540 bytes code+tables if we can treat the extension bit operators RMBn, SMBn, BBRn, and BBSn as NOP with equivalent size/address mode. 604 bytes if you want those disassembled too.

Fantastic! Best to include all the opcodes from the WDC 65C02 manual, for an apples-to-apples comparison, but you're trouncing me decisively. Nice work!


Posted: Fri Aug 23, 2024 3:08 pm
Joined: Mon Jan 19, 2004 12:49 pm
Posts: 844
Location: Potsdam, DE
Just a thought: does a base-36 output routine take less space than the triple five-bit approach?

Neil

