pdragon wrote:
if you use an 8-bit analogue of the djb2 hash where you keep a running 'sum' k and for each character circular rotate k 5 bits left and then xor with the next character, you get a single byte key which is unique over these 70 three-letter mnemonics.
Are you sure about that? I tried it and got collisions for a number of them (for example, $40 for LSR, PHA, and TXS).
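For anyone who wants to check this themselves, here is a rough Python sketch of the hash as I read the description. The starting value of k and the letter case aren't specified in the quoted post, so the `seed` parameter and the use of uppercase ASCII are my assumptions, and the exact hash values will change with them. The function names are just for illustration.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def rot_xor_hash(mnemonic, seed=0):
    """8-bit rotate-and-xor hash: rotate k left 5 bits, then xor in the next character.

    seed is an assumption; the quoted description does not give a starting value.
    """
    k = seed & 0xFF
    for ch in mnemonic:
        k = (rol8(k, 5) ^ ord(ch)) & 0xFF
    return k


def find_collisions(mnemonics, seed=0):
    """Group mnemonics by hash and return only the groups with more than one entry."""
    buckets = {}
    for m in mnemonics:
        buckets.setdefault(rot_xor_hash(m, seed), []).append(m)
    return {h: ms for h, ms in buckets.items() if len(ms) > 1}
```

Feed `find_collisions` the full mnemonic list and an empty result means the keys are unique for that seed and case; anything else lists the clashes.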
Collisions are OK though. I experimented with a very simple 6502-friendly hash (shift current hash left one bit, xor next character, 128 entry table). If there's a collision while building the table I bump the existing item to the next slot. Lookups are done by comparing the string in each slot until it is either found or an empty slot is found. The worst case is INX, which has a hash of $60 but ends up in slot $67. Items that aren't in the table can take a lot longer, but they're errors so their speed is less important.
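Roughly, in Python rather than 6502 code, the scheme looks like the sketch below. It is a simplification under a few assumptions: the hash starts at zero, mnemonics are uppercase ASCII, and on a collision the incoming item simply goes to the next free slot (plain linear probing), which may place entries differently from my actual table. The names `shift_xor_hash`, `build_table` and `lookup` are made up for this example.

```python
TABLE_SIZE = 128  # power of two, so the hash can be reduced with a mask


def shift_xor_hash(mnemonic):
    """Shift the running hash left one bit, xor in the next character, keep 7 bits."""
    h = 0  # assumed starting value
    for ch in mnemonic:
        h = ((h << 1) ^ ord(ch)) & 0xFF  # stay within one byte, as on a 6502
    return h & (TABLE_SIZE - 1)


def build_table(mnemonics):
    """Place each mnemonic at its hash slot, stepping forward past occupied slots.

    Assumes fewer than TABLE_SIZE distinct mnemonics, so a free slot always exists.
    """
    table = [None] * TABLE_SIZE
    for m in mnemonics:
        slot = shift_xor_hash(m)
        while table[slot] is not None:
            slot = (slot + 1) % TABLE_SIZE
        table[slot] = m
    return table


def lookup(table, mnemonic):
    """Return the slot holding mnemonic, or None once an empty slot is reached."""
    slot = shift_xor_hash(mnemonic)
    for _ in range(TABLE_SIZE):
        if table[slot] is None:
            return None
        if table[slot] == mnemonic:
            return slot
        slot = (slot + 1) % TABLE_SIZE
    return None
```

With this sketch `shift_xor_hash("INX")` comes out as $60, matching the figure above; which slot it finally lands in depends on the full mnemonic list and the order they are inserted.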
This would help an assembler, not a disassembler, and it helps speed rather than size, so would perhaps be more suited to its own thread if anyone wants to continue.
That disassembler is incredible. The operand packing is particularly genius, and the code is surprisingly readable for something so intricate.