All times are UTC




Posted: Mon Mar 21, 2016 8:49 pm

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
I was looking for a simple 6502 disassembler a couple days ago and couldn't really find what I wanted.
I found a great assortment of Windows-based disassemblers, but being a Linux user, I generally keep away from Windows. I did find some disassembler code that was happy to compile, but the result appeared to target a specific assembler. All I wanted was a standard disassembly like the 'd' command in the old 6502 monitor programs. Perhaps I didn't look hard enough.

So yesterday I wrote one. It handles the MOS 6502 instruction set only (though an update to 65c02 should be quite simple) and outputs a standard disassembly like the old monitors.
Attached along with the source is a compiled version for Windows users (command line).

The source is shorter than some of the 6502 ASM code posted around here, so here it is in all its ugliness.
I really don't do much C coding (next to none) so there might be some silliness in the code that needs pointing out to me.

Code:
/* d6502 v0.4 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

    int main(int argc, char **argv) {

        FILE *file;
        char *buffer;
        unsigned long fileLen;
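        int address = 0;                                                //Default start address when none is given on the command line
        unsigned long i;                                                //Same type as fileLen to avoid a signed/unsigned comparison
        int currentbyte = 0;                                            //Initialised so the first "previousbyte = currentbyte" reads a defined value
        int previousbyte;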
        int paramcount;
        int addrmode;
        char *opcode;
        char *pad;
        char *pre;
        char *post;

     // Padding for 1,2 & 3 byte instructions
        char *padding[3] = {"        ","    ",""};

     // 56 instructions + one unused placeholder ("ROT", index 41, kept so the table indexes below stay valid) + Undefined ("???")
        char *instruction[58] = {
     //       0     1     2     3     4     5     6     7     8     9
            "ADC","AND","ASL","BCC","BCS","BEQ","BIT","BMI","BNE","BPL", // 0
            "BRK","BVC","BVS","CLC","CLD","CLI","CLV","CMP","CPX","CPY", // 1
            "DEC","DEX","DEY","EOR","INC","INX","INY","JMP","JSR","LDA", // 2
            "LDX","LDY","LSR","NOP","ORA","PHA","PHP","PLA","PLP","ROL", // 3
            "ROR","ROT","RTI","RTS","SBC","SEC","SED","SEI","STA","STX", // 4
            "STY","TAX","TAY","TSX","TXA","TXS","TYA","???"};            // 5

     // This is a lookup of the text formatting required for mode output, plus one entry (index 8) to distinguish relative mode
        char *modes[9][2]={{"",""},{"#",""},{"",",X"},{"",",Y"},{"(",",X)"},{"(","),Y"},{"(",")"},{"A",""},{"",""}};

     // Opcode Properties for 256 opcodes {length_in_bytes, mnemonic_lookup, mode_chars_lookup}
        int opcode_props[256][3] = {
     //         0        1        2        3        4        5        6        7        8        9        A        B        C        D        E        F
     //     ******** -------- ******** -------- ******** -------- ******** -------- ******** -------- ******** -------- ******** -------- ******** --------
            {1,10,0},{2,34,4},{1,57,0},{1,57,0},{1,57,0},{2,34,0},{2,2,0}, {1,57,0},{1,36,0},{2,34,1},{1,2,7}, {1,57,0},{1,57,0},{3,34,0},{3,2,0}, {1,57,0}, // 0
            {2,9,8}, {2,34,5},{1,57,0},{1,57,0},{1,57,0},{2,34,2},{2,2,2}, {1,57,0},{1,13,0},{3,34,3},{1,57,0},{1,57,0},{1,57,0},{3,34,2},{3,2,2}, {1,57,0}, // 1
            {3,28,0},{2,1,4}, {1,57,0},{1,57,0},{2,6,0}, {2,1,0}, {2,39,0},{1,57,0},{1,38,0},{2,1,1}, {1,39,7},{1,57,0},{3,6,0}, {3,1,0}, {3,39,0},{1,57,0}, // 2
            {2,7,8}, {2,1,5}, {1,57,0},{1,57,0},{1,57,0},{2,1,2}, {2,39,2},{1,57,0},{1,45,0},{3,1,3}, {1,57,0},{1,57,0},{1,57,0},{3,1,2}, {3,39,2},{1,57,0}, // 3
            {1,42,0},{2,23,4},{1,57,0},{1,57,0},{1,57,0},{2,23,0},{2,32,0},{1,57,0},{1,35,0},{2,23,1},{1,32,7},{1,57,0},{3,27,0},{3,23,0},{3,32,0},{1,57,0}, // 4
            {2,11,8},{2,23,5},{1,57,0},{1,57,0},{1,57,0},{2,23,2},{2,32,2},{1,57,0},{1,15,0},{3,23,3},{1,57,0},{1,57,0},{1,57,0},{3,23,2},{3,32,2},{1,57,0}, // 5
            {1,43,0},{2,0,4}, {1,57,0},{1,57,0},{1,57,0},{2,0,0}, {2,40,0},{1,57,0},{1,37,0},{2,0,1}, {1,40,7},{1,57,0},{3,27,6},{3,0,0}, {3,40,0},{1,57,0}, // 6
            {2,12,8},{2,0,5}, {1,57,0},{1,57,0},{1,57,0},{2,0,2}, {2,40,2},{1,57,0},{1,47,0},{3,0,3}, {1,57,0},{1,57,0},{1,57,0},{3,0,2}, {3,40,2},{1,57,0}, // 7
            {1,57,0},{2,48,4},{1,57,0},{1,57,0},{2,50,0},{2,48,0},{2,49,0},{1,57,0},{1,22,0},{1,57,0},{1,54,0},{1,57,0},{3,50,0},{3,48,0},{3,49,0},{1,57,0}, // 8
            {2,3,8}, {2,48,5},{1,57,0},{1,57,0},{2,50,2},{2,48,2},{2,49,3},{1,57,0},{1,56,0},{3,48,3},{1,55,0},{1,57,0},{1,57,0},{3,48,2},{1,57,0},{1,57,0}, // 9
            {2,31,1},{2,29,4},{2,30,1},{1,57,0},{2,31,0},{2,29,0},{2,30,0},{1,57,0},{1,52,0},{2,29,1},{1,51,0},{1,57,0},{3,31,0},{3,29,0},{3,30,0},{1,57,0}, // A
            {2,4,8}, {2,29,5},{1,57,0},{1,57,0},{2,31,2},{2,29,2},{2,30,3},{1,57,0},{1,16,0},{3,29,3},{1,53,0},{1,57,0},{3,31,2},{3,29,2},{3,30,3},{1,57,0}, // B
            {2,19,1},{2,17,4},{1,57,0},{1,57,0},{2,19,0},{2,17,0},{2,20,0},{1,57,0},{1,26,0},{2,17,1},{1,21,0},{1,57,0},{3,19,0},{3,17,0},{3,20,0},{1,57,0}, // C
            {2,8,8}, {2,17,5},{1,57,0},{1,57,0},{1,57,0},{2,17,2},{2,20,2},{1,57,0},{1,14,0},{3,17,3},{1,57,0},{1,57,0},{1,57,0},{3,17,2},{3,20,2},{1,57,0}, // D
            {2,18,1},{2,44,4},{1,57,0},{1,57,0},{2,18,0},{2,44,0},{2,24,0},{1,57,0},{1,25,0},{2,44,1},{1,33,0},{1,57,0},{3,18,0},{3,44,0},{3,24,0},{1,57,0}, // E
            {2,5,8}, {2,44,5},{1,57,0},{1,57,0},{1,57,0},{2,44,2},{2,24,2},{1,57,0},{1,46,0},{3,44,3},{1,57,0},{1,57,0},{1,57,0},{3,44,2},{3,24,2},{1,57,0}  // F
        };

        if (argc < 2) {                                                 //If no parameters given, display usage instructions and exit.
            fprintf(stderr, "Usage: %s filename address\n\n", argv[0]);
            fprintf(stderr, "Example: %s dump.rom E000\n", argv[0]);
            exit(1);
        }

        if (argc == 3) {
            address = (int) strtol(argv[2], NULL, 16);                  //If second parameter, accept it as HEX address for start of disassembly.
        }

        file = fopen(argv[1], "rb");                                    //Open file
        if (!file) {
            fprintf(stderr, "Can't open file %s\n", argv[1]);           //Error if file not found
            exit(1);
        }

        fseek(file, 0, SEEK_END);                                       //Seek to end of file to find length
        fileLen = ftell(file);
        fseek(file, 0, SEEK_SET);                                       //And back to the start

        buffer = (char * ) malloc(fileLen + 1);                         //Set up file buffer

        if (!buffer) {                                                  //If memory allocation error...
            fprintf(stderr, "Memory allocation error!\n");              //...display message...
            fclose(file);                                               //...and close file
            exit(1);
        }

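        if (fread(buffer, 1, fileLen, file) != fileLen) {               //Read entire file into buffer and...
            fprintf(stderr, "Error reading file %s\n", argv[1]);        //...report a short read...
            fclose(file);
            free(buffer);
            exit(1);
        }
        fclose(file);                                                   //...close file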

        paramcount = 0;
        printf("                  * = $%04X \n", address);              //Display org address

        for (i = 0; i < fileLen; ++i) {                                 //Start processing loop.
            previousbyte = currentbyte;
            currentbyte = ((unsigned char * ) buffer)[i];
            if (paramcount == 0) {
                printf("$%04X   ", address);                            //Display current address at beginning of line
                paramcount = opcode_props[currentbyte][0];              //Get instruction length
                opcode = instruction[opcode_props[currentbyte][1]];     //Get opcode name
                addrmode = opcode_props[currentbyte][2];                //Get info required to display addressing mode
                pre = modes[addrmode][0];                               //Look up pre-operand formatting text
                post = modes[addrmode][1];                              //Look up post-operand formatting text
                pad = padding[(paramcount - 1)];                        //Calculate correct padding for output alignment
                address = address + paramcount;                         //Increment address
            }
            if (paramcount != 0)                                        //Keep track of position within instruction
                paramcount = paramcount - 1;
            printf("$%02X ", currentbyte);                              //Display the current byte in HEX
            if (paramcount == 0) {
                printf(" %s %s %s", pad, opcode, pre);                  //Pad text, display instruction name and pre-operand chars
                if(!strcmp (pad,"    " )) {                             //Check if single operand instruction
                    if (addrmode != 8) {                                //If not using relative addressing ...
                        printf("$%02X", currentbyte);                   //...display operand
                    } else {                                            //Addressing mode is relative...
                        printf("$%04X", (address + ((currentbyte < 128) ? currentbyte : currentbyte - 256)) & 0xFFFF); //...display relative address, wrapped to 16 bits.
                    }
                }
                if(!strcmp (pad,"" ))                                   //Check if two operand instruction and if so...
                    printf("$%02X%02X", currentbyte, previousbyte);     //...display operand
                printf("%s\n", post);                                   //Display post-operand chars
            }
        }
        printf("$%04X                .END\n", address);                 //Add .END directive to end of output
        free(buffer);                                                   //Return buffer memory to the system
        return 0;                                                       //All done, exit to the OS
    }


Attachments:
d6502.zip [6.7 KiB]
Downloaded 709 times


Last edited by Cray Ze on Tue Mar 22, 2016 6:36 pm, edited 9 times in total.

Posted: Mon Mar 21, 2016 9:28 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
That's quick work, and the result is certainly short and sweet! Seems to be working too - one way to test it would be to find a way to reassemble the output, or compare it with a listing from assembling the input, and using a test file with all the instructions used.
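One way to build such a test file is sketched below. It is only an illustration, not something posted in the thread: the function names and the 768-byte layout are assumptions. Each opcode value from $00 to $FF is followed by two NOP bytes ($EA), so a 1-, 2- or 3-byte instruction always ends on an opcode or NOP boundary and the disassembly stays in step whatever length the table reports. Operand values are all $EA, so this checks mnemonics and addressing modes rather than operand formatting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NOP_OPCODE 0xEA            /* 6502 NOP, a one-byte instruction */
#define TEST_IMAGE_SIZE (256 * 3)  /* one opcode plus two NOPs, 256 times */

/* Fill `image` with every opcode value 0x00..0xFF, each followed by two
   NOP bytes. Returns the number of bytes written (TEST_IMAGE_SIZE). */
size_t build_test_image(unsigned char *image, size_t capacity)
{
    size_t n = 0;
    assert(image != NULL && capacity >= TEST_IMAGE_SIZE);
    for (int op = 0; op < 256; ++op) {
        image[n++] = (unsigned char) op;
        image[n++] = NOP_OPCODE;
        image[n++] = NOP_OPCODE;
    }
    return n;
}

/* Write the image to `path` (a hypothetical file name chosen by the
   caller). Returns 0 on success, -1 on failure. */
int write_test_image(const char *path)
{
    unsigned char image[TEST_IMAGE_SIZE];
    size_t len = build_test_image(image, sizeof image);
    FILE *out = fopen(path, "wb");
    if (!out)
        return -1;
    size_t written = fwrite(image, 1, len, out);
    if (fclose(out) != 0 || written != len)
        return -1;
    return 0;
}
```

The resulting file can be fed to two disassemblers and the listings compared line by line.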

I got a couple of warnings when compiling, not sure if there was any ill effect:
Code:
warning: result of comparison against a string literal is unspecified (use strncmp instead)

It doesn't like comparing pad with a literal string on lines 112 and 119.
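For context, a compiler emits this warning when a `char *` is compared to a string literal with `==`, which compares addresses rather than contents. A minimal sketch of content-based checks follows; the helper names are hypothetical and are not part of d6502, and the padding strings are the ones from the `padding` array in the listing.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 when `pad` holds the four-space padding used for two-byte
   instructions. strcmp compares the characters, not the pointers. */
int is_two_byte_padding(const char *pad)
{
    assert(pad != NULL);
    return strcmp(pad, "    ") == 0;
}

/* Returns 1 when `pad` is the empty padding used for three-byte
   instructions. Testing the first character gives the same result
   as strcmp(pad, "") == 0. */
int is_three_byte_padding(const char *pad)
{
    assert(pad != NULL);
    return pad[0] == '\0';
}
```

Comparing the instruction length directly would avoid string comparison altogether, at the cost of keeping the length in a separate variable.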


Posted: Mon Mar 21, 2016 10:35 pm

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
BigEd wrote:
That's quick work, and the result is certainly short and sweet! Seems to be working too - one way to test it would be to find a way to reassemble the output, or compare it with a listing from assembling the input, and using a test file with all the instructions used.

I've yet to find an assembler that isn't confused by the address and instruction bytes being present. Maybe I'm not looking hard enough again; I might have to write one.
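A possible workaround is to strip the address and byte columns before handing the listing to an assembler. The sketch below is an assumption-laden illustration (the function name is made up): it relies on the d6502 output format shown in the first post, where the address and every raw byte start with `$` and the mnemonic is the first token that does not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return a pointer to the first whitespace-separated token in `line`
   that does not start with '$'. In d6502 output that is the mnemonic
   (or "*" / ".END"), because the address column and the raw-byte
   columns all start with '$'. Operands after the mnemonic are kept. */
const char *skip_listing_columns(const char *line)
{
    assert(line != NULL);
    const char *p = line;
    for (;;) {
        while (*p != '\0' && isspace((unsigned char) *p))
            ++p;                    /* skip blanks between columns */
        if (*p != '$')
            return p;               /* mnemonic, directive or end of line */
        while (*p != '\0' && !isspace((unsigned char) *p))
            ++p;                    /* skip one "$XXXX" or "$XX" token */
    }
}
```

Printing the returned pointer for each line of the listing leaves only the assembler-style source text.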

BigEd wrote:
I got a couple of warnings when compiling, not sure if there was any ill effect:
Code:
warning: result of comparison against a string literal is unspecified (use strncmp instead)

It doesn't like comparing pad with a literal string on lines 112 and 119.

Thanks for that. Interestingly, gcc didn't give me any warnings at all, but it should be fixed now. New archive attached (I also fixed the code block in the initial post).


Attachments:
d6502.zip [6.75 KiB]
Downloaded 360 times
Posted: Mon Mar 21, 2016 10:51 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
I ran a comparison against the run6502 disassembler, using Klaus' test suite as a testcase, and found just one important difference:
6A ROT A
should be
6A ROR A

One small difference is that branch target addresses are not zero-padded to be always 4 hex digits, and a second is that run6502 puts a $ in front of all hex constants. Those are cosmetic, of course.

Good job!


Posted: Tue Mar 22, 2016 12:07 am

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
BigEd wrote:
I ran a comparison against the run6502 disassembler, using Klaus' test suite as a testcase, and found just one important difference:
6A ROT A
should be
6A ROR A

Oops, the entry at 6A in the opcode table should be {1,40,7}, not {1,41,7}. Fixed in the initial post again.

BigEd wrote:
One small difference is that branch target addresses are not zero-padded to be always 4 hex digits, and a second is that run6502 puts a $ in front of all hex constants. Those are cosmetic, of course.

The lack of zero padding was a bug, fixed now as well. I'm not sure about the $ though; it seems to vary across monitors and disassemblers. Lots of the old-school monitor programs left them off everything but mnemonics, likely due to limited screen width. I'll have to either settle on a format or provide some option switches.
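The branch-target arithmetic and the zero padding can be sketched in isolation as follows. This is an illustrative version with hypothetical function names, not the code from the archive: the operand byte is treated as a signed 8-bit offset from the address of the next instruction, the result wraps to 16 bits, and `%04X` always prints four hex digits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Compute the destination of a 6502 relative branch.
   `next_address` is the address of the instruction after the branch;
   `offset` is the raw operand byte, read as a signed 8-bit value. */
uint16_t branch_target(uint16_t next_address, uint8_t offset)
{
    int displacement = (offset < 128) ? offset : offset - 256;
    /* Unsigned arithmetic wraps, and the cast reduces modulo 65536. */
    return (uint16_t) (next_address + (unsigned) displacement);
}

/* Format the target as '$' plus exactly four hex digits, e.g. "$0005". */
void format_branch_target(char out[6], uint16_t next_address, uint8_t offset)
{
    assert(out != NULL);
    snprintf(out, 6, "$%04X", (unsigned) branch_target(next_address, offset));
}
```

For example, a branch at $E000 with operand $FE has its next instruction at $E002 and targets $E000, that is, itself.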


Attachments:
d6502.zip [6.76 KiB]
Downloaded 395 times
Posted: Tue Mar 22, 2016 8:13 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Thanks for the quick fix! I suspect a disassembler can grow and grow, which isn't necessarily the best thing. A quick and minimal one which is easy to tweak could be very useful. (I'm sure there are more than the few presently listed at http://6502.org/tools/asm/, in particular there's at least one which tries to find all the reachable code in an image, and emits labels for target addresses.)


Posted: Tue Mar 22, 2016 9:44 am

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
There are some enhancements that won't cause too much growth. I'll update it with some options to tweak the formatting, along with (selectable) 65c02 support.
I've already been thinking about various ways to walk the code and generate labels, though I'll leave that to the next round.
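As a rough idea of what a first label-collecting pass could look like, here is a sketch under several assumptions: it walks the image linearly (so embedded data will be misread as code), it handles only relative branches, and `length_fn`, `toy_length` and `mark_branch_targets` are hypothetical names. A real version would take instruction lengths from the full opcode table rather than the toy stand-in used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: length in bytes (1..3) of the instruction
   that starts with `opcode`. */
typedef int (*length_fn)(uint8_t opcode);

/* Toy stand-in for a real length table: only branches, LDA # ($A9)
   and JMP absolute ($4C) are given their true lengths. */
int toy_length(uint8_t opcode)
{
    if ((opcode & 0x1F) == 0x10 || opcode == 0xA9)
        return 2;
    if (opcode == 0x4C)
        return 3;
    return 1;
}

/* Walk `code` and set is_target[k] = 1 for every offset k that a
   relative branch jumps to. The eight 6502 conditional branches are
   the opcodes whose low five bits are 10000 ($10, $30, ... $F0).
   Returns the number of distinct targets found inside the image. */
size_t mark_branch_targets(const uint8_t *code, size_t len,
                           length_fn length_of, uint8_t *is_target)
{
    size_t labels = 0;
    assert(code != NULL && length_of != NULL && is_target != NULL);
    for (size_t k = 0; k < len; ++k)
        is_target[k] = 0;
    for (size_t pc = 0; pc < len; ) {
        uint8_t opcode = code[pc];
        int length = length_of(opcode);
        if (length < 1 || length > 3)
            length = 1;                     /* guard against a bad callback */
        if ((opcode & 0x1F) == 0x10 && pc + 1 < len) {
            int offset = code[pc + 1];
            if (offset >= 128)
                offset -= 256;              /* sign-extend the operand */
            long target = (long) pc + 2 + offset;
            if (target >= 0 && (size_t) target < len && !is_target[target]) {
                is_target[target] = 1;
                ++labels;
            }
        }
        pc += (size_t) length;
    }
    return labels;
}
```

A second pass could then print a label line before each marked offset and use the label name in place of the numeric branch target.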


Posted: Tue Mar 22, 2016 9:45 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Could you perhaps leave the simple minimal one as-is and attach a second fully featured one?


Posted: Tue Mar 22, 2016 10:48 am

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
Yes, I can leave this one as is, or make a final update with the $ turned on if it's preferable (screen width not being an issue these days).
It's at a great point for tweaking now as it's still small enough to easily understand.


Posted: Tue Mar 22, 2016 3:27 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
It might be good to put the dollar in - after all it's easy for the user to strip it out, in an editor or using the command line, whereas inserting it after the fact would be much more tricky.
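To illustrate how little work the stripping direction takes, here is a small sketch of a filter function (the name is hypothetical) that removes every `$` from a line in place. Going the other way would require knowing which tokens are hex constants.

```c
#include <assert.h>
#include <stddef.h>

/* Remove every '$' from `text` in place and return the new length. */
size_t strip_dollars(char *text)
{
    assert(text != NULL);
    char *dst = text;
    for (const char *src = text; *src != '\0'; ++src) {
        if (*src != '$')
            *dst++ = *src;          /* keep everything except '$' */
    }
    *dst = '\0';
    return (size_t) (dst - text);
}
```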


Posted: Tue Mar 22, 2016 6:47 pm

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
BigEd wrote:
It might be good to put the dollar in - after all it's easy for the user to strip it out, in an editor or using the command line, whereas inserting it after the fact would be much more tricky.

Okay, dollar symbols enabled, new archive attached and initial post updated.


Attachments:
d6502.zip [6.76 KiB]
Downloaded 476 times
Posted: Tue Mar 22, 2016 6:49 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10985
Location: England
Thanks!


Posted: Tue Mar 22, 2016 10:46 pm

Joined: Sun Jun 29, 2014 5:42 am
Posts: 352
Here's another (possibly) even more minimal one:
https://github.com/hoglet67/AtomBusMon/ ... /dis6502.c

This one had to fit inside the block RAM of an XC3S250E FPGA (it's part of ICE T65).

The original code was part of Tom Walker's Atomulator (Acorn Atom Emulator).

Dave


Posted: Wed Mar 23, 2016 5:48 am

Joined: Tue Jul 24, 2012 2:27 am
Posts: 679
radare can do commandline binary file disassembly.

rasm2 -a 6502 -D -B -o starting-address -f filename

Lowercase -d to get just the instructions.

_________________
WFDis Interactive 6502 Disassembler
AcheronVM: A Reconfigurable 16-bit Virtual CPU for the 6502 Microprocessor


Posted: Wed Mar 23, 2016 6:28 am

Joined: Sat May 02, 2015 6:59 pm
Posts: 134
hoglet wrote:
Here's another (possibly) even more minimal one:
https://github.com/hoglet67/AtomBusMon/ ... /dis6502.c
Wow, nice, that one does look very small. Thanks for the link, a nice project you have there.
It might be possible to reduce it down even further, I'll have a play later and see what's possible.

