6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




PostPosted: Sat Feb 03, 2018 7:48 pm

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
Yes, it's yet another language/compiler for the 6502 of dubious usefulness.

This has been a pet project of mine for quite a few years. I finally doubled down and started seriously developing it last summer. This week I got the code into a state that is hopefully usable by someone other than myself and put it on GitHub.

You can find it at https://github.com/RevCurtisP/C02.

The alpha version of the compiler is written in C, but as the code is currently a hacked-together mess, I plan to do a complete rewrite. I may do that rewrite in C02, so it can be a native compiler.

I've made a number of design decisions that cause this language to diverge from C in some unusual ways, some of which I am questioning, so I plan to ask for advice and/or opinions.

I plan to post some example code in the next couple days so that you don't have to read through all the documentation.


PostPosted: Sat Feb 03, 2018 8:17 pm

Joined: Wed Jan 08, 2014 3:31 pm
Posts: 578
Thanks for posting. I will take a look.

A self-hosted version of this would be really neat.


PostPosted: Sat Feb 03, 2018 9:23 pm

Joined: Sat Jun 04, 2016 10:22 pm
Posts: 483
Location: Australia
This looks like an interesting project. Something like C, and in the spirit of C, but targeted at 8-bit devices instead of 16 or wider.
I agree with Martin_H. Being able to have it self-host could be very useful.

It might be interesting to test the efficiency of its compiled code against another C compiler that targets the 6502. Cc65 is one, I think.


PostPosted: Sun Feb 04, 2018 1:17 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
DerTrueForce wrote:
I agree with Martin_H. Being able to have it self-host could be very useful.

Then I will definitely proceed with this as a goal.

DerTrueForce wrote:
It might be interesting to test the efficiency of its compiled code against another C compiler that targets the 6502. Cc65 is one, I think.

I've never actually used cc65. I'm downloading it now to generate some sample source code.


PostPosted: Sun Feb 04, 2018 2:09 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
I compiled a Hello World program in both CC65 and C02, targeting the VIC 20.

The C code used with CC65 compiled to 777 bytes total.
Code:
#include <stdio.h>
#include <stdlib.h>
const char text[] = "hello world!\n";
int main (void)
{
    puts(text);
    return EXIT_SUCCESS;
}

The C02 code compiled to 188 bytes.
Code:
#include "include/vic20.h02"
#include <stdio.h02>
char text = "HELLO WORLD!";
main:
    putln(&text);
    goto exit;


PostPosted: Sun Feb 04, 2018 2:17 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
This was an incredibly trivial example and the generated code was actually very similar.

CC65 .s file
Code:
_text:
   .byte   $48,$45,$4C,$4C,$4F,$20,$57,$4F,$52,$4C,$44,$21,$0D,$00

   lda     #<(_text)
   ldx     #>(_text)
   jsr     _puts
   ldx     #$00
   txa
   rts

C02 .asm file
Code:
MAIN:   LDY #>TEXT       ;MAIN: PUTLN(&TEXT
        LDX #<TEXT       
        JSR PUTLN        ;);
        JMP EXIT         ;GOTO EXIT;
TEXT:   DC  $48,$45,$4C,$4C,$4F,$20,$57,$4F,$52,$4C,$44,$21,$00

so in this case the size difference must come from the included library.


PostPosted: Sun Feb 04, 2018 2:48 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
One of the goals of C02 is to be completely system agnostic and independent of any specific system libraries.

For example, I've created a header file that contains just a VIC 20 BASIC stub and either hooks or aliases functions to the routines built into the ROM.

This code compiled to a total of 60 bytes.
Code:
#include "include/vic20bas.h02"
char text = "HELLO WORLD!";
main:
    strout(&text);
    chrout($0D);
    goto exit;

This is what the generated code looks like.
Code:
CHROUT  EQU $FFD2 ;Output Character to Channel

;Machine Language Basic Stub
        ORG $1001              ;Start
BASIC:  DC  $0C, $10           ; Pointer to Next Line (4108)
        DC  $00, $00           ; Line Number (0)
        DC  $9E                ; SYS
        DC  $20                ; ' '
        DC  $34, $31, $31, $30 ; "4110"
        DC  $00                ;End of Line Marker
        DC  $00, $00           ;End of Basic Program

START:  TSX         ;Get Stack Pointer
        STX STKPTR  ;and Save for Exit
        JMP MAIN    ;Execute Program

STKPTR: DS 1

EXIT:   LDX STKPTR  ;Retrieve Saved Stack Pointer
        TXS         ;and Restore It
        RTS         ;Return to BASIC

;System Routine Hooks
STROUT: TXA         ;Move Low Byte from X to A
        JMP $CB1E   ;Print String at Address in Y and A

MAIN:   LDY #>TEXT       ;MAIN: STROUT(&TEXT
        LDX #<TEXT       
        JSR STROUT       ;);
        LDA #$0D         ;CHROUT($0D
        JSR CHROUT       ;);
        JMP EXIT         ;GOTO EXIT;
TEXT:   DC  $48,$45,$4C,$4C,$4F,$20,$57,$4F,$52,$4C,$44,$21,$00
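The SYS target in the stub is not arbitrary: the stub is 13 bytes long, so with a load address of $1001 the first instruction after it sits at $100E, which is 4110 decimal. A minimal C sketch of that arithmetic follows. It is only an illustration and not part of C02; the name `build_basic_stub` and the four-digit restriction are assumptions made for the example.

```c
#include <stdint.h>

#define STUB_LEN 13  /* size of the one-line BASIC stub in the listing above */

/* Fill out[] with a "0 SYS nnnn" BASIC line for a program loaded at
   load_addr. Returns the SYS target (the first address after the stub),
   or 0 if that address is not exactly four decimal digits long. */
uint16_t build_basic_stub(uint16_t load_addr, uint8_t out[STUB_LEN])
{
    uint32_t target = (uint32_t)load_addr + STUB_LEN;
    uint16_t next_line = (uint16_t)(target - 2);  /* end-of-program marker */
    uint32_t t = target;
    int i;

    if (target < 1000 || target > 9999)
        return 0;

    out[0] = (uint8_t)(next_line & 0xFF);   /* pointer to next line, low byte */
    out[1] = (uint8_t)(next_line >> 8);     /* pointer to next line, high byte */
    out[2] = 0x00;                          /* line number 0, low byte */
    out[3] = 0x00;                          /* line number 0, high byte */
    out[4] = 0x9E;                          /* SYS token */
    out[5] = 0x20;                          /* space */
    for (i = 3; i >= 0; i--) {              /* four decimal digits of target */
        out[6 + i] = (uint8_t)('0' + t % 10);
        t /= 10;
    }
    out[10] = 0x00;                         /* end of line marker */
    out[11] = 0x00;                         /* end of BASIC program */
    out[12] = 0x00;
    return (uint16_t)target;
}
```

Calling `build_basic_stub(0x1001, buf)` reproduces the 13 bytes of the listing and returns 4110.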


PostPosted: Sun Feb 04, 2018 4:30 pm

Joined: Mon May 12, 2014 6:18 pm
Posts: 365
This sounds really neat! Could you see how compiled code compares when you do loops and calculations? How about something along these lines:
Code:
int i,j,result;
for (i=0;i<10;i++)
{
   for (j=0;j<10;j++)
   {
      result=(i*10+j)*9/5+32;
      printf("%d\n",result);
   }
}


PostPosted: Sun Feb 04, 2018 5:54 pm

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
Druzyek wrote:
This sounds really neat! Could you see how compiled code compares when you do loops and calculations? How about something along these lines:
Code:
int i,j,result;
for (i=0;i<10;i++)
{
   for (j=0;j<10;j++)
   {
      result=(i*10+j)*9/5+32;
      printf("%d\n",result);
   }
}

Actually, this is where C02 diverges from regular C. Because C02 is designed to compile directly to 6502 assembly, variables are limited to a single byte and mathematical operators are limited to the available machine operations. And, because of the limits of the addressing modes, the stack is not used during mathematical operations. As a result, multiply and divide aren't available as normal operators, and functions may only appear as the first term of an argument, although functions may be nested.

Basically, the parser converts the expression into an LDA or JSR(s) followed by a series of ADC, SBC, AND, ORA, and EOR instructions.

To profile the above code, I will need to make the numbers a bit smaller so that interim results will fit in the range 0-255, and refactor the expression to work within the limits of C02. The parameters to printf are also reversed because of C02's calling convention for functions. In addition, escapes in strings are handled literally, so '\n' will not produce a newline on most 6502 systems. The current solution is to call the system-specific newlin() function.
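To see why the numbers have to shrink, it helps to find the largest interim value the expression reaches. The C sketch below does that with ordinary wide arithmetic. It is only an illustration; the function names are made up for the example, and it assumes the multiplier of `i` equals the loop bound, as it does in both versions of the test.

```c
/* Largest value reached at any step of (i*n + j)*mul/dv + 32
   for 0 <= i, j < n, computed in wide arithmetic so nothing wraps. */
unsigned max_intermediate(unsigned n, unsigned mul, unsigned dv)
{
    unsigned worst = 0;
    unsigned i, j, k;

    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            unsigned steps[4];
            steps[0] = i * n;               /* i*n            */
            steps[1] = steps[0] + j;        /* i*n + j        */
            steps[2] = steps[1] * mul;      /* (i*n + j)*mul  */
            steps[3] = steps[2] / dv + 32;  /* .../dv + 32    */
            for (k = 0; k < 4; k++)
                if (steps[k] > worst)
                    worst = steps[k];
        }
    }
    return worst;
}

/* Nonzero if every step fits in a single unsigned byte. */
int fits_in_byte(unsigned n, unsigned mul, unsigned dv)
{
    return max_intermediate(n, mul, dv) <= 255;
}
```

With the original numbers, `max_intermediate(10, 9, 5)` is 891, which does not fit in a byte; with the reduced ones, `max_intermediate(5, 3, 5)` is 72.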

So this is what the C02 code ends up looking like
Code:
char i,j,result;
for (i=0;i<5;i++) {
   for (j=0;j<5;j++) {
      result=div(mult(mult(i,5)+j,3),5)+32; //(i*5+j)*3/5+32;
      printf(result,"%d");
      newlin();
   }
}

and the equivalent C code is now
Code:
char i,j,result;
for (i=0;i<5;i++) {
   for (j=0;j<5;j++) {
      result=(i*5+j)*3/5+32;
      printf("%d\n",result);
   }
}

The resulting assembly will be in the following reply.


PostPosted: Sun Feb 04, 2018 6:05 pm

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
For the loops, C02 produces the code
Code:
        LDA #$00         ;MAIN: FOR (I=0
        STA I            ;;
L_0000: LDA I            ;I
        CMP #$05         ;<5
        BCC L_0003       
        JMP L_0001       ;;
L_0002: INC I            ;I++)
        JMP L_0000       
L_0003: LDA #$00         ;{ FOR (J=0
        STA J            ;;
L_0004: LDA J            ;J
        CMP #$05         ;<5
        BCC L_0007       
        JMP L_0005       ;;
L_0006: INC J            ;J++)
        JMP L_0004       
L_0007: ;body of for loops
L_0005: JMP L_0002       ;}
L_0001: ;continue code

and CC65 produces
Code:
   ldx     #$00
   lda     #$00
   sta     _i
L0002:   ldx     #$00
   lda     _i
   cmp     #$05
   jsr     boolult
   jne     L0005
   jmp     L0003
L0005:   ldx     #$00
   lda     #$00
   sta     _j
L000A:   ldx     #$00
   lda     _j
   cmp     #$05
   jsr     boolult
   jne     L000D
   jmp     L0004
L000D: ;body of for loops
L0004:   ldx     #$00
   lda     _i
   inc     _i
   jmp     L0002
L0003:   ldx     #$00
   lda     #$00
   jmp     L0001
L0001: ;continue with program
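The C02 listing earlier in this post puts each loop's increment ahead of its body and jumps over it on entry. The same control flow can be written in C with gotos, reusing the label names from the listing. This is a sketch only: the function name is hypothetical, and the jump from the end of the body back to L_0006 is not shown in the listing, so it is an assumption here.

```c
/* Mirrors the branch layout of the C02 listing for the nested loops.
   Returns how many times the loop body runs. */
unsigned count_body_runs(unsigned char limit)
{
    unsigned char i, j;
    unsigned runs = 0;

    i = 0;                        /* LDA #$00 / STA I        */
L_0000:
    if (i < limit) goto L_0003;   /* LDA I / CMP / BCC       */
    goto L_0001;                  /* JMP L_0001 (loop done)  */
L_0002:
    i++;                          /* INC I                   */
    goto L_0000;
L_0003:
    j = 0;                        /* LDA #$00 / STA J        */
L_0004:
    if (j < limit) goto L_0007;   /* LDA J / CMP / BCC       */
    goto L_0005;                  /* JMP L_0005 (inner done) */
L_0006:
    j++;                          /* INC J                   */
    goto L_0004;
L_0007:
    runs++;                       /* body of for loops       */
    goto L_0006;                  /* assumed: not in listing */
L_0005:
    goto L_0002;                  /* JMP L_0002              */
L_0001:
    return runs;
}
```

For the test above, `count_body_runs(5)` gives 25, the same as the two nested `for` loops.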


PostPosted: Sun Feb 04, 2018 6:13 pm

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
For the expression evaluation and subsequent assignment, we get the following:

From C02 (28 bytes)
Code:
        LDA I            ;{ RESULT=DIV(MULT(MULT(I
        LDY #$05         ;,5
        JSR MULT         ;)
        CLC              ;+J
        ADC J           
        LDY #$03         ;,3
        JSR MULT         ;)
        LDY #$05         ;,5
        JSR DIV          ;)
        CLC              ;+32
        ADC #$20         
        STA RESULT       ;;

and from CC65 (40 bytes)
Code:
   ldx     #$00
   lda     _i
   jsr     mulax5
   jsr     pushax
   ldx     #$00
   lda     _j
   jsr     tosaddax
   jsr     mulax3
   jsr     pushax
   ldx     #$00
   lda     #$05
   jsr     tosudivax
   ldy     #$20
   jsr     incaxy
   ldx     #$00
   sta     _result

As you can see, C02 translates the code to 6502 operations as closely as possible, while CC65 calls a subroutine for every operation, and it promotes the values to 16 bits and then casts them back to 8 bits.
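The 16-bit promotion follows from standard C, where operands narrower than `int` are widened before arithmetic. The trade-off shows up when an interim result exceeds 255. The C sketch below contrasts the two behaviours; the function names are made up for the example, and the divide by two merely stands in for a call such as div() in C02.

```c
#include <stdint.h>

/* Standard C: a and b are promoted to int, so the sum cannot wrap. */
uint8_t average_promoted(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) / 2);
}

/* Byte-only model: the sum is cut back to 8 bits before the divide,
   as a one-byte accumulator would hold it if the carry is ignored. */
uint8_t average_byte_only(uint8_t a, uint8_t b)
{
    uint8_t sum = (uint8_t)(a + b);
    return (uint8_t)(sum / 2);
}
```

For 200 and 100 the promoted version returns 150, while the byte-only version returns 22, because 300 wraps to 44 before the divide. For small inputs such as 10 and 20 both return 15.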


PostPosted: Sun Feb 04, 2018 6:16 pm

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
Finally, there is the printing code:

C02 produces
Code:
        LDA RESULT       ;PRINTF(RESULT
        LDY #>L_0008     ;,"%d"
        LDX #<L_0008     
        JSR PRINTF       ;);
        JSR NEWLIN       ;NEWLIN();

L_0008: DC  $25,$64,$00 


and CC65 produces
Code:
   lda     #<(L0016)
   ldx     #>(L0016)
   jsr     pushax
   ldx     #$00
   lda     _result
   jsr     pushax
   ldy     #$04
   jsr     _printf

L0016:
   .byte   $25,$64,$0A,$00


PostPosted: Sun Feb 04, 2018 8:33 pm

Joined: Sat Jun 04, 2016 10:22 pm
Posts: 483
Location: Australia
I'm no expert, but that looks like C02 is a winner, certainly for code density.
One thing that I noticed is that there are no shift or rotate commands in C02. Is this because there is no way to shift multiple bits in one 6502 instruction?

EDIT: I just checked the docs again, and I see that there are bit-shift instructions. No rotates, but that does make sense now that I think about it, since the 6502 rotates through the carry bit.
I still think I'd like to use C02 in my project, especially if it runs on the 'C02 as well.

Speaking of which, the name might be a touch awkward, due to the use of 'C02 (with the apostrophe) to refer to the 65C02, at least around here. I'm not saying you have to change the name (and I think it's a good one), but might "C-02" work? I think it would differentiate it from the local slang a little more clearly.


PostPosted: Mon Feb 05, 2018 1:47 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
DerTrueForce wrote:
I still think I'd like to use C02 in my project, especially if it runs on the 'C02 as well.

The generated code is generic 6502 code that uses neither illegal opcodes nor decimal mode, so it should run on any 6502 derivative. I plan to add a switch to generate 65C02-optimized code (particularly the BRA instruction).

As long as you don't need 16-bit math, it should work pretty well.

DerTrueForce wrote:
Speaking of which, the name might be a touch awkward, due to the use of 'C02 (with the apostrophe) to refer to the 65C02, at least around here. I'm not saying you have to change the name (and I think it's a good one), but might "C-02" work? I think it would differentiate it from the local slang a little more clearly.

The name was intended to be a pun on CO₂ (carbon dioxide), and I was also considering a variant for the 8080/Z80 called C80.

At the very least I should rename the compiler cc02, to match cc, gcc, tcc, etc.


PostPosted: Mon Feb 05, 2018 1:56 am

Joined: Thu Feb 10, 2011 3:14 am
Posts: 79
DerTrueForce wrote:
One thing that I noticed is that there are no shift or rotate commands in C02. Is this because there is no way to shift multiple bits in one 6502 instruction?

EDIT: I just checked the docs again, and I see that there are bit-shift instructions. No rotates, but that does make sense now that I think about it, since the 6502 rotates through the carry bit.
I still think I'd like to use C02 in my project, especially if it runs on the 'C02 as well.


There are multiple-bit shift functions in stdlib, which replicate the << and >> operators of C. Functions to do rotates are also possible, depending on what you want to do.
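For anyone unfamiliar with why these are library functions rather than operators: the 6502's ASL and LSR move one bit per instruction, and ROL and ROR rotate through the carry flag, so a shift by several bits is a loop. The C sketch below models that behaviour. The names are hypothetical and are not the actual C02 stdlib function names.

```c
#include <stdint.h>

/* Shift left by count bits, one bit per step as ASL does. */
uint8_t shift_left8(uint8_t value, uint8_t count)
{
    while (count--)
        value = (uint8_t)(value << 1);
    return value;
}

/* Shift right by count bits, one bit per step as LSR does. */
uint8_t shift_right8(uint8_t value, uint8_t count)
{
    while (count--)
        value = (uint8_t)(value >> 1);
    return value;
}

/* One ROL: rotate left through carry. *carry must hold 0 or 1 on
   entry; on exit it holds the bit that was shifted out of bit 7. */
uint8_t rol_through_carry(uint8_t value, uint8_t *carry)
{
    uint8_t result = (uint8_t)((value << 1) | (*carry & 1));
    *carry = (uint8_t)(value >> 7);
    return result;
}
```

Because the rotate passes through the carry, it is effectively a 9-bit rotate, which is why it does not map onto a plain 8-bit operator.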

The intention of the language is that the fiddly parts will be written in assembler and called as functions, while the general program flow and overall logic are written in C02.

