6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Post new topic Reply to topic  [ 27 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Fri Nov 16, 2018 9:54 pm 
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
I'm new here and glad to connect with other 6502 enthusiasts. I've long wanted to learn 6502 assembly and finally decided to "just start." I want to learn x86 assembly, too, at some future point. I expect 6502 assembly will be a lot easier to master given its much smaller instruction set, so I'm starting here as a "primer" on how to learn other assembly languages. Besides, all of my favorite systems from my childhood ran a 6502.

On that note, I would very much love to get good enough at this to engage in reverse engineering and "source code archaeology" of some of my favorite C64 games to better understand how they work. That's entirely aspirational; right now I'm struggling with "Hello World."

I found a working example of "Hello World" for the C64 and decided to change it and "make it my own." After struggling with several possible assemblers, starting with ca65, I eventually settled on 64tass, though ACME was a very close second.

I started by taking that GitHub snippet and determining to understand every part of every line of code. So far, I've done fairly well. After consulting a C64 memory map, I replaced some raw values with human friendly constant names and made sure the character position code could be easily incremented; originally it just skipped a location in lieu of encoding a space character.

With all of that said, here are my questions:

  • How do I get tass to simulate a loop for sta VICSCN+n until I run out of characters in "Hello World" using my current code as a starting point?
  • In this version of the code, why am I (confusingly) stuck with using screen codes instead of PETSCII character codes? That fact really threw me for a bit.
  • When I set *=$0810 my code "just works." A lot of example code I see online uses *=$1000 instead which never seems to work as intended when assembled with tass. Why might that be? I'll test again, but I'm 99% sure I changed the SYS call to the correct memory location when I attempted to move the *= (program counter?) to 1000.

Also, please feel free to leave any (constructive) comments that seem relevant. I'm here to learn. I want to write good ASM code.

Code:
; 64TASS
; hello.asm

KERNAL_CLEAR_SCREEN = $e544 ; KERNAL ROM routine.
VICSCN = $0400              ; VIC-II Screen Video Matrix, 1024 (int).


; BASIC loader.
*=$0801             ; The two byte load address at the start of the .PRG file.
    .byte $0b, $08  ; Linked list pointer to next line of BASIC.
    .byte $d9, $07  ; 2009 (int) line number (LO, HI).
    .byte $9e       ; BASIC SYS token.
    .text "2064"    ; Memory address (int) to start of ASM: $0810


; ASM code.
*=$0810 ; The start of ASM execution.

      jsr KERNAL_CLEAR_SCREEN

      ; Enter HELLO WORLD into screen memory ($0400-$07e7) (1024-2023).
      lda #8         ; 'H' Screen Code
      sta VICSCN + 0 ; 'H' in $0400 screen memory
            
      lda #5         ; 'E' Screen Code
      sta VICSCN + 1 ; 'E' in $0400+1 screen memory

      lda #12         ; 'L' Screen Code
      sta VICSCN + 2 ; 'L' in $0400+2 screen memory
      
      lda #12         ; 'L' Screen Code
      sta VICSCN + 3 ; 'L' in $0400+3 screen memory

      lda #15         ; 'O' Screen Code
      sta VICSCN + 4 ; 'O' in $0400+4 screen memory

      lda #32        ; ' ' Screen Code
      sta VICSCN + 5 ; ' ' in $0400+5 screen memory

      lda #23         ; 'W' Screen Code
      sta VICSCN + 6 ; 'W' in $0400+6 screen memory
      
      lda #15         ; 'O' Screen Code
      sta VICSCN + 7 ; 'O' in $0400+7 screen memory
      
      lda #18         ; 'R' Screen Code
      sta VICSCN + 8 ; 'R' in $0400+8 screen memory
      
      lda #12         ; 'L' Screen Code
      sta VICSCN + 9 ; 'L' in $0400+9 screen memory
      
      lda #4         ; 'D' Screen Code
      sta VICSCN + 10; 'D' in $0400+10 screen memory
      
      rts            ; Return from Subroutine (RTS) hands control back to BASIC.



PostPosted: Fri Nov 16, 2018 10:27 pm
Joined: Thu Jan 21, 2016 7:33 pm
Posts: 282
Location: Placerville, CA
load81 wrote:
Also, please feel free to leave any (constructive) comments that seem relevant. I'm here to learn. I want to write good ASM code.

A laudable goal - unfortunately, whoever wrote that example you found apparently didn't share it :/ You're absolutely correct in figuring that a loop would be a less silly way of accomplishing this.

Fortunately, the 6502 has features enough that it's relatively easy to write a loop that indexes across a block of memory. Specifically, the X and Y registers are primarily intended as index registers - so you can, say, LDA string-address, X and STA screen-address, X which would be roughly equivalent to screen[x] = string[x] in a high-level language. Then, simply by incrementing the X register, you're ready to move on to the next character.

The other important question is how you want to terminate the loop - and, by extension, how you want to terminate the string. There are two main ways to do that: C-style strings (which end with a zero byte) and Pascal-style strings (which begin with a string-length count before the actual text). Both are valid strategies with their own advantages and disadvantages: C-string code is vulnerable to improperly formed strings where the zero terminator is missing, while Pascal strings are limited in length by the size of the length variable (i.e. if the length is stored in one byte, the string can only be up to 255 characters long).

Both are relatively easy to code on the 6502, because most operations set the Z flag if the result is zero, and conditional-branch instructions are available to jump someplace if the Z flag is set (BEQ) or if it's clear (BNE.) So you can easily detect the end of a C string right after loading the next string byte (loading the terminator will set the Z flag,) or you can do a Pascal string by loading the length value into one of the index registers (or a memory location, though this will be slower) and decrementing it after every pass through the loop (the final decrement will set the Z flag.)
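
To illustrate just the flag behaviour described above, without giving away the loop itself, here is a minimal sketch in 64tass syntax. The origin $c000 and the label names are assumptions picked for illustration; the only point is which branches are taken.

Code:
        * = $c000       ; assumed origin, start with SYS 49152

        ldx #$01
        dex             ; X goes from 1 to 0, so the Z flag is set
        beq x_is_zero   ; taken because Z is set
        nop             ; never reached
x_is_zero
        lda #$41        ; loading a non-zero value clears the Z flag
        bne a_nonzero   ; taken because Z is clear
        nop             ; never reached
a_nonzero
        rts             ; back to the caller (BASIC)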

I'll leave the actual implementation as an exercise, but that's the basics of it.

P.S. the reason the example uses screen codes instead of PETSCII is because it's writing directly to the C64 screen memory, and the character ROM the VIC-II uses is laid out in a different order than the PETSCII map, for some peculiar reason. The C64 Kernal routines for writing to the screen normally do the translation automatically, but if you're just dropping values into the tilemap, you need to account for that bit yourself.
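
If typing screen codes by hand gets tedious, 64tass can probably do the translation at assembly time. The sketch below assumes the predefined "screen" encoding of 64tass, a source file assembled without the -a switch, and the default upper-case/graphics character set; the origin $c000 and the label message are made up for illustration. It copies only the first character, so the loop is still left open.

Code:
VICSCN  = $0400         ; start of screen memory

        * = $c000       ; assumed origin, start with SYS 49152

        lda message     ; first byte of the text, already a screen code
        sta VICSCN      ; top-left corner of the screen
        rts

        .enc "screen"   ; translate the following text to screen codes
message .text "HELLO WORLD"
        .enc "none"     ; back to the default (untranslated) encoding

With this in place the 'H' should be stored as 8 rather than its PETSCII value of 72, which matches the hand-written lda #8 in the original listing.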

P.P.S. the reason the example loads at $0801 is because this is the start of the BASIC program area on the C64, and the bit at the start of the program is a BASIC stub program that launches the actual machine-language program when you load and run it. If you put that data at $1000 instead, BASIC won't find it and the stub won't work. There's no particular reason to load at $1000 anyway other than that it's a nice round number in hexadecimal.


PostPosted: Sat Nov 17, 2018 5:39 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Welcome load81!

As noted, your hello world starting point is aiming to be as simple as possible as a first example. A second example would surely use a loop, and within the loop some form of indexed addressing. Have a look around for other 6502 code examples: you don't need to copy them but you can still learn from their techniques. The easy6502 site is one place to look.


PostPosted: Sat Nov 17, 2018 6:35 pm
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
Thanks for the warm welcome, both of you.

I eventually figured out why repointing the program counter for the start of assembly execution was failing. I updated the program counter with the *= directive right before my ASM code, but failed to update the memory location targeted by the SYS token! Oops. So, repointing the PC to $1000 (4096) failed as a result.

I figured out my mistake, and it's working now. I also learned that I can save a few bytes by rolling the PC back to $080d (2061), which still leaves room for the three zero bytes BASIC needs to know it has reached EOF.
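
One way to avoid the stale-SYS-address mistake altogether is to let the assembler derive the digits from a label. This is a sketch based on the BASIC-stub idiom in the 64tass manual, as far as I understand it; the label name start and the line number 10 are my own choices.

Code:
        * = $0801
        .word (+), 10                    ; link to next BASIC line, line number 10
        .null $9e, format("%d", start)   ; SYS token, address digits, end-of-line $00
+       .word 0                          ; end-of-program marker ($00, $00)

start   rts                              ; lands at $080d, right after the stub

Because the digits come from the label, moving the code with a later *= should update the SYS target automatically. The stub itself is still 12 bytes, so start ends up at $080d as before.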

Experimenting in the other direction taught me that the PC iterating over empty bytes isn't "free." A noticeable delay can result if the start of your code is far from the start of BASIC, assuming a SYS loader.

I'll rewrite my "hello world" code as a loop and post it here -- thanks commodorejohn for pointing me in the right direction without simply pasting working code in his reply.


PostPosted: Sat Nov 17, 2018 7:08 pm
Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
In machine code, there is no such thing as an "empty byte". A zero, for example, is a BRK instruction which causes a software interrupt; I have no idea what that does by default on a C64. Some other values will actually cause an NMOS 6502 to lock up and stop executing until reset.

If you really need a "landing zone" with benign behaviour, fill it with $EA bytes, which are NOP instructions that do nothing in 2 cycles each.
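
A minimal sketch of such a landing zone in 64tass syntax, assuming the .fill directive takes a length and a fill byte; the origin, the size of 16 bytes and the label names are arbitrary choices for illustration.

Code:
        * = $c000           ; assumed origin, start with SYS 49152

landing .fill 16, $ea       ; 16 NOP instructions, 2 cycles each
entry   rts                 ; real code would start here

Jumping anywhere inside landing slides harmlessly forward into entry.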


PostPosted: Sat Nov 17, 2018 7:16 pm
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
Chromatix wrote:
In machine code, there is no such thing as an "empty byte". A zero, for example, is a BRK instruction which causes a software interrupt...


So I noticed when I disassembled my own code and got a bunch of BRK instructions whenever I encountered $00. I wrote some code, assembled it, and disassembled it by hand working from just a hex dump and an opcode table. I may be at the "hello world" stage, but I really want to get good at this and understand what is going on.

Now that you mention it, I was kind of surprised that $00 wasn't NOP...


PostPosted: Sat Nov 17, 2018 8:42 pm
Joined: Thu Jan 21, 2016 7:33 pm
Posts: 282
Location: Placerville, CA
load81 wrote:
Now that you mention it, I was kind of surprised that $00 wasn't NOP...

Yeah, you'd think. I suppose the logic is that if the PC ever ends up pointing to a region of zeroed-out memory, you want it to trap to the BRK handler on the theory that that's somewhere it shouldn't be executing from.


PostPosted: Sat Nov 17, 2018 8:51 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
rosettacode.org has a few nice examples, including:

https://rosettacode.org/wiki/Hello_worl ... 2_Assembly

Maybe when you gain some experience, you can contribute there as well.

Happy programming!



PostPosted: Mon Nov 19, 2018 9:11 pm
Joined: Fri May 05, 2017 9:27 pm
Posts: 895
Chromatix wrote:
In machine code, there is no such thing as an "empty byte". A zero, for example, is a BRK instruction which causes a software interrupt; I have no idea what that does by default on a C64.

The default system and interrupt vectors are restored and the I/O devices initialized, then the I/O channels are cleared. BASIC's warm start routine is run and the READY prompt displayed.


PostPosted: Mon Nov 19, 2018 10:25 pm
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
Oh, that makes sense. If the application BRK's, dump the user back to BASIC by default.


PostPosted: Tue Nov 20, 2018 5:28 pm
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
After working through the first two chapters of Machine Language for the Commodore 64 and Other Computers I managed to work out some code that I'm reasonably happy with. My only complaint is that I was going for Pascal-style strings but I wasn't able to get my code to count down to zero with dex as expected, so I counted up and it "just worked." I'm sure I'll figure out my decrementing to zero mistake soon enough. "Despise not humble beginnings," I suppose.

Code:
;hello_world.asm
;64tass assembler

CHROUT        = $ffd2 ; C64 ROM routine, outputs a single PETSCII character.
BASIC_LOADER  = $0801 ; The start of BASIC
ASM_CODE      = $080d ; The start of ASM code to be executed.


*=BASIC_LOADER
    .byte $0b, $08   ; Linked list pointer to next line of BASIC.
    .byte $0a, $00   ; Line number 10 stored as (LO, HI) bytes.
    .byte $9e        ; BASIC SYS token.
    .text "2061"     ; The location of ASM as an integer stored as a string.
    .byte $0, $0, $0 ; End of BASIC line ($00), then end-of-program marker ($00, $00).


*=ASM_CODE     
    ldx #$00         ; Register X holds the counter.
    lda $081b,x      ; Load the character at string address $081b + X.
    jsr CHROUT       ; $ffd2
    inx              ; Register X++
    cpx #$0d         ; Compare X with the length of the string (13).
    bne $080f        ; WHILE X != $0d, loop back to the LDA at $080f.
    rts              ; Return to BASIC interpreter.

    .edef "{cls}", $93         ; Define "Clear Screen" escape sequence.
    .edef "\n", $0d            ; Define a C-style EOL escape sequence.
    .text "{cls}HELLO WORLD\n" ; "Hello World" string.



Last edited by load81 on Tue Nov 20, 2018 7:54 pm, edited 3 times in total.

PostPosted: Tue Nov 20, 2018 5:33 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
It would be better (safer and more readable) to make a label for your branch target.

Please do share your non-working dex code. It will surely be illuminating for you and for everyone to see what you missed. Learning how to teach must include learning about what stumbling blocks exist.


PostPosted: Tue Nov 20, 2018 9:56 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8513
Location: Midwestern USA
load81 wrote:
My only complaint is that I was going for Pascal-style strings but I wasn't able to get my code to count down to zero with dex as expected, so I counted up and it "just worked."

Null-terminated strings are generally more flexible than Pascal-style ones, as the former don't require a comparison during each loop iteration. Your print subroutine will "know" when the end-of-string has been encountered simply because loading the null ($00) into the accumulator will set the Z flag in the status register. Hence your code will be similar to the following:

Code:
         ldy #0                ;starting index
;
loop0010 lda string,y          ;read from string
         beq eos               ;end-of-string null encountered
;
         jsr bsout             ;($FFD2) output byte
         iny                   ;bump index &...
         bne loop0010          ;repeat
;
eos      rts                   ;return to caller
;
;
;null-terminated character string to print...
;
string   .byte "Blah, blah, blah",$00

To make the above more general in nature and also capable of handling strings longer than 255 bytes, you should set up a zero page pointer so any character string can be printed. For example:

Code:
;improved string printing function
;
;   calling syntax:
;
;            ldx #<string  ;string address LSB
;            ldy #>string  ;string address MSB
;            jsr sprint    ;print string
;
;   registers used: .A & .Y
;
sprint   stx zpptr             ;set pointer LSB
         sty zpptr+1           ;set pointer MSB
         ldy #0                ;starting index
;
loop0010 lda (zpptr),y         ;read from string
         beq eos               ;end-of-string encountered
;
         jsr bsout             ;($FFD2) output byte
         iny                   ;bump index &...
         bne loop0010          ;repeat
;
         inc zpptr+1           ;bump MSB & ...
         bne loop0010          ;repeat
;
eos      rts                   ;return to caller
;
;
;null-terminated character string to print...
;
string   .byte "Blah, blah, blah",$00

The above is similar to the sprint function I use in my programs. String length may be up to 65,535 bytes, although such a long string is not practical. :D

BigEd wrote:
It would be better (safer and more readable) to make a label for your branch target.

Ed is 100 percent correct. Good programming style assigns labels to all targets (branch, jump, etc.) and also avoids the use of "magic" numbers in the body of the code, e.g., cpx #$0d. Instead, such constants should be assigned to symbols, such as a_cr = $0d (assigns the ASCII value of a carriage return to the symbol a_cr).
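
As a rough sketch of that advice applied to the counted loop posted earlier (64tass syntax assumed): the names PET_CLS, PET_CR, next_char, message and msg_len are invented for illustration, and the origin $c000 is an arbitrary choice that avoids the BASIC stub.

Code:
CHROUT  = $ffd2             ; KERNAL: print the PETSCII character in .A
PET_CLS = $93               ; PETSCII clear-screen code
PET_CR  = $0d               ; PETSCII carriage return

        * = $c000           ; assumed origin, start with SYS 49152

        ldx #0              ; index of the next character
next_char
        lda message,x       ; fetch character number X
        jsr CHROUT          ; print it
        inx                 ; advance the index
        cpx #msg_len        ; printed every character yet?
        bne next_char       ; no, keep going
        rts                 ; yes, return to the caller

message .text PET_CLS, "HELLO WORLD", PET_CR
msg_len = * - message       ; length computed by the assembler

Neither the branch target nor the string length is hard-coded any more, so editing the text or moving the code should not require recounting bytes.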



PostPosted: Tue Nov 20, 2018 10:23 pm
Joined: Fri Nov 16, 2018 8:55 pm
Posts: 71
Wow, you're all super helpful.

Someone asked for my non-working code. Here it is.

Code:
*=ASM_CODE     
    ldx #$0d         
    lda $081b,x     
    jsr CHROUT       
    dex              ; Decrement register
    cpx #$00         ; Compare X with $00
    bne STRING       ; WHILE the contents of $080f !=$00, loop.
    rts             

    .edef "{cls}", $93         
    .edef "\n", $0d           
    .text "{cls}HELLO WORLD\n" ; "Hello World" string.


What I end up getting with this code is my string printed backwards. My actual intent was to print my string in the correct direction but decrement the counter so that the string terminates when the counter is equal to 0. Chapter 2 of Jim Butterfield's book specifically calls out dex for this purpose, but obviously I'm doing something wrong here.

As for turning all of my raw memory values into constants, noted. Other than using the name of the sys call (in the case of calling a KERNAL routine), is there some sort of naming convention that serves as an informal standard for this kind of thing? I've tried reading ASM sources of some other projects and the constant and macro names are frequently rather cryptic.


PostPosted: Tue Nov 20, 2018 10:35 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Thanks for the code! Yes indeed, if you count down using the index as a counter, you'll be reading off the string backwards. You could work around that by defining your strings backwards in memory. I'm not sure what's usual. Maybe it's usual to count upwards for strings, but downward for general copies and multi-byte actions.
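
A sketch of the backwards-in-memory workaround, in 64tass syntax. The labels reversed and msg_len and the origin $c000 are invented for illustration. Note the -1 on the address: with X running from the length down to 1, it makes the loop read offsets 10 down to 0 instead of 11 down to 1.

Code:
CHROUT  = $ffd2             ; KERNAL: print the PETSCII character in .A

        * = $c000           ; assumed origin, start with SYS 49152

        ldx #msg_len        ; X = number of characters left (11)
next_char
        lda reversed-1,x    ; reads offsets 10, 9, ... 0
        jsr CHROUT          ; print it
        dex                 ; one fewer to go; sets Z when X hits 0
        bne next_char       ; loop until the count runs out
        rts

reversed .text "DLROW OLLEH" ; stored backwards, printed forwards
msg_len = * - reversed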

It's worth noting that your comparison with zero isn't needed: it's a common thing for a newcomer to 6502 to do, but once you've got a better handle on the way the status bits are set, you'll be confident that the Z flag will already be set by the DEX.
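
If the goal is to print forwards while still counting down, one option is to use both index registers: Y walks up through the text while X counts down, and the DEX alone decides the branch. This is only a sketch, assuming 64tass's .ptext directive (which should prepend a length byte) and relying on CHROUT leaving X and Y intact, as the earlier working loop already does for X; the labels and origin are made up.

Code:
CHROUT  = $ffd2             ; KERNAL: print the PETSCII character in .A

        * = $c000           ; assumed origin, start with SYS 49152

        ldx message         ; X = length byte of the Pascal-style string
        beq done            ; empty string, nothing to print
        ldy #1              ; first character sits after the length byte
next_char
        lda message,y       ; fetch the next character, front to back
        jsr CHROUT          ; print it
        iny                 ; move forward through the text
        dex                 ; count down; Z is set when X reaches 0
        bne next_char       ; no CPX #0 needed
done    rts

message .ptext "HELLO WORLD" ; length byte followed by the text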

As for what to call your variables and labels, that's a bit of an art. You want something short and descriptive. For example, at the top of a loop which acts on one character at a time, I might use the label NEXTCHAR or similar. Maybe ONECHAR.

(The old joke is that there are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.)

