6502.org Forum

All times are UTC




PostPosted: Wed Aug 12, 2020 2:50 am 

Joined: Wed Aug 12, 2020 2:30 am
Posts: 43
First off, I'm a professional C++/C# developer. So I guess I'm spoiled with high-level languages.

I have the following super simple program to draw bytes (made with CBM prg Studio) to screen memory.
After hours of searching and reading I'm no closer to a solution. I need to count to 1000 to keep track of
how many bytes I've written. I know I have to use a high/low byte format, but I don't have a clue how.

Any help please?

Code:
*=$c000
          ldx     #$00
          jsr     $e544 ; Clear the screen
LOOP
          lda     SCREEN1,x
          sta     $0400,x
          inx
          ;cpx     #$??? I need to count 1000 bytes here.
          ;bne     loop
          rts
SCREEN1
          BYTE    $66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$60,$60,$60,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$20,$66
          BYTE    $66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66,$66


PostPosted: Wed Aug 12, 2020 3:40 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8511
Location: Southern California
Welcome.

I'm not very knowledgeable in C64, nor am I a games person; but since the index registers are only 8 bits wide, with possible contents ranging from 0 to 255, and since 1000 can be divided evenly by four, how about using only the one byte for counting and only running the loop 250 times, for speed. So you'd have something like:

Code:
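        LDX  #250              ; 250 passes x 4 stores = 1000 bytes
loop:   LDA  SCREEN1-1,X       ; X runs 250..1, so the -1 reaches offsets 249..0
        STA  $400-1,X

        LDA  SCREEN1+250-1,X   ; second quarter
        STA  $400+250-1,X

        LDA  SCREEN1+500-1,X   ; third quarter
        STA  $400+500-1,X

        LDA  SCREEN1+750-1,X   ; fourth quarter
        STA  $400+750-1,X

        DEX                    ; sets the Z flag when X reaches 0
        BNE  loop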

That way you don't have to deal with a high count byte. The "-1" is there so we can count down and rely on the zero test that's an automatic, implied, built-in part of DEX (in effect a free CPX #0). X takes the values 250 down to 1, i.e., 250 different values, and when it gets decremented to 0, we drop through instead of looping one more time with X=0. Since the last value used is 1, we subtract 1 in the operands.

On second thought, since there are long stretches of the same value, you could further speed it up by running loops that don't re-load the accumulator between STA's. (I don't know what your priority is between speed and memory taken.)
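
A minimal sketch of that idea, assuming the screen has already been cleared to spaces (as the JSR $E544 in the original program does) and that only the solid top and bottom rows of $66 need drawing. The label "rows" is made up for illustration, and the two side columns and the three $60 cells in the second row are left out:

Code:
        LDA  #$66            ; load the fill value once
        LDX  #39             ; 40 columns, indexed 39 down to 0
rows:   STA  $400,X          ; top row of the screen
        STA  $400+960,X      ; bottom row (row 24, 24*40 = 960)
        DEX
        BPL  rows            ; falls through after X wraps from 0 to $FF

Counting down with BPL instead of BNE lets X=0 take part in the loop, so no "-1" is needed in the operands here.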

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Wed Aug 12, 2020 3:51 am

Joined: Sat Jun 04, 2016 10:22 pm
Posts: 483
Location: Australia
That's a good one, Garth. I hadn't thought of that.
Another way would be to do pointer arithmetic, using 16-bit increment and compare operations, and indirection instead of indexing.
The 65-series use little-endian format (low address = low byte) for multi-byte values.


PostPosted: Wed Aug 12, 2020 7:16 am

Joined: Tue Sep 03, 2002 12:58 pm
Posts: 325
A more literal version of what you were attempting is this:
Code:
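   LDA #$00
   STA COUNT        ; 16-bit byte counter = 0
   STA COUNT+1
   STA ADDR         ; ADDR (two zero-page bytes) = $0400, low byte first
   LDA #$04
   STA ADDR+1

   LDY #0
LOOP1
   LDA SCREEN1,Y
   STA (ADDR),Y

   INY
   BNE SKIP
   INC ADDR+1       ; Y wrapped: move the destination to the next page
SKIP
   INC COUNT        ; 16-bit increment of COUNT
   BNE SKIP2
   INC COUNT+1
SKIP2

   LDA COUNT        ; done when COUNT = 1000 = $03E8
   CMP #$E8
   BNE LOOP1
   LDA COUNT+1
   CMP #3
   BNE LOOP1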

(not tested, probably full of errors)

Since the 6502 is limited to working with 8 bits at a time, anything requiring more than 8 bits has to be split into multiple bytes. As you can see, this leads to some verbose and slow code.

More experienced 6502 coders recognise special cases where short-cuts can be taken. Garth's code is an example of this. No one would write my version: they'd write his.


PostPosted: Wed Aug 12, 2020 7:26 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Welcome Daniel! To the forum and the world of 6502 programming.

As an overview of the situation: you'd like to count, and you'd like to write to the corresponding address. That's two different things. If you happened to be accessing 256 bytes or fewer, you could combine the two actions seamlessly - more or less as you have, using X as both the counter and as an index register.

But as you note, your problem is bigger than 256. So, the simplest way to proceed is to have, on the one hand, a counter with more range, and on the other hand, to form addresses over a greater span. There are several ways to do both of those.

After the simplest approach is working, you can consider ways to combine the two functions. Some approaches will be smaller, some faster, and some easier to understand. There are usually many ways to tackle a problem. I'd recommend starting with simple approaches which you understand: the clever tricks, combinations of methods, and efficiencies can come later. Pretty much everyone programming at this low level can say that there's more they can learn.


PostPosted: Wed Aug 12, 2020 8:19 am

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
John West wrote:
(not tested, probably full of errors)

Yeah, I think you're only accessing the first 256 bytes of the SCREEN1 table. For this exact use case I think it's going to be hard to beat Garth's code for efficiency (on a 6510) because initializing and maintaining two indirect pointers is more work, and even self-modifying code doesn't seem like it would be a net win.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


PostPosted: Wed Aug 12, 2020 8:44 am

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8511
Location: Southern California
DanielS wrote:
So I guess I'm spoiled with high-level languages.

You can raise the level of assembly language quite a lot with macros. I don't recommend macros to totally new programmers, but you're probably well enough acquainted with the idea in other languages to jump in sooner than others might. Keep the URL http://wilsonminesco.com/StructureMacros/ marked for such a time. After an introduction to macros, it specifically gets into using macros to form program flow-control structures in 6502 assembly language. Out of all the ones I provide, the only two that are not nestable are CASE (like C's "switch...case") and the 16-bit FOR...NEXT shown below. They can be nested with structures of other types, but not with others of the same type, the way IF...ELSE...ENDIF, BEGIN...UNTIL, BEGIN...WHILE...REPEAT, BEGIN...AGAIN, FOR_X...NEXT_X, and FOR_Y...NEXT_Y can. The last 40% of the page on simple multitasking methods at http://wilsonminesco.com/multitask/ shows some extended examples of source code from my work using them; and the program-structures page of the 6502 treatise on stacks (plural, not just the page-1 hardware stack) gets more into the internal workings of such macros, at http://wilsonminesco.com/stacks/pgmstruc.html .

The indirection mentioned by DerTrueForce above could look something like the following. The source code is shorter and more understandable, but performance-wise my earlier post will be more efficient since it does not deal with 16-bit counters; also, indexing takes fewer cycles than indirection does, and incrementing or decrementing an index register takes fewer cycles than doing so to memory.
Code:
        PUT2 $400, in, dest
        FOR  source, SCREEN1, to, SCREEN1+1000
             LDA  (source)
             STA  (dest)
             INC2 dest
        NEXT source
Macro "PUT2" lays down the pair of LDA-STA's to put a 16-bit literal in a two-byte variable, in this case ZP variable "dest" which will point to where you want to copy your array to. It could be written with conditional assembly to optimize it, like using STZ for 65c02 when a byte is 00, and only loading the accumulator once if the high and low bytes match. Macro "FOR" lays down the code to initialize ZP variable "source" to point to SCREEN1's 16-bit starting address. The assembler (not the target system) holds the 16-bit final value for the code laid down by macro "NEXT" to test against. That code laid down by NEXT increments 16-bit ZP variable "source" and tests it against the desired final value, and branches up to the LDA unless the increment resulted in a match. ZP variable "dest" does not get incremented by NEXT though, which is why there's a separate line for that increment. INC2 is another macro which lays down the INC<low_byte>, BNE, INC<high_byte>, just to shorten the source code while still laying down the same code you would do by hand. If you wanted to, you could even have a macro COPY_IND (copy indirect) to copy one address to another, and put the pointers' addresses in the macro invocation line, something like
Code:
        COPY_IND  source, dest
to lay down the LDA(source), STA(dest) code shown above, further shortening the source code. You would end up with only five lines total, and every one of them would be a macro invocation. It would not look like assembly language.

Edit: I was forgetting that the NMOS 6502 does not allow indirection without indexing like the CMOS 65c02 does. For NMOS, you'd have to put 0 in Y, and do LDA (source),Y, and STA (dest),Y.
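
For reference, a hand-written NMOS-compatible equivalent of the macro version might look roughly like the following. This is only a sketch of the kind of code described above, not the actual output of the macros. The zero-page addresses $FB and $FD and the labels copy, dst_ok, and src_ok are assumptions chosen for illustration, and the #<(...) / #>(...) low-byte and high-byte syntax may need adjusting for a particular assembler.

Code:
source = $FB                     ; assumed free zero-page pair for the source pointer
dest   = $FD                     ; assumed free zero-page pair for the destination pointer

        LDA  #<$400              ; what PUT2 is described as doing: dest = $0400
        STA  dest
        LDA  #>$400
        STA  dest+1
        LDA  #<SCREEN1           ; FOR: source = SCREEN1
        STA  source
        LDA  #>SCREEN1
        STA  source+1
        LDY  #0                  ; NMOS has no (zp) mode, so index with Y = 0
copy:   LDA  (source),Y
        STA  (dest),Y
        INC  dest                ; INC2 dest: 16-bit increment
        BNE  dst_ok
        INC  dest+1
dst_ok: INC  source              ; NEXT: 16-bit increment of source...
        BNE  src_ok
        INC  source+1
src_ok: LDA  source              ; ...then compare it with the end address
        CMP  #<(SCREEN1+1000)
        BNE  copy
        LDA  source+1
        CMP  #>(SCREEN1+1000)
        BNE  copy

Written out this way, the cost of the 16-bit bookkeeping is easy to see: each byte copied pays for two memory increments and at least one compare.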

Michael Barry's idea of the self-modifying code might actually be a good one (assuming the code is going into RAM). (Obviously it doesn't need to be re-entrant.) I'm going to bed though, so I won't try it right now.

PostPosted: Wed Aug 12, 2020 1:14 pm

Joined: Wed Aug 12, 2020 2:30 am
Posts: 43
Thanks for the replies. I'll go through each.


PostPosted: Wed Aug 12, 2020 1:21 pm

Joined: Wed Aug 12, 2020 2:30 am
Posts: 43
My first computer, wayyy back many many moons ago, was a Tandy CoCo 3. By the time I got a C64 I was well versed in BASIC. A lot of games came with a small .bas file containing a single line. 10 sys49152 (that's an arbitrary number). I didn't know anything about machine language at the time so that single line of BASIC code fascinated me to no end. For several years I've wanted to dive into C64 assembly.


PostPosted: Wed Aug 12, 2020 3:44 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
It's a good journey. You might find the tutorial at easy6502 useful.


PostPosted: Thu Aug 13, 2020 5:29 am

Joined: Sun Nov 08, 2009 1:56 am
Posts: 395
Location: Minnesota
Quote:
You might find the tutorial at easy6502 useful.


I've actually been using that the last few days to fiddle with 16x16 bit division algorithms (okay, that's not what it's for - I repurposed it as being the easiest online resource I could find that had the capability I needed).

Be warned its assembler is pretty rudimentary, though. It doesn't understand arithmetic, so you cannot distinguish consecutive addresses by adding integers to a base address. Nor does it understand any pseudo opcode except "define", which at least lets you give labels to addresses. Labels used as program control branch targets are followed by a colon (":").
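
A small sketch of working within those limits: both bytes of a pointer get their own "define" because dstL+1 would not be understood, and the branch target ends with a colon. The names dstL, dstH, and fill are made up for this example, and it assumes the simulator's display memory starts at $0200 and that colour value 1 is white.

Code:
define dstL $10     ; low byte of the pointer
define dstH $11     ; high byte gets its own name, since dstL+1 is not understood

  lda #$00
  sta dstL
  lda #$02
  sta dstH          ; pointer = $0200, assumed start of the display memory
  lda #$01          ; assumed colour value (white)
  ldy #$00
fill:
  sta (dstL),y      ; store through the pointer, indexed by Y
  iny
  bne fill          ; 256 stores, then fall through
  brk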


PostPosted: Thu Aug 13, 2020 5:51 am

Joined: Sun Nov 08, 2009 1:56 am
Posts: 395
Location: Minnesota
Quote:
I'm a professional C++/C# developer.


Well, then you're probably familiar with the concept of a pointer. You can put that knowledge to good use when working with a 6502 family processor. There are several "indirect" addressing modes in which you set up a 16-bit pointer somewhere in page zero of memory. That points to some address in the rest of the 64K address space. You can use that as a base pointer and then index off that with the Y register (most commonly; the way the X register works in indirect mode is different and not used very often).

So your program might look something like:

Code:
 lda #<screen1
 sta $10
 lda #>screen1
 sta $11              ; $10/11 points to source start
 lda #<$400
 sta $12
 lda #>$400
 sta $13              ; $12/13 points to destination start
 ldx #4
 ldy #0
loop:
 lda ($10),y
 sta ($12),y
 iny
 bne loop            ; copy 256 bytes
 inc $11              ; move high byte of source pointer up one page (+256 bytes)
 inc $13              ; same for destination pointer (so 16 bits go $400/$500/$600/$700 and one extra)
 dex                 
 bne loop            ; do four pages (1024 bytes)
 


You can of course adjust things so that you copy exactly 1000 bytes, no more, no less, but that's the basic idea.
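
One way to make that adjustment, sketched under a few assumptions: copy three full pages (768 bytes) with the page loop, then finish with a short loop for the remaining 232 bytes. The zero-page addresses $fb-$fe (commonly free on a C64) and the labels page and tail are choices made for this example rather than anything from the code above.

Code:
 lda #<screen1
 sta $fb
 lda #>screen1
 sta $fc              ; $fb/fc points to source start
 lda #<$400
 sta $fd
 lda #>$400
 sta $fe              ; $fd/fe points to destination start
 ldx #3               ; three full pages = 768 bytes
 ldy #0
page:
 lda ($fb),y
 sta ($fd),y
 iny
 bne page             ; copy 256 bytes
 inc $fc              ; next source page
 inc $fe              ; next destination page
 dex
 bne page
tail:                 ; y is 0 again here
 lda ($fb),y
 sta ($fd),y
 iny
 cpy #232             ; 1000 - 768 = 232 bytes left
 bne tail
 rts

Stopping at 1000 keeps the copy from reading past the end of the table and from writing into the 24 bytes that follow the visible screen.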

As for speed, it's slower than Garth's method. What he's done is a form of loop unrolling. But if you really lusted for even more speed, you could unroll his loop even more. If you put 250 stores in its main body, you'd only have to run it four times (but if you did, the body would be larger than the range a branch instruction could reach, so in practice you'd write it as a subroutine and call it four times in a row).


PostPosted: Thu Aug 13, 2020 11:11 am

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10938
Location: England
Another possible way of experimenting and exploring is to use a C compiler and study the output. You can do that in your browser with Matt Godbolt's Compiler Explorer, in this case configured to use cc65:
https://godbolt.org/z/PsanY3

By and large, IMHO, a beginner should first of all strive to produce working code, preferably code that's written and commented such that it still makes sense a week or two later. In due course, after study and practice, your code will become more idiomatic, and then more efficient. (The output of a compiler, with today's tooling, is generally not going to be idiomatic or efficient - but it should be correct.)


PostPosted: Thu Aug 13, 2020 2:36 pm

Joined: Wed Aug 12, 2020 2:30 am
Posts: 43
Working code is what I'm trying for. Does anyone know where I can find some simple programs to study?

Something along the lines of...
Code:
10 input "Enter your name: ", a$
20 print "hello, ", a$


Something that would give me a solid foundation for outputting a string, getting input, and outputting that input.
Sorry for the "newbie" questions. This is something I'm really interested in.
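
As a starting point, a rough assembly counterpart of that BASIC program could look like the sketch below. It assumes a C64 and uses the KERNAL routines CHROUT ($FFD2) and CHRIN ($FFCF); every label is made up for this example, and the messages are given as PETSCII byte values so that no assembler-specific text directive is needed. Treat it as untested and as one of several possible ways to do it.

Code:
*=$c000
CHROUT = $ffd2                ; KERNAL: print the character in A
CHRIN  = $ffcf                ; KERNAL: get one character of input

          ldx     #0
ASK       lda     ASKMSG,x    ; print the zero-terminated prompt
          beq     READ
          jsr     CHROUT
          inx
          bne     ASK
READ      lda     #0
          sta     LEN         ; nothing stored yet
GETCH     jsr     CHRIN       ; hands back the typed line one character at a time
          cmp     #$0d        ; RETURN ends the line
          beq     GREET
          ldx     LEN
          cpx     #39         ; buffer full? then ignore the rest of the line
          beq     GETCH
          sta     NAME,x
          inc     LEN
          bne     GETCH       ; always taken, LEN is never 0 here
GREET     ldx     LEN
          lda     #0
          sta     NAME,x      ; zero-terminate the name
          lda     #$0d
          jsr     CHROUT      ; start a new line
          ldx     #0
HELLO     lda     HIMSG,x     ; print the greeting
          beq     ECHO
          jsr     CHROUT
          inx
          bne     HELLO
ECHO      ldx     #0
SHOW      lda     NAME,x      ; print the stored name
          beq     DONE
          jsr     CHROUT
          inx
          bne     SHOW
DONE      rts
ASKMSG    BYTE    $4e,$41,$4d,$45,$3f,$20,0        ; "NAME? "
HIMSG     BYTE    $48,$45,$4c,$4c,$4f,$2c,$20,0    ; "HELLO, "
LEN       BYTE    0
NAME      BYTE    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
          BYTE    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

The length is kept in memory (LEN) rather than in X across the JSR CHRIN, since the register is not guaranteed to survive that call. From BASIC the routine would be started with SYS 49152.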


PostPosted: Thu Aug 13, 2020 3:28 pm

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
teamtempest wrote:
As for speed, it's slower than Garth's method. What he's done is a form of loop unrolling. But if you really lusted for even more speed, you could unroll his loop even more. If you put 250 stores in its main body, you'd only have to run it four times (but if you did, the body would be larger than the range a branch instruction could reach, so in practice you'd write it as a subroutine and call it four times in a row).

If I count correctly, Garth's total loop overhead is only 3251 cycles [Edit: plus a couple/few hundred more for page crossing, so call it less than 4000] (compared to coding 6KB of raw LDA abs STA abs pairs), which is not insignificant, but still very good for its size. The only way I can imagine improving on that is with a massive monolith that gobbles lots of code space (or switching to a 65c810, which doesn't seem possible). :-)



Last edited by barrym95838 on Fri Aug 14, 2020 7:08 am, edited 4 times in total.
